...
Aggregate Query with GROUPING SETS | Equivalent Aggregate Query with GROUP BY |
---|---|
SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b GROUPING SETS ( (a,b) ) | SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b |
SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b GROUPING SETS ( (a,b), a) | SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b
SELECT a, null, SUM( c ) FROM tab1 GROUP BY a |
SELECT a,b, SUM( c ) FROM tab1 GROUP BY a, b GROUPING SETS (a,b) | SELECT a, null, SUM( c ) FROM tab1 GROUP BY a
SELECT null, b, SUM( c ) FROM tab1 GROUP BY b |
SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b GROUPING SETS ( (a, b), a, b, ( ) ) | SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b
SELECT a, null, SUM( c ) FROM tab1 GROUP BY a, null
SELECT null, b, SUM( c ) FROM tab1 GROUP BY null, b
SELECT null, null, SUM( c ) FROM tab1 |
...
Cubes and Rollups
The general syntax is WITH CUBE ( )/ROLLUP. It is used with the GROUP BY only. CUBE creates a subtotal of all possible combinations of the set of column in its argument. Once we compute a CUBE on a set of dimension, we can get answer to all possible aggregation questions on those dimensions.
It might be also worth mentioning here that
GROUP BY CUBE( a, b, c ) WITH CUBE is equivalent to
GROUP BY a, b, c GROUPING SETS ( (a, b, c), (a, b), (b, c), (a, c), (a), (b), (c), ( )).
ROLLUP clause is used with GROUP BY to compute the aggregate at the hierarchy levels of a dimension.
ROLLUP( GROUP BY a, b, c ) with ROLLUP assumes that the hierarchy is "a" drilling down to "b" drilling down to "c".
ROLLUP (GROUP BY a, b, c) , WITH ROLLUP is equivalent to GROUP BY a, b, c GROUPING SETS ( (a, b, c), (a, b), (a), ( )).
The general syntax of ROLLUP is ROLLUP( )
hive.new.job.grouping.set.cardinality
...