Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Aggregate Query with GROUPING SETS

Equivalent Aggregate Query with GROUP BY

SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b GROUPING SETS ( (a,b) )

SELECT a, b, SUM(c) FROM tab1 GROUP BY a, b

SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b GROUPING SETS ( (a,b), a)

SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b

UNION

SELECT a, null, SUM( c ) FROM tab1 GROUP BY a

SELECT a,b, SUM( c ) FROM tab1 GROUP BY a, b GROUPING SETS (a,b)

SELECT a, null, SUM( c ) FROM tab1 GROUP BY a

UNION

SELECT null, b, SUM( c ) FROM tab1 GROUP BY b

SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b GROUPING SETS ( (a, b), a, b, ( ) )

SELECT a, b, SUM( c ) FROM tab1 GROUP BY a, b

UNION

SELECT a, null, SUM( c ) FROM tab1 GROUP BY a, null

UNION

SELECT null, b, SUM( c ) FROM tab1 GROUP BY null, b

UNION

SELECT null, null, SUM( c ) FROM tab1

...

Cubes and Rollups

The general syntax is WITH CUBE ( )/ROLLUP. It is used with the GROUP BY only. CUBE creates a subtotal of all possible combinations of the set of column in its argument. Once we compute a CUBE on a set of dimension, we can get answer to all possible aggregation questions on those dimensions.

It might be also worth mentioning here that
GROUP BY CUBE( a, b, c ) WITH CUBE is equivalent to
GROUP BY a, b, c GROUPING SETS ( (a, b, c), (a, b), (b, c), (a, c), (a), (b), (c), ( )).

ROLLUP clause is used with GROUP BY to compute the aggregate at the hierarchy levels of a dimension.
ROLLUP( GROUP BY a, b, c ) with ROLLUP assumes that the hierarchy is "a" drilling down to "b" drilling down to "c".

ROLLUP (GROUP BY a, b, c) , WITH ROLLUP is equivalent to GROUP BY a, b, c GROUPING SETS ( (a, b, c), (a, b), (a), ( )).
The general syntax of ROLLUP is ROLLUP( )

hive.new.job.grouping.set.cardinality

...