Page History

...

Note that for versions of Hive which don't include HIVE-287, you'll need to use COUNT(1) in place of COUNT(*).

In order to count the number of distinct users by gender one could write the following query:

...

Multiple aggregations can be done at the same time, however, no two aggregations can have different DISTINCT columns. e.g while For example, the following is possible because count(DISTINCT) and sum(DISTINCT) specify the same column:

Code Block
INSERT OVERWRITE TABLE pv_gender_agg SELECT pv_users.gender, count(DISTINCT pv_users.userid), count(*), sum(DISTINCT pv_users.userid) FROM pv_users GROUP BY pv_users.gender;

Note that for versions of Hive which don't include HIVE-287, you'll need to use COUNT(1) in place of COUNT(*).

However, the following query is not allowed. We don't allow multiple DISTINCT expressions in the same query.

...

Note that for versions of Hive which don't include HIVE-287, you'll need to use COUNT(1) in place of COUNT(*).

Grouping Sets, Cubes, Rollups, and the GROUPING__ID Function

Info

title	Version

Grouping sets, CUBE and ROLLUP operators, and the GROUPING__ID function were added in Hive release 0.10.0.

See Enhanced Aggregation, Cube, Grouping and Rollup for information about these aggregation operators.

Also see the JIRAs:

HIVE-2397 Support with rollup option for group by
HIVE-3433 Implement CUBE and ROLLUP operators in Hive
HIVE-3471 Implement grouping sets in Hive
HIVE-3613 Implement grouping_id function

New in Hive release 0.11.0:

HIVE-3552 HIVE-3552 performant manner for performing cubes/rollups/grouping sets for a high number of grouping set keys

Space shortcuts

Child pages

Versions Compared

Old Version 5

New Version 6

Key

Grouping Sets, Cubes, Rollups, and the GROUPING__ID Function