Comment: add hive.new.job.grouping.set.cardinality (HIVE-3552)

...

Number of rows after which the size check of the grouping keys/aggregation classes is performed.

hive.new.job.grouping.set.cardinality
  • Default Value: 30
  • Added In: Hive 0.11.0 with HIVE-3552

Whether a new map-reduce job should be launched for grouping sets/rollups/cubes.

For a query like "select a, b, c, count(1) from T group by a, b, c with rollup;" four rows are created for each input row: (a, b, c), (a, b, null), (a, null, null), (null, null, null). This can multiply the amount of data crossing the map-reduce boundary if the cardinality of T is very high and map-side aggregation does not reduce it much.
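The four-fold expansion described above can be sketched in a few lines of Python. This is only an illustration of the row multiplication, not Hive's implementation; the function name `rollup_rows` and the sample values are hypothetical.

```python
def rollup_rows(keys):
    """Return the rows a ROLLUP over `keys` produces for one input row.

    Trailing keys are replaced by None (null) one at a time, so n keys
    yield n + 1 rows.
    """
    n = len(keys)
    return [tuple(keys[: n - i]) + (None,) * i for i in range(n + 1)]


# One input row with grouping keys (a, b, c); the values are made up.
rows = rollup_rows(("a1", "b1", "c1"))
for row in rows:
    print(row)
```

Each input row becomes `len(keys) + 1` rows before any aggregation happens, which is the growth this parameter is meant to contain.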

This parameter decides if Hive should add an additional map-reduce job. If the grouping set cardinality (4 in the example above) is more than this value, a new MR job is added under the assumption that the original "group by" will reduce the data size.
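As a rough sketch of the rule just stated, the comparison can be written as follows. The helper names are hypothetical and this is not Hive's planner code; it assumes a rollup over n keys has n + 1 grouping sets and a cube over n keys has 2^n, and it uses the documented default of 30.

```python
DEFAULT_THRESHOLD = 30  # documented default of hive.new.job.grouping.set.cardinality


def rollup_cardinality(num_keys):
    # A rollup over n keys produces n + 1 grouping sets.
    return num_keys + 1


def cube_cardinality(num_keys):
    # A cube over n keys produces every subset of the keys: 2 ** n grouping sets.
    return 2 ** num_keys


def needs_extra_job(grouping_set_cardinality, threshold=DEFAULT_THRESHOLD):
    # Strictly greater than the threshold, following "is more than this value".
    return grouping_set_cardinality > threshold


print(needs_extra_job(rollup_cardinality(3)))  # rollup on a, b, c: 4 grouping sets
print(needs_extra_job(cube_cardinality(5)))    # cube on 5 keys: 32 grouping sets
```

With the default value, the three-key rollup from the example stays in a single job, while a cube over five or more keys exceeds the threshold.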

hive.mapred.local.mem
  • Default Value: 0
  • Added In:

...