Versions Compared

Comment: add hive.query.result.fileformat (HIVE-1598); add some versions; add hive.multigroupby.singlereducer (HIVE-2621)

...

...

  • Default Value: -1
  • Added In: Hive 0.1.0

The default number of reduce tasks per job. Typically set to a prime close to the number of available hosts. Ignored when mapred.job.tracker is "local". Hadoop sets this to 1 by default, whereas Hive uses -1 as its default value. When this property is set to -1, Hive automatically determines the number of reducers.

...

  • Default Value: 1000000000
  • Added In: Hive 0.2.0

Size per reducer, in bytes. The default is 1G; for example, if the input size is 10G, then 10 reducers will be used.

...

  • Default Value: 999
  • Added In: Hive 0.2.0

Maximum number of reducers that will be used. If the value specified in the configuration property mapred.reduce.tasks is negative, Hive uses this value as the maximum number of reducers when automatically determining the number of reducers.
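The interaction between the three reducer settings described above can be sketched as follows. This is only an illustration under stated assumptions, not Hive's actual implementation: the function name `estimate_reducers` and its parameter names are hypothetical, and the lower bound of one reducer is an assumption that the descriptions above do not state.

```python
def estimate_reducers(input_bytes,
                      bytes_per_reducer=1_000_000_000,
                      max_reducers=999,
                      configured_reducers=-1):
    """Hypothetical sketch of automatic reducer estimation.

    configured_reducers stands in for mapred.reduce.tasks; the other two
    defaults mirror the "size per reducer" and "max reducers" defaults above.
    """
    # A non-negative explicit setting is used as-is (no automatic estimate).
    if configured_reducers >= 0:
        return configured_reducers

    # Ceiling division: one reducer per bytes_per_reducer of input.
    estimate = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer

    # Assumption: at least one reducer, and never more than the maximum.
    return min(max_reducers, max(1, estimate))


if __name__ == "__main__":
    print(estimate_reducers(10_000_000_000))       # 10G of input
    print(estimate_reducers(5_000_000_000_000))    # large input, capped
```

With the defaults, 10G of input yields 10 reducers, matching the example in the size-per-reducer description, while very large inputs are capped at the configured maximum.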

...

  • Default Value: /tmp/hive-${user.name}
  • Added In: Hive 0.2.0

Scratch space for Hive jobs.

...

  • Default Value: TextFile
  • Added In: Hive 0.2.0

Default file format for CREATE TABLE statement. Options are TextFile, SequenceFile, RCfile, and ORC. Users can explicitly say CREATE TABLE ... STORED AS TEXTFILE|SEQUENCEFILE|RCFILE|ORC to override.

...

  • Default Value: true
  • Added In: Hive 0.5.0

Whether to check file format or not when loading data files.

hive.query.result.fileformat
  • Default Value: TextFile
  • Added In: Hive 0.7.0 with HIVE-1598

File format to use for a query's intermediate results. Options are TextFile, SequenceFile, and RCfile. Set to SequenceFile if any columns are string type and contain new-line characters (HIVE-1608, HIVE-3065).
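The reason newline characters in string columns are a problem for a line-oriented text format can be illustrated with a small sketch. It does not use Hive or its file formats; `write_text` and `read_text` are hypothetical helpers that mimic a format with one row per line and tab-separated fields.

```python
# Two logical rows; the first has a string column containing a newline.
rows = [("1", "first line\nsecond line"), ("2", "plain")]


def write_text(rows):
    """Serialize rows as tab-separated fields, one row per line."""
    return "\n".join("\t".join(fields) for fields in rows)


def read_text(blob):
    """Parse the text back, treating every newline as a row boundary."""
    return [tuple(line.split("\t")) for line in blob.split("\n")]


recovered = read_text(write_text(rows))

if __name__ == "__main__":
    print(len(rows), "rows written")
    print(len(recovered), "rows read back")
    print(recovered)
```

Because the embedded newline is indistinguishable from a row delimiter, the reader sees more rows than were written. A format that does not rely on newlines as row delimiters, such as SequenceFile, avoids this ambiguity.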

hive.orc.splits.include.file.footer

...

hive.map.aggr
  • Default Value: true in Hive 0.3 and later; false in Hive 0.2
  • Added In: Hive 0.2.0

Whether to use map-side aggregation in Hive Group By queries.

...

  • Default Value: false
  • Added In: Hive 0.3.0

Whether there is skew in data to optimize group by queries.

...

  • Default Value: 100000
  • Added In: Hive 0.3.0

Number of rows after which a size check of the grouping keys/aggregation classes is performed.

...

hive.mapred.local.mem
  • Default Value: 0
  • Added In: Hive 0.3.0

For local mode, memory of the mappers/reducers.

...

  • Default Value: 0.3
  • Added In: Hive 0.7.0

Portion of total memory to be used by the map-side group aggregation hash table when the group by is followed by a map join.

...

  • Default Value: 0.9
  • Added In: Hive 0.7.0

The maximum memory to be used by the map-side group aggregation hash table. If memory usage is higher than this number, the data is forcibly flushed.

...

  • Default Value: 0.5
  • Added In: Hive 0.2.0

Portion of total memory to be used by map-side group aggregation hash table.

...

  • Default Value: 0.5
  • Added In: Hive 0.4.0

Hash aggregation will be turned off if the ratio between hash table size and input rows is bigger than this number. Set to 1 to make sure hash aggregation is never turned off.
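The ratio test described above can be sketched as a small decision function. This is a simplified illustration, not Hive's code: the name `keep_hash_aggregation` is hypothetical, and tying the check to the row-count interval (default 100000) is an assumption about how the two settings interact.

```python
def keep_hash_aggregation(rows_seen, hash_entries,
                          check_interval=100_000,
                          min_reduction=0.5):
    """Hypothetical sketch: decide whether map-side hash aggregation stays on.

    rows_seen     -- input rows processed so far
    hash_entries  -- distinct grouping keys currently in the hash table
    """
    # Assumption: no decision is made until enough rows have been seen.
    if rows_seen < check_interval:
        return True

    # If the hash table is nearly as large as the input, aggregation is
    # not reducing the data much, so it is turned off.
    return hash_entries / rows_seen <= min_reduction


if __name__ == "__main__":
    print(keep_hash_aggregation(100_000, 40_000))   # good reduction
    print(keep_hash_aggregation(100_000, 90_000))   # poor reduction
```

Since the hash table can never hold more entries than rows seen, the ratio never exceeds 1, which is why a threshold of 1 keeps hash aggregation on in every case.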

...

  • Default Value: true
  • Added In: Hive 0.5.0

Whether to enable the bucketed group by from bucketed partitions/tables.

...

Whether to optimize a multi group by query to generate a single M/R job plan. If the multi group by query has common group by keys, it will be optimized to generate a single M/R job.

hive.multigroupby.singlereducer
  • Default Value: true
  • Added In: Hive 0.9.0 with HIVE-2621

Whether to optimize a multi group by query to generate a single M/R job plan. If the multi group by query has common group by keys, it will be optimized to generate a single M/R job.

hive.optimize.cp
  • Default Value: true
  • Added In:

...