Versions Compared

Comment: doc Tez parameters from HIVE-7158

...

hive.exec.reducers.bytes.per.reducer
  • Default Value: 1,000,000,000 prior to Hive 0.14.0; 256 MB (256,000,000) in Hive 0.14.0 and later
  • Added In: Hive 0.2.0; default changed in 0.14.0 with HIVE-7158 (and HIVE-7917)

Size per reducer. The default prior to Hive 0.14.0 is 1 GB, that is, if the input size is 10 GB then 10 reducers will be used. In Hive 0.14.0 and later the default is 256 MB, that is, if the input size is 1 GB then 4 reducers will be used.

hive.exec.reducers.max
  • Default Value: 999 prior to Hive 0.14.0; 1009 in Hive 0.14.0 and later
  • Added In: Hive 0.2.0; default changed in 0.14.0 with HIVE-7158 (and HIVE-7917)

Maximum number of reducers that will be used. If the value specified in the configuration property mapred.reduce.tasks is negative, Hive will use this property as the maximum number of reducers when automatically determining the number of reducers.
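
As a rough illustration of how hive.exec.reducers.bytes.per.reducer and hive.exec.reducers.max interact, the sketch below reproduces the arithmetic from the descriptions above. The function name estimate_reducers is hypothetical, and the rounding up and the floor of one reducer are assumptions; this is not Hive's actual implementation.

```python
def estimate_reducers(input_bytes, bytes_per_reducer=256_000_000, max_reducers=1009):
    """Hypothetical helper: one reducer per bytes_per_reducer of input,
    capped at max_reducers. Defaults are the Hive 0.14.0 values."""
    # Integer ceiling division (assumed rounding): partial chunks get a reducer.
    wanted = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    # Assume at least one reducer and never more than the configured maximum.
    return max(1, min(max_reducers, wanted))


# 1 GB of input with the 256 MB default.
print(estimate_reducers(1_000_000_000))                       # 4
# 10 GB of input with the pre-0.14.0 defaults (1 GB, 999).
print(estimate_reducers(10_000_000_000, 1_000_000_000, 999))  # 10
```

For very large inputs the estimate stops growing once it reaches hive.exec.reducers.max.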

hive.exec.scratchdir

...

By default Tez will ask for however many CPUs MapReduce is configured to use per container. This can be used to override the default.

hive.tez.auto.reducer.parallelism
  • Default Value: false
  • Added In: Hive 0.14.0 with HIVE-7158

Turn on Tez's auto reducer parallelism feature. When enabled, Hive will still estimate data sizes and set parallelism estimates. Tez will sample source vertices' output sizes and adjust the estimates at runtime as necessary.

hive.tez.max.partition.factor
  • Default Value: 2
  • Added In: Hive 0.14.0 with HIVE-7158

When auto reducer parallelism is enabled, this factor will be used to over-partition data in shuffle edges.

hive.tez.min.partition.factor
  • Default Value: 0.25
  • Added In: Hive 0.14.0 with HIVE-7158

When auto reducer parallelism is enabled, this factor will be used to put a lower limit on the number of reducers that Tez specifies.
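
One way to picture the two partition factors is as multipliers on Hive's reducer estimate, giving the range within which Tez may adjust at runtime. The sketch below is only an illustration: the function name tez_reducer_bounds is hypothetical, and the truncation, the floor of one reducer, and the cap at hive.exec.reducers.max are assumptions rather than documented behavior.

```python
def tez_reducer_bounds(estimate, min_factor=0.25, max_factor=2.0, max_reducers=1009):
    """Hypothetical helper: range of reducer counts around Hive's estimate.

    min_factor stands in for hive.tez.min.partition.factor,
    max_factor for hive.tez.max.partition.factor.
    """
    # Lower limit on the number of reducers Tez may settle on (assumed rounding).
    lower = max(1, int(estimate * min_factor))
    # Over-partitioned count used for the shuffle edge (assumed rounding).
    upper = max(1, int(estimate * max_factor))
    # Assume neither bound may exceed the configured maximum.
    return min(lower, max_reducers), min(upper, max_reducers)


# With the defaults, an estimate of 40 reducers gives a range of 10 to 80.
print(tez_reducer_bounds(40))  # (10, 80)
```

Under these assumptions, raising the maximum factor gives Tez more room to shrink the reducer count at runtime, at the cost of more, smaller partitions up front.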

Transactions and Compactor

...