Comment: add hashtable and mapjoin parameters for HIVE-1642 & HIVE-7616

...

This number is the maximum amount of memory the local task may use to hold key/value pairs in an in-memory hash table. If the local task's memory usage exceeds this number, the local task is aborted, which means the data of the small table is too large to be held in memory.

...

  • Default Value: 0.55
  • Added In: Hive 0.7.0 with HIVE-1830

This number is the maximum amount of memory the local task may use to hold key/value pairs in an in-memory hash table when the map join is followed by a group by. If the local task's memory usage exceeds this number, the local task aborts itself, which means the data of the small table is too large to be held in memory.
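
The check described above can be sketched as follows. This is only an illustration, not Hive's implementation: the function name and arguments are hypothetical, and it assumes the number is read as a fraction of the memory available to the local task.

```python
def local_task_should_abort(used_bytes: int, max_bytes: int, threshold: float) -> bool:
    """Return True when the fraction of memory in use exceeds the threshold.

    Hypothetical helper: treats the configured number as a fraction of the
    memory available to the local task (an assumption, not confirmed here).
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    return used_bytes / max_bytes > threshold


# 0.55 is the documented default when the map join is followed by a group by.
FOLLOWED_BY_GROUP_BY_THRESHOLD = 0.55
```

Under this reading, a local task using 600 MB out of 1000 MB would be aborted at the 0.55 default, while one using 500 MB would continue.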

hive.mapjoin.check.memory.rows

...

Whether a MapJoin hashtable should deserialize values on demand. Depending on how many values in the table the join will actually touch, it can save a lot of memory by not creating objects for rows that are not needed. If all rows are needed, obviously there's no gain.

hive.hashtable.initialCapacity
  • Default Value: 100000
  • Added In: Hive 0.7.0 with HIVE-1642

Initial capacity of mapjoin hashtable if statistics are absent, or if hive.hashtable.key.count.adjustment is set to 0.

hive.hashtable.key.count.adjustment
  • Default Value: 1.0
  • Added In: Hive 0.14.0 with HIVE-7616

Adjustment to mapjoin hashtable size derived from table and column statistics; the estimate of the number of keys is divided by this value. If the value is 0, statistics are not used and hive.hashtable.initialCapacity is used instead.
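
How hive.hashtable.initialCapacity and hive.hashtable.key.count.adjustment interact can be sketched roughly as below. The function name is hypothetical, and rounding up and the floor of 1 are assumptions made for the sketch; only the fallback and division rules come from the descriptions above.

```python
import math
from typing import Optional


def mapjoin_initial_capacity(
    estimated_keys: Optional[int],      # key estimate from statistics, None if absent
    key_count_adjustment: float = 1.0,  # hive.hashtable.key.count.adjustment
    initial_capacity: int = 100000,     # hive.hashtable.initialCapacity
) -> int:
    # No statistics, or an adjustment of 0: use the configured initial capacity.
    if estimated_keys is None or key_count_adjustment == 0:
        return initial_capacity
    # Otherwise divide the estimated key count by the adjustment.
    # Rounding up and never returning less than 1 are assumptions of this sketch.
    return max(1, math.ceil(estimated_keys / key_count_adjustment))
```

For example, an estimate of 500000 keys with an adjustment of 2.0 gives a size of 250000, while the same estimate with an adjustment of 0 falls back to 100000.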

hive.hashtable.loadfactor
  • Default Value: 0.75
  • Added In: Hive 0.7.0 with HIVE-1642

During a mapjoin, the key/value pairs are held in an in-memory hashtable. This value is the load factor for that hashtable.
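
As a rough illustration, assuming the load factor has its conventional meaning (as in java.util.HashMap, where the table is resized once the number of entries exceeds capacity times load factor), the defaults work out as follows. The helper name is hypothetical.

```python
def resize_threshold(capacity: int, load_factor: float) -> int:
    """Number of entries a table of this capacity holds before it is resized,
    under the conventional capacity * load factor rule (an assumption here)."""
    return int(capacity * load_factor)


# Defaults: hive.hashtable.initialCapacity = 100000, hive.hashtable.loadfactor = 0.75
print(resize_threshold(100000, 0.75))  # prints 75000
```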

hive.debug.localtask
  • Default Value: false
  • Added In: Hive 0.7.0 with HIVE-1642

hive.optimize.skewjoin
  • Default Value: false
  • Added In: Hive 0.6.0

...

hive.auto.convert.join
  • Default Value: false in 0.7.0 to 0.10.0; true in 0.11.0 and later (HIVE-3297)  
  • Added In: Hive 0.7.0 with HIVE-1642

Whether Hive enables the optimization that converts a common join into a mapjoin based on the input file size. (Note that hive-default.xml.template incorrectly gives the default as false in Hive 0.11.0 through 0.13.1.)

...