Versions Compared

Comment: add hive.smbjoin.cache.rows, hive.mapjoin.optimized.keys, and hive.mapjoin.lazy.hashtable

...

How many values in each key in the map-joined table should be cached in memory.

hive.mapjoin.followby.map.aggr.hash.percentmemory

...

The number of rows to process between checks of memory usage.

hive.smbjoin.cache.rows

How many rows with the same key value should be cached in memory per sort-merge-bucket joined table.

hive.mapjoin.optimized.keys

Whether a MapJoin hashtable should use optimized (size-wise) keys, allowing the table to take less memory. Depending on the key, memory savings for the entire table can be 5-15% or so.

hive.mapjoin.lazy.hashtable

Whether a MapJoin hashtable should deserialize values on demand. Depending on how many values in the table the join actually touches, this can save a lot of memory by not creating objects for rows that are not needed. If all rows are needed, there is no gain.
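
As a rough illustration, the three properties above could be set for a single session from the Hive CLI or Beeline with `SET` statements. The values shown are illustrative assumptions, not tuning recommendations or confirmed defaults; check the defaults for the Hive version in use.

```sql
-- Illustrative values only; not tuning recommendations.

-- Cache up to 10000 rows with the same key value per sort-merge-bucket joined table.
SET hive.smbjoin.cache.rows=10000;

-- Use size-optimized keys in the MapJoin hashtable.
SET hive.mapjoin.optimized.keys=true;

-- Deserialize MapJoin hashtable values only when the join touches them.
SET hive.mapjoin.lazy.hashtable=true;
```

Issuing `SET` with only the property name, for example `SET hive.smbjoin.cache.rows;`, prints the value currently in effect, which is a quick way to confirm that a setting was applied.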

hive.optimize.skewjoin
  • Default Value: false
  • Added In: Hive 0.6.0

...