...
- Default Value:
100
- Added In: Hive 0.5.0 (replaced by hive.smbjoin.cache.rows in Hive 0.12.0)
How many values for each key in the map-joined table should be cached
in memory.
hive.mapjoin.followby.map.aggr.hash.percentmemory
...
The number of rows to process before memory usage is checked.
hive.smbjoin.cache.rows
- Default Value:
10000
- Added In: Hive 0.12.0 with HIVE-4440 (replaces hive.mapjoin.bucket.cache.size)
How many rows with the same key value should be cached in memory per sort-merge-bucket joined table.
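As an illustration, this cache size can be adjusted per session with a SET statement before running the join. This is only a minimal sketch: the value 20000 is an arbitrary example rather than a recommendation, and on releases earlier than Hive 0.12.0 the older property name hive.mapjoin.bucket.cache.size would apply instead.

```sql
-- Hive 0.12.0 and later: cache up to 20000 rows with the same key value
-- per sort-merge-bucket joined table (illustrative value; default is 10000).
SET hive.smbjoin.cache.rows=20000;

-- Print the value currently in effect for this session.
SET hive.smbjoin.cache.rows;
```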
hive.mapjoin.optimized.keys
Whether a MapJoin hashtable should use optimized (size-wise) keys, allowing the table to take less memory. Depending on the key, memory savings for the entire table can be roughly 5-15%.
hive.mapjoin.lazy.hashtable
Whether a MapJoin hashtable should deserialize values on demand. Depending on how many values in the table the join will actually touch, it can save a lot of memory by not creating objects for rows that are not needed. If all rows are needed, obviously there's no gain.
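Both hive.mapjoin.optimized.keys and hive.mapjoin.lazy.hashtable are on/off switches and, assuming a Hive release that supports them, could be toggled per session as sketched below. The values shown are examples only; whether they help depends on the key types and on how many rows the join actually touches.

```sql
-- Use size-optimized keys in the MapJoin hashtable.
SET hive.mapjoin.optimized.keys=true;

-- Deserialize hashtable values only when the join actually reads them.
SET hive.mapjoin.lazy.hashtable=true;
```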
hive.optimize.skewjoin
- Default Value:
false
- Added In: Hive 0.6.0
...