THIS IS A TEST INSTANCE. ALL YOUR CHANGES WILL BE LOST!!!!
...
- Consider the possibility of multiple mapjoins on different keys:
The above query is not supported. Without the mapjoin hint, the above query would be executed as 2 map-only jobs. If the user knows in advance that the inputs are small enough to fit in memory, the following configurable parameters can be used to make sure that the query executes in a single map-reduce job.Code Block select /*+MAPJOIN(smallTableTwo)*/ idOne, idTwo, value FROM ( select /*+MAPJOIN(smallTableOne)*/ idOne, idTwo, value FROM bigTable JOIN smallTableOne on (bigTable.idOne = smallTableOne.idOne) ) firstjoin JOIN smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo)
- hive.auto.convert.join.noconditionaltask - Whether Hive enable the optimization about converting common join into mapjoin based on the input file size. If this paramater is on, and the sum of size for n-1 of the tables/partitions for a n-way join is smaller than the specified size, the join is directly converted to a mapjoin (there is no conditional task).
- hive.auto.convert.join.noconditionaltask.size - If hive.auto.convert.join.noconditionaltask is off, this parameter does not take affect. However, if it is on, and the sum of size for n-1 of the tables/partitions for a n-way join is smaller than this size, the join is directly converted to a mapjoin(there is no conditional task). The default is 10MB.
Join Optimization
...
Predicate Pushdown in Outer Joins
See Hive Outer Join Behavior for information about predicate pushdown in outer joins.
Enhancements in Hive Version 0.11
...
See Join Optimization for information about enhancements to join optimization introduced in Hive version 0.11.0. The use of hints is de-emphasized in the enhanced optimizations (HIVE-3784 and related JIRAs).