...

The names describe their uses. This is especially useful for the fact-fact join (query 82 in the TPC-DS benchmark).

SMB Join across tables with different keys

If the tables have a different number of sort keys, for example Table A has two SORT columns and Table B has one SORT column, then you might get an index out of bounds exception.

The following query results in an index out of bounds exception because, for example, emp_person has one sort column while emp_pay_history has two sort columns.

Error (Hive 0.11):
SELECT p.*, py.*
FROM emp_person p INNER JOIN emp_pay_history py
ON   p.empid = py.empid

The following query, which lists emp_pay_history first in the join, works fine.

Working query (Hive 0.11):
SELECT p.*, py.*
FROM emp_pay_history py INNER JOIN emp_person p
ON   p.empid = py.empid
WHERE p.etl_process_run_id
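
For reference, table definitions matching the scenario above could look like the following sketch. Only the table names and the empid and etl_process_run_id columns come from the queries; every other column, the choice of pay_date as the second sort column, and the bucket count are assumptions made purely for illustration.

```sql
-- Hypothetical layout: emp_person is bucketed and sorted on ONE column.
CREATE TABLE emp_person (
  empid              INT,
  name               STRING,   -- assumed column
  etl_process_run_id INT
)
CLUSTERED BY (empid) SORTED BY (empid) INTO 8 BUCKETS;

-- Hypothetical layout: emp_pay_history is bucketed on the same key
-- but sorted on TWO columns.
CREATE TABLE emp_pay_history (
  empid    INT,
  pay_date STRING,             -- assumed second sort column
  amount   DOUBLE              -- assumed column
)
CLUSTERED BY (empid) SORTED BY (empid, pay_date) INTO 8 BUCKETS;
```

With definitions of this shape, both tables are bucketed on the join key empid, and the only difference between them is the number of sort columns, which is the condition described above.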

 

Generate Hash Tables on the Task Side

...