...
LEFT SEMI JOIN implements the uncorrelated IN/EXISTS subquery semantics efficiently. As of Hive 0.13, the IN, NOT IN, EXISTS, and NOT EXISTS operators are supported using subqueries, so most of these joins no longer have to be written manually. The restriction on using LEFT SEMI JOIN is that the right-hand-side table should only be referenced in the join condition (ON clause), not in the WHERE or SELECT clauses, etc.
SELECT a.key, a.value FROM a WHERE a.key IN (SELECT b.key FROM b);
can be rewritten to:
SELECT a.key, a.value FROM a LEFT SEMI JOIN b ON (a.key = b.key);
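To make the semantics of the rewrite concrete, the following is a minimal sketch rather than Hive itself. It assumes an in-memory SQLite database as a stand-in, with hypothetical sample rows for tables a and b. SQLite has no LEFT SEMI JOIN syntax, so the semi-join behaviour is approximated here with an EXISTS subquery and contrasted with an ordinary inner join. The point it illustrates is that a semi join returns each left-hand row at most once, even when the right-hand table contains several matching rows.

```python
import sqlite3

# Assumption: SQLite stands in for Hive purely for illustration.
# The sample rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (key INTEGER, value TEXT);
    CREATE TABLE b (key INTEGER);
    INSERT INTO a VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO b VALUES (1), (1), (3);  -- key 1 appears twice in b
""")

# The IN-subquery form from the text.
in_rows = conn.execute(
    "SELECT a.key, a.value FROM a "
    "WHERE a.key IN (SELECT b.key FROM b) ORDER BY a.key"
).fetchall()

# Semi-join behaviour approximated with EXISTS, because SQLite does not
# accept LEFT SEMI JOIN. Only columns of a are selected, mirroring the
# restriction that b may appear only in the join condition.
semi_rows = conn.execute(
    "SELECT a.key, a.value FROM a "
    "WHERE EXISTS (SELECT 1 FROM b WHERE b.key = a.key) ORDER BY a.key"
).fetchall()

# An ordinary inner join, for contrast: the duplicate key in b
# duplicates the matching row of a.
inner_rows = conn.execute(
    "SELECT a.key, a.value FROM a JOIN b ON (a.key = b.key) ORDER BY a.key"
).fetchall()

print(in_rows)     # [(1, 'one'), (3, 'three')]
print(semi_rows)   # [(1, 'one'), (3, 'three')]
print(inner_rows)  # [(1, 'one'), (1, 'one'), (3, 'three')]
```

The first two results agree, which is why the IN form can be rewritten as a semi join. The inner join differs because it emits one output row per matching pair.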
...