...

This document is based on a writeup of DB2 Outer Join Behavior. The original HTML document is attached to the Hive Design Docs and can be downloaded here.

Definitions

Preserved Row table

The table in an Outer Join that must return all rows.
For left outer joins this is the Left table, for right outer joins it is the Right table, and for full outer joins both tables are Preserved Row tables.

Null Supplying table

This is the table that has nulls filled in for its columns in unmatched rows.
In the non-full outer join case, this is the other table in the Join. For full outer joins both tables are also Null Supplying tables.
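The two roles can be seen in a small left outer join. The sketch below is only an illustration: it runs against an in-memory SQLite database rather than Hive, and although the table names R1 and R2 follow the example used in this document, the columns and rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely for illustration (not Hive).
# Table names follow the document; columns and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R1 (x INTEGER, a TEXT);
    CREATE TABLE R2 (x INTEGER, b TEXT);
    INSERT INTO R1 VALUES (1, 'r1-one'), (2, 'r1-two'), (3, 'r1-three');
    INSERT INTO R2 VALUES (1, 'r2-one'), (4, 'r2-four');
""")

# Left outer join: R1 is the Preserved Row table, R2 is the Null Supplying table.
left_rows = conn.execute(
    "SELECT R1.x, R1.a, R2.x, R2.b "
    "FROM R1 LEFT OUTER JOIN R2 ON R1.x = R2.x "
    "ORDER BY R1.x"
).fetchall()

for row in left_rows:
    print(row)
```

Every R1 row appears in the result, including the rows with x = 2 and x = 3 that have no match in R2. For those rows the R2 columns come back as NULL (None in Python), and the unmatched R2 row with x = 4 is not returned at all.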

During Join predicate

A predicate that is in the JOIN ON clause.
For example, in 'R1 join R2 on R1.x = 5' the predicate 'R1.x = 5' is a During Join predicate.

After Join predicate

A predicate that is in the WHERE clause.
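To make the distinction concrete, the sketch below places the predicate 'R1.x = 5' first in the JOIN ON clause and then in the WHERE clause of the same left outer join. It is an illustration under assumptions: it uses an in-memory SQLite database instead of Hive, and the single-column tables and their rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely for illustration (not Hive).
# The one-column tables and their rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R1 (x INTEGER);
    CREATE TABLE R2 (x INTEGER);
    INSERT INTO R1 VALUES (4), (5), (6);
    INSERT INTO R2 VALUES (5), (6);
""")

# During Join predicate: R1.x = 5 is part of the JOIN ON clause.
during_join = conn.execute(
    "SELECT R1.x, R2.x FROM R1 LEFT OUTER JOIN R2 "
    "ON R1.x = R2.x AND R1.x = 5 "
    "ORDER BY R1.x"
).fetchall()

# After Join predicate: R1.x = 5 is in the WHERE clause.
after_join = conn.execute(
    "SELECT R1.x, R2.x FROM R1 LEFT OUTER JOIN R2 "
    "ON R1.x = R2.x "
    "WHERE R1.x = 5 "
    "ORDER BY R1.x"
).fetchall()

print(during_join)
print(after_join)
```

With these rows, the During Join form still returns all three R1 rows and only decides which of them are matched against R2, while the After Join form filters the joined result down to the single row where R1.x is 5. In an outer join the two placements are therefore not interchangeable.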

...

Hive Implementation

Hive enforces the predicate pushdown rules by these methods in the SemanticAnalyzer and JoinPPD classes:

Rule 1: During QBJoinTree construction in Plan Gen, the parseJoinCondition() logic applies this rule.
Rule 2: During JoinPPD (Join Predicate PushDown), the getQualifiedAliases() logic applies this rule.

Examples

...