Purpose and Prerequisites

 

The purpose of this document is to summarize the findings of the research into the different join types and to describe a unified design for handling joins in Spark.  It also identifies the optimization processors that will be involved and their responsibilities.

...

This document does not go into depth on the implementations of the various joins, such as the common-join (HIVE-7384), or the optimized join variants such as mapjoin (HIVE-7613), skew-join (HIVE-8406), or SMB mapjoin (HIVE-8202).  For those details, refer to the design documents attached to the corresponding JIRA issues before reading this document.  It is also helpful to read the overall Hive on Spark design doc first.

...