
...

The assumption is that B has only a few rows with the keys that are skewed in A, so these rows can be loaded into memory.

Hive Enhancements

Hive needs to be extended to support the following:

    • create table <T> (schema) skewed by (keys) with skew (values);
    • alter table <T> (schema) skewed by (keys) with skew (values);

e.g.,

...

Original plan: The skew data will be obtained from list bucketing (see the List Bucketing design document), with no additions to the Hive grammar.

Implementation:  Starting in Hive 0.10.0, tables can be created as skewed or altered to be skewed (in which case partitions created after the ALTER statement will be skewed). In addition, skewed tables can use the list bucketing feature by specifying the STORED AS DIRECTORIES option. See the DDL documentation for details: Create Table, Skewed Tables, and Alter Table Skewed or Stored as Directories.
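
As a rough illustration of the implemented DDL, the statements below sketch how a skewed table might be declared in Hive 0.10.0 or later. The table names, column names, and skewed values are hypothetical and chosen only for this example. Note that the implemented syntax uses SKEWED BY ... ON ..., which differs from the "with skew" wording proposed above; the linked DDL documentation remains the authoritative reference.

```sql
-- Hypothetical table: rows whose key is '1', '5', or '6' are declared skewed.
CREATE TABLE skewed_example (key STRING, value STRING)
  SKEWED BY (key) ON ('1', '5', '6');

-- Same declaration, additionally opting in to list bucketing.
CREATE TABLE skewed_lb_example (key STRING, value STRING)
  SKEWED BY (key) ON ('1', '5', '6')
  STORED AS DIRECTORIES;

-- Mark an existing (hypothetical) table as skewed; per the text above,
-- only partitions created after this statement will be skewed.
ALTER TABLE existing_example
  SKEWED BY (key) ON ('1', '5', '6');
```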