
  • The dependent table does not have a location.
    • The list of partitions is computed at query time. Think of it like a view, where each partition has its own definition limited to 'select * from T where partial/full partition spec'. The query layer needs to change. Is that possible? Unlike a view, it is not rewritten at semantic analysis time. After partition pruning is done (on the dependent table), rewrite the
      tree to contain the base table T. The columns remain the same, so it should be possible.

With this, it is possible that the partitions point to different tables.
For example:

alter table Tdependent add partition (ds='1') depends on table T1 partition (ds='1');
alter table Tdependent add partition (ds='2') depends on table T2 partition (ds='2');

This is something that can currently be achieved with external tables.
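
A rough sketch of how the query layer might resolve pruned partitions of the dependent table to partitions of different base tables. The names DEPENDENCIES and resolve_partitions, and the in-memory dict standing in for the metastore, are assumptions made for illustration; none of this is existing Hive code.

```python
from typing import Dict, List, Tuple

# A partition spec is modelled as a tuple of (column, value) pairs.
PartitionSpec = Tuple[Tuple[str, str], ...]

# Hypothetical stand-in for the dependency metadata recorded by the two
# 'alter table Tdependent add partition ... depends on ...' statements.
DEPENDENCIES: Dict[PartitionSpec, Tuple[str, PartitionSpec]] = {
    (("ds", "1"),): ("T1", (("ds", "1"),)),
    (("ds", "2"),): ("T2", (("ds", "2"),)),
}


def resolve_partitions(pruned: List[PartitionSpec]) -> List[Tuple[str, PartitionSpec]]:
    """Map each pruned partition of Tdependent to its (base table, partition)."""
    resolved = []
    for spec in pruned:
        if spec not in DEPENDENCIES:
            raise KeyError(f"no dependency recorded for partition {spec}")
        resolved.append(DEPENDENCIES[spec])
    return resolved


if __name__ == "__main__":
    # Partition pruning on Tdependent kept only ds='2'.
    print(resolve_partitions([(("ds", "2"),)]))
```

Because the lookup happens per partition, nothing in this sketch requires all partitions to depend on the same base table.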

I am leaning towards this - the user need not specify both the location and the dependent partitions.
Can external tables be enhanced to support this?

    • The list of dependent partitions is materialized and stored in the metastore, and that list is used for querying.
      A query like 'select .. from Tdependent where ds = 1' gets transformed to 'select .. from (select * from T where ((ds = 1 and hr = 1) or (ds = 1 and hr = 2) .... or (ds=1 and hr=24)))'
      This can put a lot of load on the query layer.
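
A minimal sketch of the transformation described in the last option, assuming the materialized partition list is available as a list of column-to-value mappings. The function name rewrite_query, its parameters, and the subquery alias t are hypothetical; a real implementation would rewrite the operator tree rather than build SQL strings.

```python
from typing import Dict, List


def rewrite_query(select_list: str, base_table: str,
                  base_partitions: List[Dict[str, int]]) -> str:
    """Expand a query on the dependent table into a query on the base table.

    Each materialized base partition becomes one conjunction, and the
    conjunctions are OR-ed together inside a subquery on the base table.
    """
    if not base_partitions:
        raise ValueError("no dependent partitions matched the query")
    disjuncts = []
    for spec in base_partitions:
        conjunction = " and ".join(f"{col} = {val}" for col, val in spec.items())
        disjuncts.append(f"({conjunction})")
    predicate = " or ".join(disjuncts)
    return f"select {select_list} from (select * from {base_table} where {predicate}) t"


if __name__ == "__main__":
    # 'select .. from Tdependent where ds = 1' with 24 hourly base partitions.
    hourly = [{"ds": 1, "hr": hr} for hr in range(1, 25)]
    print(rewrite_query("*", "T", hourly))
```

The predicate grows linearly with the number of materialized partitions, which illustrates the load concern raised above.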