Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • The view would be a simple select * using Y.T syntax. It’s a degenerate case of view.
  • We would need a registry of all views which import tables/partitions from other databases for namespace accounting. This requires adding metadata to these views to distinguish them from other user-created views.
  • It would be harder to single instance imports using views (same table/partitions imported twice into the same namespace). Views are too opaque.
    Using partitioned views:
    By definition, there isn't a one-one mapping between a view partition and a table partition. In fact, hive today does not even know about this dependency between view partitions and table partitions. Partitioned views is just a metadata concept - it is not something that the query layer understands. For e.g: if a view V partitioned on ds had 2 partitions: 1 and 2, then a query like select … from V where ds = 3 may still give valid results if the ds=3 is satisfied by the table underlying V. This means that:
  • View metadata doesn’t stay in sync with source partitions (as partitions get added and dropped). The user has to explicitly do this, which won't work for our case.
  • We would like to differentiate the set of partitions that are available for the same imported tables across namespaces. This would require partition pruning based on the view partitions in query rewrite which is not how it works today.

The above notes make it clear that what we are trying to build is a very special case of a degenerate view, and it would be cleaner to introduce a new concept in Hive to model these ‘imports’.