...

  • There will be a single Hive instance, possibly spanning multiple clusters (both DFS and MR).
  • There will be a single Hive metastore to keep track of the table/partition locations across different clusters.
  • There will be a default cluster for the session. Commands will be added to change the cluster.
    • Use cluster <ClusterName>
  • A table/partition can exist in more than one cluster. However, a single table will have a primary cluster, and can have multiple
    secondary clusters.
    • All the data for a table is available in the primary cluster.
    • The user can only update the table in the primary cluster.
    • If table T1's primary cluster is C1, this means:
      1) C1 contains all data that is available in all other clusters.
      2) Writes to T1 are only allowed in C1, although exceptions need to
         be allowed here.
      3) New partitions of T1 can only be created in C1.
      4) All data changes made to T1 in the primary cluster should be
         replicated to the secondary clusters, if there are any. There
         should be a configuration option to disable this, as there are
         some exceptional situations.
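
The rules above can be sketched as a small model. This is only an illustration, not Hive code: the names Table, Session, use_cluster, add_partition and the replicate flag are all hypothetical, and the "exceptions" to the write rule mentioned above are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical metastore entry for a table that spans clusters."""
    name: str
    primary: str                                   # e.g. "C1"
    secondaries: set = field(default_factory=set)  # e.g. {"C2"}
    # partition name -> set of clusters that hold a copy of it
    partitions: dict = field(default_factory=dict)


class Session:
    """Hypothetical session with a current cluster (assumed names)."""

    def __init__(self, default_cluster, replicate=True):
        self.cluster = default_cluster
        # Stand-in for the configuration option that disables replication.
        self.replicate = replicate

    def use_cluster(self, name):
        # Models the proposed "Use cluster <ClusterName>" command.
        self.cluster = name

    def add_partition(self, table, partition):
        # Rules 2 and 3: writes and new partitions only in the primary.
        if self.cluster != table.primary:
            raise PermissionError(
                f"{table.name} can only be updated in {table.primary}"
            )
        # Rule 1: the primary cluster always holds the data.
        targets = {table.primary}
        # Rule 4: replicate to secondaries unless disabled.
        if self.replicate:
            targets |= table.secondaries
        table.partitions.setdefault(partition, set()).update(targets)
```

In this sketch, a session whose current cluster is a secondary cluster of T1 is rejected when it tries to add a partition, and has to switch to C1 with use_cluster first.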

...