Versions Compared

Comment: HIVE-21762: Changes in REPL commands to support tables include/exclude list.

...

Similar to case (a), but sets up db-level replication that excludes the table/view 'Q4' and every table/view whose name consists of the prefix 'T' followed by a numeric suffix of any length, for example 'T3', 'T400', and 't255'. Table/view names are case-insensitive, so names with the prefix 't' are also excluded from the dump.
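The include/exclude behaviour described above can be illustrated with a small Python sketch. This is not Hive's implementation: the function name `in_scope` and the sample table list are hypothetical, and the sketch assumes that each pattern must match the whole table/view name, ignoring case.

```python
import re

def in_scope(table, include, exclude=()):
    """Return True if `table` matches an include pattern and no exclude pattern."""
    def matches(patterns):
        # Assumption: whole-name, case-insensitive matching.
        return any(re.fullmatch(p, table, re.IGNORECASE) for p in patterns)
    return matches(include) and not matches(exclude)

include = [".*?"]             # every table/view in the database
exclude = ["T[0-9]+", "Q4"]   # 'Q4' and 'T' followed only by digits

# Hypothetical table names, used only for illustration.
tables = ["T3", "T400", "t255", "Q4", "T3x", "stores"]
replicated = [t for t in tables if in_scope(t, include, exclude)]
print(replicated)
```

Under these assumptions, 'T3', 'T400', 't255' and 'Q4' are filtered out, while a name such as 'T3x' stays in scope because its suffix is not purely numeric.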

...

This is an example of changing the replication policy/scope dynamically during an incremental replication cycle.

In the first case, a full DB replication policy "sales" is changed to a policy that includes only tables/views whose names consist solely of letters, "sales.['[a-z]+']", such as "stores" and "products". A REPL LOAD using this dump drops the tables that the new policy excludes. For instance, a table named 'T5' is automatically dropped during REPL LOAD if it already exists in the target cluster.
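As a rough illustration of the effect of such a policy change, the following Python sketch computes which tables already present on the target would fall outside the new policy. The helper names and the set of target tables are hypothetical, and whole-name, case-insensitive matching is an assumption rather than a statement about how Hive evaluates the policy.

```python
import re

def matches_policy(table, include):
    # Assumption: a table is in scope if any include pattern matches its whole name.
    return any(re.fullmatch(p, table, re.IGNORECASE) for p in include)

def tables_to_drop(target_tables, new_include):
    """Tables present on the target that the new policy no longer covers."""
    return sorted(t for t in target_tables if not matches_policy(t, new_include))

# Hypothetical state of the target database before the policy change is loaded.
target_tables = {"stores", "products", "T5"}

# New policy: only names made up entirely of letters.
dropped = tables_to_drop(target_tables, ["[a-z]+"])
print(dropped)
```

In this sketch only 'T5' is reported, since its name contains a digit and therefore does not match '[a-z]+'.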

...

This causes a REPL DUMP present in <dirname> (which must be a fully qualified HDFS URL) to be pulled and loaded. If <dbname> is specified and the original dump was a database-level dump, Hive performs db-rename-mapping on import. If <dbname>.<tablename> is specified and the original dump was a table-level dump, Hive performs table-rename-mapping on import. If neither dbname nor tablename is specified, the original dbname and tablename recorded in the dump are used.

The REPL LOAD command has an optional WITH clause to set command-specific configurations to be used when trying to copy from the source cluster. These configurations are only used by the corresponding REPL LOAD command and won't be used for other queries running in the same session.

...

REPL STATUS

REPL STATUS <dbname>[.<tablename>];


Returns the same output that REPL LOAD returns, which allows REPL LOAD to be run asynchronously. If nothing is known about a replication associated with that db / db.tbl, i.e., there are no known replications for it, an empty set is returned. Note that when a destination db or table exists but no known replication exists for it, tools calling REPL LOAD should treat this as an error condition and pass it on to the end user, alerting them that they may be overwriting an existing db/table with another.
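A tool that wraps REPL LOAD might apply the rule above along the following lines. This is only a sketch: the function, the exception class, the returned labels, and the shape of the status rows are all hypothetical and not part of Hive. It assumes the tool has already run REPL STATUS and separately checked whether the destination db or table exists.

```python
class OverwriteRisk(Exception):
    """Destination exists but REPL STATUS knows of no replication for it."""

def classify_destination(destination_exists, repl_status_rows):
    """Decide how a calling tool should treat the destination before REPL LOAD."""
    if repl_status_rows:
        # REPL STATUS returned something: a replication is already known.
        return "known-replication"
    if destination_exists:
        # Empty result for an existing object: surface this to the end user.
        raise OverwriteRisk("destination exists but has no known replication")
    # Empty result and nothing at the destination.
    return "no-destination"

# Hypothetical inputs; the row content is a placeholder.
print(classify_destination(True, [("1400",)]))
print(classify_destination(False, []))
```

Raising an exception for the "exists but unknown" case keeps the decision to overwrite with the end user instead of letting the tool proceed silently.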

Bootstrap, Revisited

...