Comment: adding config params

...

This is the approach we expect to take. It requires one further change to the current export semantics. Currently, export writes only one _metadata file per table, and that file lists all of the table's partitions. We propose instead to split this up so that the _metadata file at each level contains metadata only for the object at that level. The table-level _metadata will then describe only the table object, and each partition directory inside the table directory will carry its own partition-level _metadata.
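
As a rough illustration of the proposed split, the layout under an exported table could look like the sketch below. The angle-bracket names are placeholders rather than actual directory names, and the location of data files is left out because the text above does not specify it.

```
<table dir>/
    _metadata                  table object only, no partition list
    <partition dir 1>/
        _metadata              metadata for partition 1 only
    <partition dir 2>/
        _metadata              metadata for partition 2 only
```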

 

Setup/Configuration

The following parameters need to be set on the source cluster:

hive.metastore.transactional.event.listeners = org.apache.hive.hcatalog.listener.DbNotificationListener

hive.metastore.dml.events = true
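
A minimal sketch of how these two settings might be written, assuming they go in the hive-site.xml used by the metastore on the source cluster. The file placement is an assumption; only the property names and values come from the list above.

```xml
<!-- Assumed location: hive-site.xml on the source cluster's metastore -->
<property>
  <name>hive.metastore.transactional.event.listeners</name>
  <value>org.apache.hive.hcatalog.listener.DbNotificationListener</value>
</property>
<property>
  <name>hive.metastore.dml.events</name>
  <value>true</value>
</property>
```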


Other repl v2 related parameters are listed below with their default values. The defaults should work in most cases:

REPLDIR("hive.repl.rootdir", "/user/hive/repl/", "HDFS root dir for all replication dumps."),
REPLCMENABLED("hive.repl.cm.enabled", false, "Turn on ChangeManager, so delete files will go to cmrootdir."),
REPLCMDIR("hive.repl.cmrootdir", "/user/hive/cmroot/", "Root dir for ChangeManager, used for deleted files."),
REPLCMRETIAN("hive.repl.cm.retain", "24h", new TimeValidator(TimeUnit.HOURS), "Time to retain removed files in cmrootdir."),
REPLCMINTERVAL("hive.repl.cm.interval", "3600s", new TimeValidator(TimeUnit.SECONDS), "Interval for cmroot cleanup thread."),