...

Hive Replication builds on the MetaStore event and ExIm features to provide a framework for replicating Hive metadata and data changes between clusters. The source cluster and the replica are not required to run the same Hadoop distribution, Hive version, or MetaStore RDBMS. The replication system has a fairly 'light touch': it exhibits a low degree of coupling and uses the hive-metastore Thrift service as its integration point. However, the current implementation is not an 'out of the box' solution. In particular, you must provide some kind of orchestration service that is responsible for requesting replication tasks and executing them.

Potential uses

  • Disaster recovery clusters.
  • Copying data into clouds for off-premises processing.

Prerequisites

  • You must be running Hive 1.1.0 or later at your replication source (for DbNotificationListener support).
  • You must be running Hive 0.8.0 or later at your replication destination (for IMPORT support).
  • You'll require Hive 1.2.0 or later JAR dependencies to instantiate and execute ReplicationTasks. This is not a cluster requirement; you'll need these JARs only for the service that orchestrates the replication.
  • You will initially require administration privileges on the source cluster to enable the writing of notifications to the MetaStore database.
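
The version constraints above can be expressed as a small pre-flight check that an orchestration service might run before it schedules any work. The following is only a minimal sketch: the class name ReplicationPrereqs and its methods are hypothetical and not part of Hive, and it assumes version strings in the usual dotted numeric form, optionally followed by a vendor suffix after a hyphen. Vendor distributions sometimes backport features, so treat a failed check as a prompt to investigate rather than as a definitive answer.

```java
import java.util.Arrays;

// Hypothetical helper (not part of Hive): checks the minimum versions
// listed in the prerequisites above.
class ReplicationPrereqs {
    static final String MIN_SOURCE = "1.1.0";       // DbNotificationListener support
    static final String MIN_DESTINATION = "0.8.0";  // IMPORT support
    static final String MIN_CLIENT_JARS = "1.2.0";  // ReplicationTask support

    // Parses "1.1.0" or "1.1.0-vendor-suffix" into numeric components.
    static int[] parse(String version) {
        String core = version.split("-", 2)[0];
        return Arrays.stream(core.split("\\."))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    // Returns true if version >= minimum, comparing component by component
    // and treating missing trailing components as zero.
    static boolean atLeast(String version, String minimum) {
        int[] v = parse(version);
        int[] m = parse(minimum);
        int n = Math.max(v.length, m.length);
        for (int i = 0; i < n; i++) {
            int a = i < v.length ? v[i] : 0;
            int b = i < m.length ? m[i] : 0;
            if (a != b) {
                return a > b;
            }
        }
        return true;
    }

    // Returns true only if all three version prerequisites are satisfied.
    static boolean canReplicate(String source, String destination, String clientJars) {
        return atLeast(source, MIN_SOURCE)
                && atLeast(destination, MIN_DESTINATION)
                && atLeast(clientJars, MIN_CLIENT_JARS);
    }

    public static void main(String[] args) {
        // Example values only; substitute the versions of your own clusters.
        System.out.println(canReplicate("1.2.1", "0.13.1", "1.2.1"));
    }
}
```

Components are compared numerically rather than as strings, so a destination at 0.13.1 correctly satisfies the 0.8.0 minimum. The check covers versions only; the administration privileges needed on the source cluster still have to be confirmed separately.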

...