Moving the code from Hive into a new project is not straightforward and will take some time.  The following steps are proposed:

  1. A new TLP is established.  As mentioned above, any existing Hive PMC members will be welcome to join the PMC, and any existing Hive committers will be granted committership in the new project.

  2. Hive begins the process of detangling the metastore code inside the Hive project.  This will be done inside Hive to avoid a period in which the code lives in both Hive and the new project, which would require patching every new feature or bug fix twice.
    To enable the new project to begin adding layers around the core metastore and to make releases, Hive can make source-only releases of just the metastore code during this interim period, similar to how the storage-api is released now.  The new project can then depend on those releases.

  3. Once the detangling is complete and Hive is satisfied that the result works, the code will be moved from Hive to the new project.

There are many technical questions about how to separate out the code.  These mainly center on which pieces of code should be moved into the new project, and on whether the new project continues to depend on Hive’s storage-api (as ORC does today) or instead copies any code that both it and Hive require (such as parts of the shim layer) in order to avoid any Hive dependencies.  There are also places where the metastore "calls back" into QL via reflection (e.g. partition expression evaluation).  We will need to determine how to continue this without pulling a dependency on all of Hive into the new project.  Discussions and decisions on this will happen throughout the process via the normal methods.
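
One possible shape for the reflection "call back" is sketched below.  This is only an illustration under stated assumptions, not the existing Hive code: the names CallbackLoader, PartitionFilter and AcceptAll are hypothetical.  The idea is that the new project owns a narrow interface and loads an implementation by class name at runtime, so it never needs a compile-time dependency on Hive's QL module.

```java
import java.util.Objects;

public class CallbackLoader {

    /** Hypothetical narrow interface that the new project itself would own. */
    public interface PartitionFilter {
        boolean matches(String partitionName);
    }

    /** Stand-in for an implementation that would really live in Hive's QL code. */
    public static class AcceptAll implements PartitionFilter {
        @Override
        public boolean matches(String partitionName) {
            return true;
        }
    }

    /**
     * Loads an implementation by class name (assumed to come from configuration).
     * Only the interface is referenced at compile time.
     */
    public static PartitionFilter load(String className) {
        Objects.requireNonNull(className, "className");
        try {
            Class<?> raw = Class.forName(className);
            return raw.asSubclass(PartitionFilter.class)
                      .getDeclaredConstructor()
                      .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load partition filter: " + className, e);
        }
    }

    public static void main(String[] args) {
        // Default to the stand-in; a deployment would pass the real class name.
        String impl = args.length > 0 ? args[0] : AcceptAll.class.getName();
        PartitionFilter filter = load(impl);
        System.out.println(filter.matches("ds=2017-06-30"));
    }
}
```

With this arrangement the implementing class only has to be on the classpath at runtime.  Whether that class ships with Hive or is copied into the new project is one of the open questions above.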

Backwards Compatibility

There are already many users of the Hive metastore outside of Hive.  We do not want to break backwards compatibility for those users.  Our goal will be to make sure there is a binary-compatible metastore client available for these users that supports interoperation across versions of the metastore, both in Hive and as a standalone system.  Another possible approach is to ensure that the Thrift interface continues to accept old clients (e.g. Hive 1.x and 2.x), rather than focusing on binary or source compatibility of the Hive client itself.

Project Name

The following names have been suggested for this project:

  • Flora
  • Honeycomb
  • Metastore (NOTE: there are concerns that this would be too generic for Apache to defend the trademark and that it would not be clear enough to users that this was no longer just the Hive metastore)
  • Omegastore
  • Riven
  • ZCatalog