...
The commit_txn request gains an optional boolean field, writemanaged, which indicates whether the query modifies managed tables in the cache. If it is false, HMS does not fetch the write id for the transaction from the db. This is purely a performance optimization and does not affect correctness.
hive_metastore.thrift
Old API | New API |
commitTxn(long txnid) | commitTxn(long txnid, boolean writemanaged) |
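The effect of the new flag can be illustrated with a minimal sketch. This is not the actual HMS implementation: the class CommitTxnSketch, the method recordWriteId, the in-memory map standing in for the db, and the dbLookups counter are all hypothetical. The sketch also assumes that an unset writemanaged field is treated the same as true, which the design above does not state explicitly.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these are not the real HMS classes.
final class CommitTxnSketch {
    // Stand-in for the backing db: txnid -> (table name -> write id).
    private final Map<Long, Map<String, Long>> writeIdsByTxn = new HashMap<>();
    // Counts simulated db round trips so the saving is observable.
    int dbLookups = 0;

    // Records that a transaction was allocated a write id for a table.
    void recordWriteId(long txnId, String table, long writeId) {
        writeIdsByTxn.computeIfAbsent(txnId, k -> new HashMap<>()).put(table, writeId);
    }

    // writeManaged is a nullable Boolean to model the optional field.
    // Assumption: an unset (null) field falls back to doing the lookup.
    Map<String, Long> commitTxn(long txnId, Boolean writeManaged) {
        if (Boolean.FALSE.equals(writeManaged)) {
            // No managed table was modified, so skip the write id lookup.
            return Collections.emptyMap();
        }
        dbLookups++;
        return writeIdsByTxn.getOrDefault(txnId, Collections.emptyMap());
    }

    public static void main(String[] args) {
        CommitTxnSketch hms = new CommitTxnSketch();
        hms.recordWriteId(1L, "db.managed_tbl", 5L);
        System.out.println(hms.commitTxn(1L, false)); // lookup skipped
        System.out.println(hms.commitTxn(1L, true));  // lookup performed
        System.out.println("db lookups: " + hms.dbLookups);
    }
}
```

Because skipping the lookup only avoids work for transactions that touched no managed table, a caller that always passes true (or leaves the field unset, under the assumption above) still gets correct behavior, just with the extra round trip.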
RawStore
ObjectStore will use the additional validWriteIdList field in all read methods, comparing it with the cached ValidWriteIdList.
...
- Generate a new write id for every write operation involving managed tables. Because DbTxnManager caches the write id for each transaction, a query generates at most one new write id per table, even if it makes multiple Hive.java write API calls
- For every read operation, retrieve the table write id from the config if it exists (for a managed table, it is guaranteed to be in the config), and pass the write id to the HMS API
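The "at most one new write id per table" property in the first bullet follows from caching inside the transaction. A minimal sketch of that idea is below. It is not the real DbTxnManager: the class TxnWriteIdCache, the method getTableWriteId, and the allocator function (standing in for the call that asks HMS for a new write id) are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of per-transaction write id caching; not the real DbTxnManager.
final class TxnWriteIdCache {
    // Write ids already allocated in this transaction, keyed by table name.
    private final Map<String, Long> writeIdByTable = new HashMap<>();
    // Stand-in for the call that allocates a new write id for a table.
    private final Function<String, Long> allocator;

    TxnWriteIdCache(Function<String, Long> allocator) {
        this.allocator = allocator;
    }

    // Returns the cached write id for the table, allocating one only on first use.
    long getTableWriteId(String table) {
        return writeIdByTable.computeIfAbsent(table, allocator);
    }
}
```

With one such cache per transaction, repeated write calls against the same table reuse the first write id, and only a write to a different table triggers another allocation.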
Changes in Other Projects
All other components that invoke the HMS API directly (bypassing Hive.java) will be changed to invoke the newer HMS API. This includes HCatalog and Hive streaming, as well as other projects that use the HMS client, such as Impala.
...