Versions Compared

...

In the commit_txn request, we will add an optional boolean field, writeManaged, which indicates that the query modifies managed tables in the cache. If it is false, HMS won't fetch the write id for the transaction from the db. Note that this is a performance optimization and does not affect correctness.

In the commit_txn request, we need to pass a ValidTxnWriteIdList, which is a list of the write ids of all modified tables. HMS will use it to tag the cache entries.

hive_metastore.thrift

Old API

New API

commitTxn(long txnid)

commitTxn(long txnid, boolean writeManaged)

commitTxn(long txnid, string validWriteIdList)

RawStore

For all read methods, ObjectStore will compare the additional validWriteIdList field against the cached ValidWriteIdList.

...

  1. Generate a new write id for every write operation involving managed tables. Because DbTxnManager caches the write id for every transaction, each query generates at most one new write id per table, even if it makes multiple Hive.java write API calls
  2. Retrieve the table write id from config for every read operation, if it exists (for a managed table, it is guaranteed to be in config), and pass the write id to the HMS API
  3. Pass the write ids of all modified tables when committing the transaction
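The "at most one new write id per table per transaction" behaviour in step 1 can be illustrated with a small sketch. This is not the real DbTxnManager: the class name, the method names, and the single in-memory counter standing in for the metastore's write id allocator are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a per-transaction write id cache; not the real DbTxnManager.
class TxnWriteIdCacheSketch {
    // Simplified allocator (assumption): every allocation hands out a fresh id.
    private final AtomicLong nextWriteId = new AtomicLong(1);
    // Write ids already allocated in this transaction, keyed by full table name.
    private final Map<String, Long> allocated = new HashMap<>();

    // Step 1: allocate a write id only on the first write to a table in this transaction.
    long getTableWriteId(String fullTableName) {
        return allocated.computeIfAbsent(fullTableName, t -> nextWriteId.getAndIncrement());
    }

    // Step 3: the write ids of all modified tables, to be passed along at commit time.
    Map<String, Long> modifiedTableWriteIds() {
        return new HashMap<>(allocated);
    }

    public static void main(String[] args) {
        TxnWriteIdCacheSketch txn = new TxnWriteIdCacheSketch();
        long first = txn.getTableWriteId("db1.tbl1");
        long second = txn.getTableWriteId("db1.tbl1"); // second write API call, same table
        txn.getTableWriteId("db1.tbl2");
        System.out.println("same id reused: " + (first == second));
        System.out.println("modified tables: " + txn.modifiedTableWriteIds().keySet());
    }
}
```

Repeated writes to the same table within one transaction reuse the cached id, so the map handed over at commit holds exactly one entry per modified table.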

Changes in Other Components

...

For every read request involving tables/partitions, the HMS client (HiveMetaStoreClient) needs to pass a validWriteIdList string in addition to the existing arguments. validWriteIdList can be null for an external table, since HMS returns whatever is in the cache for external tables under eventual consistency. But if validWriteIdList is null for a managed table, HMS will throw an exception. validWriteIdList is a serialized form of ValidReaderWriteIdList. Usually, the ValidReaderWriteIdList can be obtained from HiveTxnManager, by way of a ValidTxnWriteIdList, using the following code snippet:

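Code Block
language: java
// txnTables: full names of the tables the query touches (assumed to be built by the caller)
ValidTxnList txnIds = txnMgr.getValidTxns(); // get global transaction state
String txnString = txnIds.writeToString(); // serialize the global transaction state
ValidTxnWriteIdList txnWriteIds = txnMgr.getValidWriteIds(txnTables, txnString); // map global transaction state to table specific write id
ValidWriteIdList writeids = txnWriteIds.getTableValidWriteIdList(fullTableName); // get table specific writeid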
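The null-handling rule for read requests described above (null is accepted for external tables, rejected for managed tables) can be sketched as follows. The class, the enum, and the in-memory map are hypothetical stand-ins rather than HMS APIs, and the comparison against the cached write id list is deliberately left out because its details are not given here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the read-path rule; none of these names are real HMS APIs.
class ReadRequestRuleSketch {
    enum TableKind { MANAGED, EXTERNAL }

    // Stand-in for the metastore cache: full table name -> cached metadata.
    private final Map<String, String> cache = new HashMap<>();

    void put(String fullTableName, String metadata) {
        cache.put(fullTableName, metadata);
    }

    String getTable(String fullTableName, TableKind kind, String validWriteIdList) {
        if (kind == TableKind.EXTERNAL) {
            // External tables are eventually consistent: return whatever is cached.
            return cache.get(fullTableName);
        }
        if (validWriteIdList == null) {
            // Managed tables must always come with a write id list.
            throw new IllegalArgumentException(
                "validWriteIdList is required for managed table " + fullTableName);
        }
        // Placeholder (assumption): a real implementation would compare validWriteIdList
        // with the cached list here and fall back to the db when the entry is unusable.
        return cache.get(fullTableName);
    }

    public static void main(String[] args) {
        ReadRequestRuleSketch hms = new ReadRequestRuleSketch();
        hms.put("db1.ext", "external-metadata");
        hms.put("db1.mgd", "managed-metadata");
        System.out.println(hms.getTable("db1.ext", TableKind.EXTERNAL, null));
        try {
            hms.getTable("db1.mgd", TableKind.MANAGED, null);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```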

For every managed table write, advance the writeid for the table:

Code Block
language: java
AcidUtils.advanceWriteId(conf, tbl);

When committing the transaction, pass the write ids of all modified tables as a serialized ValidTxnWriteIdList string.

Optionally, the HMS client (HiveMetaStoreClient) can set the writeManaged flag in the commit transaction request (commitTxn) if the transaction modifies any managed table/partition. Leaving the flag unset saves HMS a db fetch for read-only queries and for DDL/DML on external tables. If the flag is wrongly set to true (e.g., a read-only query claims it modifies a managed table), there is a performance penalty when processing the commit message. If it is wrongly set to false (e.g., a DDL on a managed table claims it does not touch a managed table), the cache entry is not marked available, so every read has to go to the db. In both scenarios, there is no correctness issue.
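The effect of the writeManaged flag, including the two "wrongly set" cases, can be walked through with a toy model. Everything in it is assumed for illustration: the class, the fetch counter, and the set of "available" cache entries are simplifications, not the HMS implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the commit path; names and structure are illustrative only.
class CommitFlagSketch {
    // Simulated db: txn id -> managed tables the transaction actually wrote to.
    private final Map<Long, List<String>> writtenTablesByTxn = new HashMap<>();
    // Cache entries that later reads may be served from.
    private final Set<String> availableInCache = new HashSet<>();
    int dbFetches = 0; // counts simulated db round trips made during commit

    void recordWrite(long txnId, String table) {
        writtenTablesByTxn.computeIfAbsent(txnId, k -> new ArrayList<>()).add(table);
    }

    void commitTxn(long txnId, boolean writeManaged) {
        if (!writeManaged) {
            return; // flag unset: skip the write id fetch entirely
        }
        dbFetches++; // fetch the transaction's write ids from the db
        List<String> tables = writtenTablesByTxn.getOrDefault(txnId, Collections.emptyList());
        availableInCache.addAll(tables); // mark the modified entries as available
    }

    boolean readServedFromCache(String table) {
        return availableInCache.contains(table);
    }

    public static void main(String[] args) {
        CommitFlagSketch hms = new CommitFlagSketch();
        hms.commitTxn(1L, true);            // read-only query wrongly sets the flag
        System.out.println("wasted fetches: " + hms.dbFetches);
        hms.recordWrite(2L, "db1.mgd");
        hms.commitTxn(2L, false);           // DDL on a managed table wrongly clears the flag
        System.out.println("served from cache: " + hms.readServedFromCache("db1.mgd"));
    }
}
```

In this model a wrong `true` costs one unnecessary fetch, and a wrong `false` leaves the entry unavailable so reads fall through to the db. Neither case returns stale data, which matches the claim that only performance is affected.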