...

| Configuration key | Values | Notes |
| --- | --- | --- |
| hive.support.concurrency | Default: false. Value to turn on transactions: true | |
| hive.txn.manager | Default: org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager. Value to turn on transactions: org.apache.hadoop.hive.ql.lockmgr.DbTxnManager | DummyTxnManager replicates pre Hive-0.13 behavior and provides no transactions. |
| hive.txn.timeout | Default: 300 | Time after which transactions are declared aborted if the client has not sent a heartbeat, in seconds. |
| hive.txn.max.open.batch | Default: 1000 | Maximum number of transactions that can be fetched in one call to open_txns().* |
| hive.compactor.initiator.on | Default: false. Value to turn on transactions: true (for exactly one instance of the Thrift metastore service) | Whether to run the initiator and cleaner threads on this metastore instance. |
| hive.compactor.worker.threads | Default: 0. Value to turn on transactions: > 0 on at least one instance of the Thrift metastore service | How many compactor worker threads to run on this metastore instance.** |
| hive.compactor.worker.timeout | Default: 86400 | Time in seconds after which a compaction job will be declared failed and the compaction re-queued. |
| hive.compactor.check.interval | Default: 300 | Time in seconds between checks to see if any tables or partitions need to be compacted.*** |
| hive.compactor.delta.num.threshold | Default: 10 | Number of delta directories in a table or partition that will trigger a minor compaction. |
| hive.compactor.delta.pct.threshold | Default: 0.1 | Percentage (fractional) size of the delta files relative to the base that will trigger a major compaction. 1 = 100%, so the default 0.1 = 10%. |
| hive.compactor.abortedtxn.threshold | Default: 1000 | Number of aborted transactions involving a given table or partition that will trigger a major compaction. |
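
As a minimal sketch, the settings listed above as needed to turn on transactions could be placed in hive-site.xml roughly as follows. This assumes a deployment with a single Thrift metastore instance, so the initiator and the workers run on the same instance. The worker thread count of 1 is an illustrative choice, not a recommendation; the keys and the other values come from the table.

```xml
<!-- hive-site.xml fragment: a sketch, not a complete configuration -->
<configuration>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
  </property>
  <!-- Set to true on exactly one metastore instance -->
  <property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
  </property>
  <!-- Must be > 0 on at least one metastore instance; 1 is an assumed example value -->
  <property>
    <name>hive.compactor.worker.threads</name>
    <value>1</value>
  </property>
</configuration>
```

With more than one metastore instance, hive.compactor.initiator.on would stay false on all but one of them, while hive.compactor.worker.threads may be positive on several.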

...

If the data in your system is not owned by the Hive user (i.e., the user that the Hive metastore runs as), then Hive needs permission to run as the user who owns the data in order to perform compactions.  If you have already set up HiveServer2 to impersonate users, the only additional work is to ensure that Hive has the right to impersonate users from the host running the Hive metastore.  This is done by adding that host's name to hadoop.proxyuser.hive.hosts in Hadoop's core-site.xml file.  If you have not already set up impersonation, you will need to configure Hive to act as a proxy user.  This requires you to set up keytabs for the user running the Hive metastore and to add hadoop.proxyuser.hive.hosts and hadoop.proxyuser.hive.groups to Hadoop's core-site.xml file.  See the Hadoop documentation on secure mode for your version of Hadoop (e.g., for Hadoop 2.5.1, see Hadoop in Secure Mode).
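
For illustration, the two proxy-user entries in core-site.xml might look like the sketch below. The hostname metastore-host.example.com and the group name datausers are hypothetical placeholders; substitute the host that runs your Hive metastore and the groups whose members own the data. Keytab setup is not shown.

```xml
<!-- core-site.xml fragment: a sketch with placeholder values -->
<configuration>
  <!-- Hosts from which the hive user may impersonate other users (hypothetical hostname) -->
  <property>
    <name>hadoop.proxyuser.hive.hosts</name>
    <value>metastore-host.example.com</value>
  </property>
  <!-- Groups whose members the hive user may impersonate (hypothetical group name) -->
  <property>
    <name>hadoop.proxyuser.hive.groups</name>
    <value>datausers</value>
  </property>
</configuration>
```

If HiveServer2 impersonation is already configured, these properties likely exist already, and only the metastore host needs to be appended to the hosts value.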

Table Properties

If a table is to be used in ACID writes (insert, update, delete), then the table property TRANSACTIONAL must be set on that table.  Without this property, inserts are done in the old style, and updates and deletes are prohibited.

If a table owner does not want the system to determine automatically when to compact, the table property NO_AUTO_COMPACTION can be set.  This prevents all automatic compactions.  Manual compactions can still be done with Alter Table/Partition Compact statements.

...