
...

Each configuration key below is listed with its values (the default and, where applicable, the value to turn on transactions) and notes.

hive.txn.manager
  Default: org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
  Value to turn on transactions: org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
  Notes: DummyTxnManager replicates pre-Hive-0.13 behavior and provides no transactions.

hive.txn.timeout
  Default: 300
  Notes: Time in seconds after which transactions are declared aborted if the client has not sent a heartbeat.

hive.txn.max.open.batch
  Default: 1000
  Notes: Maximum number of transactions that can be fetched in one call to open_txns().*

hive.compactor.initiator.on
  Default: false
  Value to turn on transactions: true (for exactly one instance of the Thrift metastore service)
  Notes: Whether to run the initiator and cleaner threads on this metastore instance.

hive.compactor.worker.threads
  Default: 0
  Value to turn on transactions: > 0 on at least one instance of the Thrift metastore service
  Notes: How many compactor worker threads to run on this metastore instance.**

hive.compactor.worker.timeout
  Default: 86400
  Notes: Time in seconds after which a compaction job will be declared failed and the compaction re-queued.

hive.compactor.check.interval
  Default: 300
  Notes: Time in seconds between checks to see whether any tables or partitions need to be compacted.***

hive.compactor.delta.num.threshold
  Default: 10
  Notes: Number of delta directories in a table or partition that will trigger a minor compaction.

hive.compactor.delta.pct.threshold
  Default: 0.1
  Notes: Percentage (fractional) size of the delta files relative to the base that will trigger a major compaction. 1 = 100%, so the default 0.1 = 10%.

hive.compactor.abortedtxn.threshold
  Default: 1000
  Notes: Number of aborted transactions involving a given table or partition that will trigger a major compaction.

*hive.txn.max.open.batch controls how many transactions streaming agents such as Flume or Storm open simultaneously.  The streaming agent then writes that number of entries into a single file (per Flume agent or Storm bolt).  Thus increasing this value decreases the number of delta files created by streaming agents.  But it also increases the number of open transactions that Hive has to track at any given time, which may negatively affect read performance.
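
To make the batching concrete, here is a minimal Java sketch using the hive-hcatalog-streaming API that ships with Hive; the metastore URI, database, table, and column names are placeholder values. Each fetchTransactionBatch() call corresponds to one open_txns() request, so the batch size requested here is bounded by hive.txn.max.open.batch:

    import org.apache.hive.hcatalog.streaming.DelimitedInputWriter;
    import org.apache.hive.hcatalog.streaming.HiveEndPoint;
    import org.apache.hive.hcatalog.streaming.StreamingConnection;
    import org.apache.hive.hcatalog.streaming.TransactionBatch;

    public class StreamingBatchSketch {
      public static void main(String[] args) throws Exception {
        // Placeholder metastore URI, database, and table; null partition
        // values because this example table is unpartitioned.
        HiveEndPoint endPoint = new HiveEndPoint(
            "thrift://metastore-host:9083", "testdb", "events", null);
        StreamingConnection conn = endPoint.newConnection(false);

        String[] fieldNames = {"id", "msg"};  // placeholder column names
        DelimitedInputWriter writer =
            new DelimitedInputWriter(fieldNames, ",", endPoint);

        // One fetchTransactionBatch() call opens 10 transactions at once
        // (one open_txns() request); all writes in the batch land in a
        // single delta file.
        TransactionBatch batch = conn.fetchTransactionBatch(10, writer);
        while (batch.remainingTransactions() > 0) {
          batch.beginNextTransaction();
          batch.write("1,hello".getBytes());
          batch.commit();
        }
        batch.close();
        conn.close();
      }
    }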

**Worker threads spawn MapReduce jobs to do compactions; they do not do the compactions themselves. Increasing the number of worker threads decreases the time it takes to compact tables or partitions once they are determined to need compaction, but it also increases the background load on the Hadoop cluster, since more MapReduce jobs run in the background.
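
Taken together, the "value to turn on transactions" entries above amount to a short hive-site.xml fragment. The sketch below assumes the properties are set on the metastore instance that should run the initiator, cleaner, and worker threads; the worker thread count of 1 is only an illustrative value satisfying the "> 0" requirement:

    <property>
      <name>hive.txn.manager</name>
      <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
    </property>
    <property>
      <!-- set to true on exactly one metastore instance -->
      <name>hive.compactor.initiator.on</name>
      <value>true</value>
    </property>
    <property>
      <!-- illustrative count; any value greater than 0 meets the requirement -->
      <name>hive.compactor.worker.threads</name>
      <value>1</value>
    </property>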

...