...
Configuration key | Values | Notes |
hive.txn.manager | Default: org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager; value to turn on transactions: org.apache.hadoop.hive.ql.lockmgr.DbTxnManager | DummyTxnManager replicates pre-Hive-0.13 behavior and provides no transactions. |
hive.txn.timeout | Default: 300 | Time in seconds after which a transaction is declared aborted if the client has not sent a heartbeat. |
hive.txn.max.open.batch | Default: 1000 | Maximum number of transactions that can be fetched in one call to open_txns().* |
hive.compactor.initiator.on | Default: false; value to turn on transactions: true (for exactly one instance of the Thrift metastore service) | Whether to run the initiator and cleaner threads on this metastore instance. |
hive.compactor.worker.threads | Default: 0; value to turn on transactions: > 0 on at least one instance of the Thrift metastore service | How many worker threads to run on this metastore instance.** |
hive.compactor.worker.timeout | Default: 86400 | Time in seconds after which a compaction job will be declared failed and the compaction re-queued. |
hive.compactor.check.interval | Default: 300 | Time in seconds between checks to see if any partitions need to be compacted.*** |
hive.compactor.delta.num.threshold | Default: 10 | Number of delta directories in a partition that will trigger a minor compaction. |
hive.compactor.delta.pct.threshold | Default: 0.1 | Fractional size of the deltas relative to the base that will trigger a major compaction. 1 = 100%, so the default 0.1 = 10%. |
hive.compactor.abortedtxn.threshold | Default: 1000 | Number of aborted transactions on a given partition that will trigger a major compaction. |
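
The three threshold settings in the table can be read as a decision rule per partition. The following is a minimal Python sketch of that rule, not Hive's actual initiator code: the function name, the strict `>` comparisons, the order of the checks, and the handling of a partition without a base are all assumptions made for illustration. Only the default values come from the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CompactionThresholds:
    """Defaults copied from the configuration table."""
    delta_num_threshold: int = 10      # hive.compactor.delta.num.threshold
    delta_pct_threshold: float = 0.1   # hive.compactor.delta.pct.threshold
    abortedtxn_threshold: int = 1000   # hive.compactor.abortedtxn.threshold


def compaction_needed(
    num_deltas: int,
    delta_bytes: int,
    base_bytes: int,
    aborted_txns: int,
    t: CompactionThresholds = CompactionThresholds(),
) -> Optional[str]:
    """Return "major", "minor", or None for one partition.

    Hypothetical helper: strict ">" and the check order are assumptions.
    """
    # Too many aborted transactions on the partition: major compaction.
    if aborted_txns > t.abortedtxn_threshold:
        return "major"
    # Deltas are large relative to the base: major compaction.
    # Assumption: skip the ratio check when there is no base yet.
    if base_bytes > 0 and delta_bytes / base_bytes > t.delta_pct_threshold:
        return "major"
    # Many delta directories: minor compaction.
    if num_deltas > t.delta_num_threshold:
        return "minor"
    return None
```

For example, a partition with 11 delta directories whose deltas total 5% of the base size would qualify for a minor compaction under these assumptions, while one whose deltas total 20% of the base would qualify for a major compaction.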
*hive.txn.max.open.batch controls how many transactions streaming agents such as Flume or Storm open simultaneously. The streaming agent then writes that number of entries into a single file (per Flume agent or Storm bolt). Increasing this value therefore decreases the number of files created by streaming agents, but it also increases the number of open transactions that Hive has to track, which may negatively affect read performance.
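
The file-count side of this trade-off is simple arithmetic. The sketch below is illustrative only: `files_per_agent` is a hypothetical helper, and it assumes that each agent always opens a full batch and writes every transaction in that batch to one file.

```python
def files_per_agent(total_txns: int, max_open_batch: int = 1000) -> int:
    """Approximate number of files one streaming agent writes.

    Assumption: one file per batch of opened transactions, with every
    batch filled to hive.txn.max.open.batch (default 1000).
    """
    if total_txns < 0:
        raise ValueError("total_txns must be non-negative")
    if max_open_batch <= 0:
        raise ValueError("max_open_batch must be positive")
    # Integer ceiling division: a partially used batch still yields a file.
    return (total_txns + max_open_batch - 1) // max_open_batch


if __name__ == "__main__":
    for batch in (100, 1000, 5000):
        print(batch, files_per_agent(10_000, batch))
```

Under these assumptions, an agent that writes 10,000 transactions produces 100 files with a batch size of 100 but only 10 files with the default of 1000. The cost is that up to 1000 transactions per agent are open at once.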
...