...
Configuration key | Values | Notes |
hive.support.concurrency | Default: false; value to turn on transactions: true | |
hive.txn.manager | Default: org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager; value to turn on transactions: org.apache.hadoop.hive.ql.lockmgr.DbTxnManager | DummyTxnManager replicates pre-Hive-0.13 behavior and provides no transactions. |
hive.txn.timeout | Default: 300 | Time after which transactions are declared aborted if the client has not sent a heartbeat, in seconds. |
hive.txn.max.open.batch | Default: 1000 | Maximum number of transactions that can be fetched in one call to open_txns().* |
hive.compactor.initiator.on | Default: false; value to turn on transactions: true (for exactly one instance of the Thrift metastore service) | Whether to run the initiator and cleaner threads on this metastore instance. |
hive.compactor.worker.threads | Default: 0; value to turn on transactions: > 0 on at least one instance of the Thrift metastore service | How many compactor worker threads to run on this metastore instance.** |
hive.compactor.worker.timeout | Default: 86400 | Time in seconds after which a compaction job will be declared failed and the compaction re-queued. |
hive.compactor.check.interval | Default: 300 | Time in seconds between checks to see if any tables or partitions need to be compacted.*** |
hive.compactor.delta.num.threshold | Default: 10 | Number of delta directories in a table or partition that will trigger a minor compaction. |
hive.compactor.delta.pct.threshold | Default: 0.1 | Percentage (fractional) size of the delta files relative to the base that will trigger a major compaction. 1 = 100%, so the default 0.1 = 10%. |
hive.compactor.abortedtxn.threshold | Default: 1000 | Number of aborted transactions involving a given table or partition that will trigger a major compaction. |
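As a rough illustration, the four settings whose Values column lists a "value to turn on transactions" could be written in hive-site.xml as in the sketch below. This sketch assumes a deployment with a single Thrift metastore instance, so the initiator and the workers share one file. The worker thread count of 1 is an arbitrary example, since the table only requires a value greater than 0. All other keys are left at their defaults.

```xml
<!-- hive-site.xml fragment: a minimal sketch, assuming one metastore instance -->
<configuration>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
  <property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
  </property>
  <!-- Initiator and cleaner threads: enable on exactly one metastore instance -->
  <property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
  </property>
  <!-- Must be greater than 0 on at least one metastore instance; 1 is illustrative -->
  <property>
    <name>hive.compactor.worker.threads</name>
    <value>1</value>
  </property>
</configuration>
```

With several metastore instances, hive.compactor.initiator.on would be set to true on only one of them, while hive.compactor.worker.threads may be greater than 0 on any number of them.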
...
If the data in your system is not owned by the Hive user (i.e., the user that the Hive metastore runs as), then Hive will need permission to run as the user who owns the data in order to perform compactions. If you have already set up HiveServer2 to impersonate users, then the only additional work is to ensure that Hive has the right to impersonate users from the host running the Hive metastore. This is done by adding the hostname to hadoop.proxyuser.hive.hosts in Hadoop's core-site.xml file. If you have not already set up impersonation, then you will need to configure Hive to act as a proxy user. This requires you to set up keytabs for the user running the Hive metastore and to add hadoop.proxyuser.hive.hosts and hadoop.proxyuser.hive.groups to Hadoop's core-site.xml file. See the Hadoop documentation on secure mode for your version of Hadoop (e.g., for Hadoop 2.5.1 it is at Hadoop in Secure Mode).
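For orientation, the two proxy-user entries named above might look like the following in core-site.xml. This is a sketch, not a recommended configuration. It assumes the metastore runs as the user hive, which is what the property names imply. The hostname metastore-host.example.com and the group name data-owners are hypothetical placeholders. Keytab setup is not shown.

```xml
<!-- core-site.xml fragment: a sketch with placeholder values -->
<configuration>
  <!-- Hosts from which the hive user may impersonate others (hypothetical hostname) -->
  <property>
    <name>hadoop.proxyuser.hive.hosts</name>
    <value>metastore-host.example.com</value>
  </property>
  <!-- Groups whose members the hive user may impersonate (hypothetical group name) -->
  <property>
    <name>hadoop.proxyuser.hive.groups</name>
    <value>data-owners</value>
  </property>
</configuration>
```

The hosts value should include the host running the Hive metastore, and the groups value should cover the users who own the data to be compacted.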
Table Properties
If a table is to be used in ACID writes (insert, update, delete), then the table property TRANSACTIONAL must be set on that table. Without this property, inserts will be done in the old style, and updates and deletes will be prohibited.
If a table owner does not wish the system to automatically determine when to compact, then the table property NO_AUTO_COMPACTION can be set. This will prevent all automatic compactions. Manual compactions can still be done with Alter Table/Partition Compact statements.
...