
...

Maximum number of bytes a script is allowed to emit to standard error (per map-reduce task). This prevents runaway scripts from filling log partitions to capacity.

hive.script.auto.progress
  • Default Value: false
  • Added In: Hive 0.4.0

Whether the Hive Transform/Map/Reduce clause should automatically send progress information to TaskTracker to avoid the task getting killed because of inactivity. Hive sends progress information when the script is outputting to stderr. This option removes the need to periodically produce stderr messages, but users should be cautious because this may prevent TaskTracker from killing tasks with infinite loops in the scripts.
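A minimal sketch of enabling this for a streaming query (the table, columns, and script name are hypothetical):

```sql
-- Keep TaskTracker from killing a script that is healthy but quiet.
SET hive.script.auto.progress=true;

-- 'slow_filter.py' is a hypothetical user script that may run for
-- minutes without emitting anything to stderr.
SELECT TRANSFORM (userid, page)
  USING 'python slow_filter.py'
  AS (userid, page)
FROM page_views;
```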

hive.exec.script.allow.partial.consumption

...

By default, all values in the HiveConf object are converted to environment variables of the same name as the key (with '.' (dot) converted to '_' (underscore)) and set as part of the script operator's environment. However, some values can grow large or are not amenable to translation to environment variables. This value gives a comma-separated list of configuration values that will not be set in the environment when calling a script operator. By default the valid transaction list is excluded, as it can grow large and is sometimes compressed, which does not translate well to an environment variable.

Also see:
  • SerDes for more hive.script.* configuration properties
hive.exec.compress.output

...

For conditional joins, if input stream from a small alias can be directly applied to the join operator without filtering or projection, the alias need not be pre-staged in the distributed cache via a mapred local task. Currently, this is not working with vectorization or Tez execution engine.
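A sketch of the join shape this applies to, with hypothetical table names; the small side (`dim_dept`) is consumed by the join without any filter or projection:

```sql
-- Let Hive convert this to a conditional map join at runtime.
SET hive.auto.convert.join=true;

-- dim_dept is the small alias; if its rows feed the join operator
-- directly, pre-staging it in the distributed cache via a map-reduce
-- local task can be skipped.
SELECT f.userid, d.dept_name
FROM fact_events f
JOIN dim_dept d ON f.dept_id = d.dept_id;
```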

hive.udtf.auto.progress
  • Default Value: false
  • Added In: Hive 0.5.0

Whether Hive should automatically send progress information to TaskTracker when using UDTFs to prevent the task getting killed because of inactivity. Users should be cautious because this may prevent TaskTracker from killing tasks with infinite loops.

hive.mapred.reduce.tasks.speculative.execution
  • Default Value: true
  • Added In: Hive 0.5.0

Whether speculative execution for reducers should be turned on.

hive.exec.counters.pull.interval
  • Default Value: 1000
  • Added In: Hive 0.6.0

The interval with which to poll the JobTracker for the counters of the running job. The smaller it is, the more load there will be on the JobTracker; the higher it is, the less granular the counter data will be.


hive.enforce.bucketing
  • Default Value: false
  • Added In: Hive 0.6.0

Whether bucketing is enforced. If true, while inserting into the table, bucketing is enforced.

Set to true to support INSERT ... VALUES, UPDATE, and DELETE transactions (Hive 0.14.0 and later). For a complete list of parameters required for turning on Hive transactions, see hive.txn.manager.
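A sketch with hypothetical table names; when enforcement is on, Hive matches the reducer count to the bucket count during the insert:

```sql
SET hive.enforce.bucketing=true;

-- Hypothetical bucketed target table.
CREATE TABLE page_views_bucketed (userid BIGINT, page STRING)
CLUSTERED BY (userid) INTO 32 BUCKETS;

-- With enforcement on, the insert uses 32 reducers so that each
-- bucket file is populated correctly.
INSERT OVERWRITE TABLE page_views_bucketed
SELECT userid, page FROM page_views;
```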

hive.enforce.sorting
  • Default Value: false
  • Added In: Hive 0.6.0

Whether sorting is enforced. If true, while inserting into the table, sorting is enforced.

hive.optimize.reducededuplication
  • Default Value: true
  • Added In: Hive 0.6.0

Remove extra map-reduce jobs if the data is already clustered by the same key which needs to be used again. This should always be set to true. Since it is a new feature, it has been made configurable.

hive.optimize.reducededuplication.min.reducer
  • Default Value: 4
  • Added In: Hive 0.11.0 with HIVE-2340

Reduce deduplication merges two RSs (reduce sink operators) by moving the key/parts/reducer-num of the child RS to the parent RS. That means if the reducer-num of the child RS is fixed (order by or forced bucketing) and small, the merge can produce a very slow, single-reducer MR job. The optimization is disabled if the number of reducers is less than the specified value.
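The query shape at issue, sketched with the usual `src` example table: both reduce stages key on `key`, but ORDER BY fixes the child reduce sink at one reducer:

```sql
-- GROUP BY and ORDER BY both key on 'key', so their reduce sinks are
-- candidates for merging; ORDER BY forces a single reducer, and
-- 1 is below hive.optimize.reducededuplication.min.reducer
-- (default 4), so Hive keeps the two stages separate.
SELECT key, count(*) AS cnt
FROM src
GROUP BY key
ORDER BY key;
```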

...

Setting to 0.12 (default) maintains division behavior in Hive 0.12 and earlier releases: int / int = double.
Setting to 0.13 gives division behavior in Hive 0.13 and later releases: int / int = decimal.

An invalid setting will cause an error message, and the default support level will be used.

hive.optimize.constant.propagation
  • Default Value: true
  • Added In: Hive 0.14.0 with HIVE-5771

Whether to enable the constant propagation optimizer.

hive.entity.capture.transform
  • Default Value: false
  • Added In: Hive 1.1.0 with HIVE-8938

Enable capturing compiler read entity of transform URI, which can be introspected in the semantic and exec hooks.

hive.explain.user
  • Default Value: false
  • Added In: Hive 1.2.0 with HIVE-9780

Whether to show explain result at user level. When enabled, will log EXPLAIN output for the query at user level.

SerDes, I/O, and File Formats

SerDes

hive.script.serde
  • Default Value: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
  • Added In: Hive 0.4.0

The default SerDe for transmitting input data to and reading output data from the user scripts.

hive.script.recordreader
  • Default Value: org.apache.hadoop.hive.ql.exec.TextRecordReader
  • Added In: Hive 0.4.0

The default record reader for reading data from the user scripts.

hive.script.recordwriter
  • Default Value: org.apache.hadoop.hive.ql.exec.TextRecordWriter
  • Added In: Hive 0.5.0

The default record writer for writing data to the user scripts.

hive.default.serde
  • Default Value: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
  • Added In: Hive 0.14 with HIVE-5976

...

LazySimpleSerDe uses this property to determine if it treats 'T', 't', 'F', 'f', '1', and '0' as extended, legal boolean literals, in addition to 'TRUE' and 'FALSE'. The default is false, which means only 'TRUE' and 'FALSE' are treated as legal boolean literals.

I/O

hive.io.exception.handlers
  • Default Value: (empty)
  • Added In: Hive 0.8.1

A list of I/O exception handler class names. This is used to construct a list of exception handlers to handle exceptions thrown by record readers.
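For example (the handler class name is hypothetical):

```sql
-- Comma-separated list of handler classes on the classpath;
-- com.example.SkipCorruptRecordsHandler is hypothetical.
SET hive.io.exception.handlers=com.example.SkipCorruptRecordsHandler;
```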

hive.input.format
  • Default Value: org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
  • Added In: Hive 0.5.0

The default input format. Set this to HiveInputFormat if you encounter problems with CombineHiveInputFormat.

Also see:

 

General File Formats

hive.default.fileformat
  • Default Value: TextFile
  • Added In: Hive 0.2.0

...

Users can explicitly say CREATE TABLE ... STORED AS TEXTFILE|SEQUENCEFILE|RCFILE|ORC|AVRO|INPUTFORMAT...OUTPUTFORMAT... to override. (RCFILE was added in Hive 0.6.0, ORC in 0.11.0, and AVRO in 0.14.0.) See Row Format, Storage Format, and SerDe for details.
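For example, a table can opt out of the default format regardless of this setting (table name hypothetical):

```sql
-- Stored as ORC even if hive.default.fileformat is TextFile.
CREATE TABLE events_orc (ts BIGINT, msg STRING)
STORED AS ORC;
```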

hive.fileformat.check

  • Default Value: true
  • Added In: Hive 0.5.0

...

File format to use for a query's intermediate results. Options are TextFile, SequenceFile, and RCFile. Set to SequenceFile if any columns are string type and contain new-line characters (HIVE-1608, HIVE-3065).


RCFile Format
hive.io.rcfile.record.interval

...