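Broadly, the configuration variables for Hive administration fall into the following categories:

Hive Configuration Variables
Hive Metastore Configuration Variables
Hive Configuration Variables Used to Interact with Hadoop
Hive Variables Used to Pass Run Time Information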

Also see Hive Configuration Properties in the Language Manual for non-administrative configuration variables.

Hive Configuration Variables

Variable Name	Description	Default Value
hive.ddl.output.format	The data format to use for DDL output (e.g. `DESCRIBE table`). One of "text" (for human-readable text) or "json" (for a JSON object). (as of Hive 0.9.0)	text
hive.exec.script.wrapper	Wrapper around any invocation of the script operator. For example, if this is set to python, the script passed to the script operator is invoked as `python <script command>`. If the value is null or not set, the script is invoked as `<script command>`.	null
hive.exec.plan		null
hive.exec.scratchdir	This directory is used by Hive to store the plans for the different map/reduce stages of the query, as well as the intermediate outputs of these stages.	/tmp/<user.name>/hive (Hive 0.8.0 and earlier); /tmp/hive-<user.name> (as of Hive 0.8.1)
hive.exec.local.scratchdir	This directory is used for temporary files when Hive runs in local mode. (as of Hive 0.10.0)	/tmp/<user.name>
hive.exec.submitviachild	Determines whether the map/reduce jobs should be submitted through a separate JVM in non-local mode.	false (by default, jobs are submitted through the same JVM as the compiler)
hive.exec.script.maxerrsize	Maximum number of serialization errors allowed in a user script invoked through `TRANSFORM` or `MAP` or `REDUCE` constructs.	100000
hive.exec.compress.output	Determines whether the output of the final map/reduce job in a query is compressed or not.	false
hive.exec.compress.intermediate	Determines whether the output of the intermediate map/reduce jobs in a query is compressed or not.	false
hive.jar.path	The location of hive_cli.jar, which is used when submitting jobs in a separate JVM.
hive.aux.jars.path	The location of the plugin jars that contain implementations of user-defined functions and SerDes.
hive.partition.pruning	A strict value for this variable indicates that an error is thrown by the compiler in case no partition predicate is provided on a partitioned table. This is used to protect against a user inadvertently issuing a query against all the partitions of the table.	nonstrict
hive.map.aggr	Determines whether the map side aggregation is on or not.	true
hive.join.emit.interval		1000
hive.map.aggr.hash.percentmemory		(float)0.5
hive.default.fileformat	Default file format for the CREATE TABLE statement. Options are TextFile, SequenceFile, RCFile, and ORC.	TextFile
hive.merge.mapfiles	Merge small files at the end of a map-only job.	true
hive.merge.mapredfiles	Merge small files at the end of a map-reduce job.	false
hive.merge.size.per.task	Size of merged files at the end of the job.	256000000
hive.merge.smallfiles.avgsize	When the average output file size of a job is less than this number, Hive will start an additional map-reduce job to merge the output files into bigger files. This is only done for map-only jobs if hive.merge.mapfiles is true, and for map-reduce jobs if hive.merge.mapredfiles is true.	16000000
hive.querylog.enable.plan.progress	Whether to log the plan's progress every time a job's progress is checked. These logs are written to the location specified by `hive.querylog.location` (as of Hive 0.10)	true
hive.querylog.location	Directory where structured Hive query logs are created. One file per session is created in this directory. If this variable is set to an empty string, the structured log is not created.	/tmp/<user.name>
hive.querylog.plan.progress.interval	The interval, in milliseconds, to wait between logging the plan's progress. If there is a whole-number percentage change in the progress of the mappers or the reducers, the progress is logged regardless of this value. The actual interval is the ceiling of (this value divided by the value of `hive.exec.counters.pull.interval`) multiplied by the value of `hive.exec.counters.pull.interval`; that is, if this value is not evenly divisible by the value of `hive.exec.counters.pull.interval`, progress is logged less frequently than specified. This only has an effect if `hive.querylog.enable.plan.progress` is set to `true`. (as of Hive 0.10)	60000
hive.stats.autogather	A flag to gather statistics automatically during the INSERT OVERWRITE command. (as of Hive 0.7.0)	true
hive.stats.dbclass	The default database that stores temporary Hive statistics. Valid values are `hbase` and `jdbc`; `jdbc` must be followed by the database to use, separated by a colon (e.g. `jdbc:mysql`). (as of Hive 0.7.0)	jdbc:derby
hive.stats.dbconnectionstring	The default connection string for the database that stores temporary hive statistics. (as of Hive 0.7.0)	jdbc:derby:;databaseName=TempStatsStore;create=true
hive.stats.jdbcdriver	The JDBC driver for the database that stores temporary hive statistics. (as of Hive 0.7.0)	org.apache.derby.jdbc.EmbeddedDriver
hive.stats.reliable	Whether queries fail when stats cannot be collected completely accurately. If this is set to true, reading from or writing to a partition may fail because the stats could not be computed accurately. (as of Hive 0.10.0)	false
hive.enforce.bucketing	If enabled, enforces that inserts into bucketed tables are also bucketed.	false
hive.variable.substitute	Substitutes variables in Hive statements which were previously set using the `set` command, system variables or environment variables. See HIVE-1096 for details. (as of Hive 0.7.0)	true
hive.variable.substitute.depth	The maximum number of replacements the substitution engine will perform. (as of Hive 0.10.0)	40
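
The rounding rule described for `hive.querylog.plan.progress.interval` can be easier to follow as arithmetic. The following is a minimal sketch of that formula only, not Hive's implementation; the function name `effective_progress_interval` is hypothetical, and the pull interval values used in the example are assumptions chosen for illustration.

```python
def effective_progress_interval(progress_interval_ms: int, pull_interval_ms: int) -> int:
    """Round the requested logging interval up to a whole number of counter pulls.

    progress_interval_ms stands in for hive.querylog.plan.progress.interval and
    pull_interval_ms for hive.exec.counters.pull.interval (both in milliseconds).
    """
    if progress_interval_ms <= 0 or pull_interval_ms <= 0:
        raise ValueError("both intervals must be positive")
    # Integer ceiling division: ceil(a / b) without floating point.
    pulls = -(-progress_interval_ms // pull_interval_ms)
    return pulls * pull_interval_ms


# Illustrative values only; the pull intervals below are assumed, not taken from this page.
print(effective_progress_interval(60000, 1000))  # evenly divisible, stays at 60000
print(effective_progress_interval(2500, 1000))   # not evenly divisible, rounds up to 3000
```

With an evenly divisible pair the requested interval is used unchanged; otherwise progress is logged at the next multiple of the pull interval, which is why logging can be less frequent than specified.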

Hive Metastore Configuration Variables

Please see the Admin Manual's section on the Metastore for details.

For security configuration (Hive 0.10 and later), see the Hive Metastore Security section in the Language Manual's Configuration Properties.

Hive Configuration Variables Used to Interact with Hadoop

Variable Name	Description	Default Value
hadoop.bin.path	The location of the hadoop script that is used to submit jobs to Hadoop when submitting through a separate JVM.	$HADOOP_HOME/bin/hadoop
hadoop.config.dir	The location of the configuration directory of the Hadoop installation.	$HADOOP_HOME/conf
fs.default.name		file:///
map.input.file		null
mapred.job.tracker	The URL of the JobTracker. If this is set to local, map/reduce runs in local mode.	local
mapred.reduce.tasks	The number of reducers for each map/reduce stage in the query plan.	1
mapred.job.name	The name of the map/reduce job.	null

Hive Variables Used to Pass Run Time Information

Variable Name	Description	Default Value
hive.session.id	The ID of the Hive session.
hive.query.string	The query string passed to the map/reduce job.
hive.query.planid	The ID of the plan for the map/reduce stage.
hive.jobname.length	The maximum length of the job name.	50
hive.table.name	The name of the Hive table. This is passed to the user scripts through the script operator.
hive.partition.name	The name of the Hive partition. This is passed to the user scripts through the script operator.
hive.alias	The alias being processed. This is also passed to the user scripts through the script operator.
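
The table says that `hive.table.name`, `hive.partition.name`, and `hive.alias` are passed to user scripts through the script operator, but not how a script reads them. The sketch below assumes they arrive as environment variables with the dots replaced by underscores (for example `hive_table_name`). That mapping is an assumption made for illustration, not something this page states, so check it against your Hive version. The helper names `env_name` and `read_hive_runtime_info` are hypothetical.

```python
import os
from typing import Dict, Mapping, Optional

# Variables that this page says are passed to user scripts.
RUNTIME_VARIABLES = ("hive.table.name", "hive.partition.name", "hive.alias")


def env_name(variable: str) -> str:
    """Map a Hive variable name to an environment variable name.

    ASSUMPTION: dots are replaced by underscores; verify for your Hive version.
    """
    return variable.replace(".", "_")


def read_hive_runtime_info(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Return the runtime variables found in the environment (None when absent)."""
    source = os.environ if env is None else env
    return {name: source.get(env_name(name)) for name in RUNTIME_VARIABLES}
```

A script invoked through `TRANSFORM`, `MAP`, or `REDUCE` could call `read_hive_runtime_info()` once at startup and, for instance, tag its diagnostic output with the table name. Passing an explicit mapping instead of relying on `os.environ` makes the helper easy to exercise outside Hive.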
