...
A number of configuration variables in Hive can be used by the administrator to change the behavior for their installations and user sessions. These variables can be configured in any of the following ways, shown in the order of preference:
- Using the set command in the CLI or Beeline to set session-level values for a configuration variable, applying to all statements subsequent to the set command. For example, the following command sets the scratch directory (which Hive uses to store temporary output and plans) to /tmp/mydir for all subsequent statements:

```
set hive.exec.scratchdir=/tmp/mydir;
```
- Using the --hiveconf option of the hive command (in the CLI) or the beeline command for the entire session. For example:

```
bin/hive --hiveconf hive.exec.scratchdir=/tmp/mydir
```
- In hive-site.xml. This is used for setting values for the entire Hive configuration (see hive-site.xml and hive-default.xml.template below). For example:

```
<property>
  <name>hive.exec.scratchdir</name>
  <value>/tmp/mydir</value>
  <description>Scratch space for Hive jobs</description>
</property>
```
- In server-specific configuration files (supported starting in Hive 0.14). You can set metastore-specific configuration values in hivemetastore-site.xml and HiveServer2-specific configuration values in hiveserver2-site.xml.
The server-specific configuration file is useful in two situations:
- You want a different configuration for one type of server (for example – enabling authorization only in HiveServer2 and not CLI).
- You want to set a configuration value only in a server-specific configuration file (for example – setting the metastore database password only in the metastore server configuration file).
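As a sketch of the first situation, a hiveserver2-site.xml could carry only the authorization toggle so that it applies to HiveServer2 but not the CLI (the property name comes from Hive's authorization configuration; adapt the fragment to your deployment):

```
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>
```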
The HiveMetastore server reads hive-site.xml as well as hivemetastore-site.xml configuration files that are available in the $HIVE_CONF_DIR or in the classpath. If the metastore is being used in embedded mode (i.e., hive.metastore.uris is not set or empty) in the hive command line or HiveServer2, hivemetastore-site.xml gets loaded by the parent process as well. The value of hive.metastore.uris is examined to determine this, and the value should be set appropriately in hive-site.xml.

Certain metastore configuration parameters, like hive.metastore.sasl.enabled, hive.metastore.kerberos.principal, hive.metastore.execute.setugi, and hive.metastore.thrift.framed.transport.enabled, are used by the metastore client as well as the server. For such common parameters it is better to set the values in hive-site.xml; that will help keep them consistent.

HiveServer2 reads hive-site.xml as well as hiveserver2-site.xml that are available in the $HIVE_CONF_DIR or in the classpath.
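For instance, keeping a shared parameter such as hive.metastore.sasl.enabled in hive-site.xml gives the metastore client and server the same view of it (a minimal fragment for illustration):

```
<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>true</value>
</property>
```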
If HiveServer2 is using the metastore in embedded mode, hivemetastore-site.xml is also loaded.
The order of precedence of the config files is as follows (the later one has higher precedence):

hive-site.xml -> hivemetastore-site.xml -> hiveserver2-site.xml -> '--hiveconf' command-line parameters
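The precedence chain behaves like a sequence of map overlays in which later sources win on conflicting keys. A minimal model of that merge (illustrative Python, not Hive code; the property values are made up):

```python
# Later configuration sources override earlier ones, mirroring
# hive-site.xml -> hivemetastore-site.xml -> hiveserver2-site.xml -> --hiveconf.
hive_site = {"hive.exec.scratchdir": "/tmp/hive"}
hivemetastore_site = {"hive.exec.scratchdir": "/tmp/metastore"}
hiveserver2_site = {}
hiveconf_args = {"hive.exec.scratchdir": "/tmp/mydir"}

effective = {}
for source in (hive_site, hivemetastore_site, hiveserver2_site, hiveconf_args):
    effective.update(source)  # later source wins on conflicting keys

print(effective["hive.exec.scratchdir"])  # → /tmp/mydir
```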
hive-site.xml and hive-default.xml.template
...
The Hive client produces logs and history files on the client machine. Please see Hive Logging for configuration details.
...
Also see Hive Configuration Properties in the Language Manual for non-administrative configuration variables.
Hive Configuration Variables
Variable Name | Description | Default Value
---|---|---
hive.stats.dbconnectionstring | The default connection string for the database that stores temporary Hive statistics. (As of Hive 0.7.0.) | jdbc:derby:;databaseName=TempStatsStore;create=true
hive.stats.jdbcdriver | The JDBC driver for the database that stores temporary Hive statistics. (As of Hive 0.7.0.) | org.apache.derby.jdbc.EmbeddedDriver
hive.stats.reliable | Whether queries will fail because stats cannot be collected completely accurately. If this is set to true, reading/writing from/into a partition may fail because the stats could not be computed accurately. (As of Hive 0.10.0.) | false
hive.enforce.bucketing | If enabled, enforces inserts into bucketed tables to also be bucketed. (Hive 0.6.0 through Hive 1.x.x only) | false
hive.variable.substitute | Substitutes variables in Hive statements which were previously set using the set command, system variables, or environment variables. (As of Hive 0.7.0.) | true
hive.variable.substitute.depth | The maximum replacements the substitution engine will do. (As of Hive 0.10.0.) | 40
hive.vectorized.execution.enabled | This flag controls the vectorized mode of query execution as documented in HIVE-4160. (As of Hive 0.13.0.) | false
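With hive.variable.substitute enabled, session variables can be interpolated into statements. A small sketch (the table name sales_us_west is hypothetical):

```
set hivevar:region=us_west;
SELECT * FROM sales_${hivevar:region};   -- expands to: SELECT * FROM sales_us_west
```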
Hive Metastore Configuration Variables
Please see Hive Metastore Administration for information about the configuration variables used to set up the metastore in local, remote, or embedded mode. Also see descriptions in the Metastore section of the Language Manual's Hive Configuration Properties.
For security configuration (Hive 0.10 and later), see the Hive Metastore Security section in the Language Manual's Hive Configuration Properties.
Configuration Variables Used to Interact with Hadoop
Variable Name | Description | Default Value
---|---|---
hadoop.bin.path | The location of the Hadoop script which is used to submit jobs to Hadoop when submitting through a separate JVM. | $HADOOP_HOME/bin/hadoop
hadoop.config.dir | The location of the configuration directory of the Hadoop installation. | $HADOOP_HOME/conf
fs.default.name | The default name of the filesystem (for example, hdfs://<clustername>:8020). For YARN this configuration variable is called fs.defaultFS. | file:///
map.input.file | The filename the map is reading from. | null
mapred.job.tracker | The URL to the jobtracker. If this is set to local then map/reduce is run in local mode. | local
mapred.reduce.tasks | The number of reducers for each map/reduce stage in the query plan. | 1
mapred.job.name | The name of the map/reduce job. | null
mapreduce.input.fileinputformat.split.maxsize | For splittable data this changes the portion of the data that each mapper is assigned. By default, each mapper is assigned based on the block sizes of the source files. Entering a value larger than the block size will decrease the number of splits, which creates fewer mappers. Entering a value smaller than the block size will increase the number of splits, which creates more mappers. | empty
fs.trash.interval | The interval, in minutes, after which a trash checkpoint directory is deleted. (This is also the interval between checkpoints.) The checkpoint directory is located in the .Trash directory of the user's home directory. Any setting greater than 0 enables the trash feature of HDFS. When using the Transparent Data Encryption (TDE) feature, set this to 0 in Hadoop core-site.xml as documented in HIVE-10978. | 0
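For example, mapreduce.input.fileinputformat.split.maxsize can be lowered at the session level to get more mappers on splittable input (the value here is illustrative):

```
set mapreduce.input.fileinputformat.split.maxsize=67108864;  -- cap each split at 64 MB
```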
Hive Variables Used to Pass Run Time Information
Variable Name | Description | Default Value
---|---|---
hive.session.id | The ID of the Hive session. | 
hive.query.string | The query string passed to the map/reduce job. | 
hive.query.planid | The ID of the plan for the map/reduce stage. | 
hive.jobname.length | The maximum length of the jobname. | 50
hive.table.name | The name of the Hive table. This is passed to the user scripts through the script operator. | 
hive.partition.name | The name of the Hive partition. This is passed to the user scripts through the script operator. | 
hive.alias | The alias being processed. This is also passed to the user scripts through the script operator. | 
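Inside a TRANSFORM (script operator) script these values are read from the environment; our understanding is that Hive exports configuration names with non-alphanumeric characters replaced by underscores, so hive.table.name arrives as hive_table_name (verify against your Hive version). A minimal sketch that simulates the exported environment:

```python
import os

def read_hive_env(name, default=""):
    # Assumed mapping: the script operator replaces non-alphanumeric
    # characters in configuration names with underscores when exporting them.
    key = "".join(c if c.isalnum() else "_" for c in name)
    return os.environ.get(key, default)

# Simulate what the script operator would export to a TRANSFORM script;
# "page_views" is a hypothetical table name.
os.environ["hive_table_name"] = "page_views"
print(read_hive_env("hive.table.name"))  # → page_views
```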
Removing Hive Metastore Password from Hive Configuration
Support for this was added in Hive 0.14.0 with HIVE-7634 and HADOOP-10904. By setting up a CredentialProvider to handle storing/retrieval of passwords, you can remove the need to keep the Hive metastore password in cleartext in the Hive configuration.
- Set up the CredentialProvider to store the Hive Metastore password, using the key javax.jdo.option.ConnectionPassword (the same key as used in the Hive configuration). For example, the following command adds the metastore password to a JCEKS keystore file at /usr/lib/hive/conf/hive.jceks:
```
$ hadoop credential create javax.jdo.option.ConnectionPassword -provider jceks://file/usr/lib/hive/conf/hive.jceks
Enter password:
Enter password again:
javax.jdo.option.ConnectionPassword has been successfully created.
org.apache.hadoop.security.alias.JavaKeyStoreProvider has been updated.
```
Make sure to restrict access to this file to just the user running the Hive Metastore server/HiveServer2.
See http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CommandsManual.html#credential for more information.
- Update the Hive configuration to use the designated CredentialProvider. For example, to use our /usr/lib/hive/conf/hive.jceks file:
```
<!-- Configure credential store for passwords -->
<property>
  <name>hadoop.security.credential.provider.path</name>
  <value>jceks://file/usr/lib/hive/conf/hive.jceks</value>
</property>
```
This configures the CredentialProvider used by http://hadoop.apache.org/docs/current/api/org/apache/hadoop/conf/Configuration.html#getPassword(java.lang.String), which is used by Hive to retrieve the metastore password.
- Remove the Hive Metastore password entry (javax.jdo.option.ConnectionPassword) from the Hive configuration. The CredentialProvider will be used instead.
- Restart Hive Metastore Server/HiveServer2.
Configuring HCatalog and WebHCat
...
Starting in Hive release 0.11.0, HCatalog is installed and configured with Hive. The HCatalog server is the same as the Hive metastore.
- See Hive Metastore Administration for metastore configuration properties.
- See HCatalog Installation from Tarball for additional information.
...
For information about configuring WebHCat, see WebHCat Configuration.