WebHCat Configuration

Configuration Files

The configuration for WebHCat (Templeton) merges the normal Hadoop configuration with the WebHCat-specific variables. Because WebHCat is designed to connect services that are not normally connected, the configuration is more complex than might be desirable.

...

  1. webhcat-default.xml – All the configuration variables that WebHCat needs. This file sets the defaults that ship with WebHCat and should only be changed by WebHCat developers. Do not copy this file or change it to maintain local installation settings. Because webhcat-default.xml is present in the WebHCat war file, editing a local copy of it will not change the configuration.
  2. webhcat-site.xml – The (possibly empty) configuration file in which the system administrator can set variables for their Hadoop cluster. Create this file and maintain entries in it for configuration variables that require you to override default values based on your local installation.

    Note: The WebHCat server must be restarted after any change to the configuration.
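The override relationship between the two files can be sketched in a few lines of Python. This is only an illustration, not WebHCat's actual loader: the inline XML strings stand in for webhcat-default.xml and webhcat-site.xml, the function names load_properties and merge_config are hypothetical, and the overridden port 8085 is an arbitrary example value. The sketch assumes the usual Hadoop configuration layout of property elements with name and value children.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for webhcat-default.xml and webhcat-site.xml.
DEFAULT_XML = """
<configuration>
  <property><name>templeton.port</name><value>50111</value></property>
  <property><name>templeton.exec.max-procs</name><value>16</value></property>
</configuration>
"""

SITE_XML = """
<configuration>
  <property><name>templeton.port</name><value>8085</value></property>
</configuration>
"""

def load_properties(xml_text):
    """Return a dict of name -> value from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value", default="")
        if name:
            props[name.strip()] = value.strip()
    return props

def merge_config(default_xml, site_xml):
    """Site values override defaults; variables not set in the site file keep their defaults."""
    merged = load_properties(default_xml)
    merged.update(load_properties(site_xml))
    return merged

config = merge_config(DEFAULT_XML, SITE_XML)
print(config["templeton.port"])            # value taken from the site file
print(config["templeton.exec.max-procs"])  # default retained
```

Only the variables that differ from the shipped defaults need to appear in the site file; everything else falls through to the default.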


...

Configuration files can reference any environment variable through the special variable env. For example, the Pig executable could be specified using:

${env.PIG_HOME}/bin/pig
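To make the substitution concrete, the following Python sketch expands references of the form ${env.NAME} against a dictionary of environment variables. It is an illustration of the idea rather than WebHCat's own implementation; the function name expand_env, the choice to leave unknown names untouched, and the PIG_HOME value /opt/pig are all assumptions made for the example.

```python
import re

# Matches references such as ${env.PIG_HOME}.
_ENV_REF = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)\}")

def expand_env(value, environ):
    """Replace each ${env.NAME} with environ[NAME]; unknown names are left as-is."""
    def substitute(match):
        name = match.group(1)
        return environ.get(name, match.group(0))
    return _ENV_REF.sub(substitute, value)

# Hypothetical environment; a real deployment would pass os.environ instead.
example_env = {"PIG_HOME": "/opt/pig"}
print(expand_env("${env.PIG_HOME}/bin/pig", example_env))
```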

Configuration variables that take a filesystem path have reasonable defaults. However, it is always safe to specify the full path if there is any uncertainty.

Log File Location: The webhcat-log4j.properties file sets the location of the log files created by WebHCat and some other properties of the logging system.

Configuration Variables

...

Name Default (Hive 0.11.0)

Description

templeton.port 50111

The HTTP port for the main server.

templeton.hadoop.config.dir ${env.HADOOP_CONFIG_DIR}

The path to the Hadoop configuration.

Obsolete: templeton.jar ${env.TEMPLETON_HOME}/share/webhcat/svr/webhcat-0.11.0.jar

The path to the WebHCat jar file. (Not used in recent releases, so removed in Hive 0.14.0.)

templeton.libjars ${env.TEMPLETON_HOME}/share/webhcat/svr/lib/zookeeper-3.4.5.jar

Jars to add to the classpath.

templeton.override.jars hdfs:///user/templeton/ugi.jar

Jars to add to the HADOOP_CLASSPATH for all Map Reduce jobs. These jars must exist on HDFS.

templeton.override.enabled false

Enable the override path in templeton.override.jars.

templeton.streaming.jar hdfs:///user/templeton/hadoop-streaming.jar

The HDFS path to the Hadoop streaming jar file.

templeton.hadoop ${env.HADOOP_PREFIX}/bin/hadoop

The path to the Hadoop executable.

templeton.pig.archive hdfs:///user/templeton/pig-0.11.1.tar.gz

The path to the Pig archive.

templeton.pig.path pig-0.11.1.tar.gz/pig-0.11.1/bin/pig

The path to the Pig executable.

templeton.hcat ${env.HCAT_PREFIX}/bin/hcat

The path to the HCatalog executable.

templeton.hive.archive hdfs:///user/templeton/hive-0.11.0.tar.gz

The path to the Hive archive.

templeton.hive.path hive-0.11.0.tar.gz/hive-0.11.0/bin/hive

The path to the Hive executable.

templeton.hive.properties hive.metastore.local=false, hive.metastore.uris=thrift://localhost:9933, hive.metastore.sasl.enabled=false

Properties to set when running Hive (during job submission). This is expected to be a comma-separated prop=value list. If a value is itself a comma-separated list, the escape character is '\' (from Hive 0.13.1 onward). To use it in a cluster with Kerberos security enabled, set hive.metastore.sasl.enabled=false and add hive.metastore.execute.setugi=true. Using localhost in the metastore URI does not work with Kerberos security.
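The comma-separated prop=value format, including the '\' escape, can be illustrated with a small Python parser. This is a sketch under stated assumptions, not the parser WebHCat uses: it assumes that a backslash immediately before a comma marks a literal comma inside a value, and the function name parse_hive_properties is hypothetical.

```python
def parse_hive_properties(text):
    """Split a 'prop=value, prop=value' list, treating '\\,' as a literal comma."""
    items, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] == ",":
            current.append(",")   # escaped comma stays inside the current value
            i += 2
            continue
        if ch == ",":
            items.append("".join(current))   # unescaped comma ends the item
            current = []
        else:
            current.append(ch)
        i += 1
    items.append("".join(current))

    props = {}
    for item in items:
        if item.strip():
            key, _, value = item.partition("=")
            props[key.strip()] = value.strip()
    return props

# The default value listed in the table.
default_value = ("hive.metastore.local=false, "
                 "hive.metastore.uris=thrift://localhost:9933, "
                 "hive.metastore.sasl.enabled=false")
print(parse_hive_properties(default_value))
```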

templeton.exec.encoding UTF-8

The encoding of the stdout and stderr data.

templeton.exec.timeout 10000

How long, in milliseconds, a program is allowed to run on the WebHCat host.

templeton.exec.max-procs 16

The maximum number of processes allowed to run at once.

templeton.exec.max-output-bytes 1048576

The maximum number of bytes from stdout or stderr stored in RAM.

templeton.controller.mr.child.opts -server -Xmx256m -Djava.net.preferIPv4Stack=true

Java options to pass to the WebHCat controller map task.

templeton.exec.envs HADOOP_PREFIX,HADOOP_HOME,JAVA_HOME,HIVE_HOME

The environment variables passed through to exec.

templeton.zookeeper.hosts 127.0.0.1:2181

ZooKeeper servers, as comma-separated host:port pairs.
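As an illustration of the host:port list format, the following Python sketch splits such a value into (host, port) pairs. The function name parse_zookeeper_hosts is hypothetical, and the tolerance for whitespace around commas is an assumption of this example rather than documented WebHCat behavior.

```python
def parse_zookeeper_hosts(value):
    """Split 'host:port,host:port' into a list of (host, port) tuples."""
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected host:port, got {entry!r}")
        servers.append((host, int(port)))
    return servers

# The default value listed in the table.
print(parse_zookeeper_hosts("127.0.0.1:2181"))
```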

templeton.zookeeper.session-timeout 30000

ZooKeeper session timeout in milliseconds.

templeton.callback.retry.interval 10000

How long to wait, in milliseconds, between callback retry attempts.

templeton.callback.retry.attempts 5

How many times to retry the callback.

templeton.storage.class org.apache.hcatalog.templeton.tool.ZooKeeperStorage

The class to use as storage.

templeton.storage.root /templeton-hadoop

The path to the directory to use for storage.

templeton.hdfs.cleanup.interval 43200000

The maximum delay between a thread's cleanup checks.

templeton.hdfs.cleanup.maxage 604800000

The maximum age of a WebHCat job.

templeton.zookeeper.cleanup.interval 43200000

The maximum delay between a thread's cleanup checks.

templeton.zookeeper.cleanup.maxage 604800000

The maximum age of a WebHCat job.
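The large numeric defaults are easier to read as durations. The Python sketch below converts them, assuming that the cleanup interval and maxage values are expressed in milliseconds like the timeout values; the table does not state their unit, so treat that as an assumption (the defaults then work out to exactly 12 hours and 7 days). The helper name as_duration is hypothetical.

```python
from datetime import timedelta

# Defaults copied from the table. The cleanup values are ASSUMED to be milliseconds.
DEFAULTS_MS = {
    "templeton.exec.timeout": 10000,
    "templeton.zookeeper.session-timeout": 30000,
    "templeton.callback.retry.interval": 10000,
    "templeton.hdfs.cleanup.interval": 43200000,
    "templeton.hdfs.cleanup.maxage": 604800000,
    "templeton.zookeeper.cleanup.interval": 43200000,
    "templeton.zookeeper.cleanup.maxage": 604800000,
}

def as_duration(milliseconds):
    """Convert a millisecond count into a timedelta for display."""
    return timedelta(milliseconds=milliseconds)

for name, value in DEFAULTS_MS.items():
    print(f"{name}: {as_duration(value)}")
```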

templeton.kerberos.secret A random value

The secret used to sign the HTTP cookie value. The default is a random value, which is adequate unless multiple WebHCat instances need to share the secret.

templeton.kerberos.principal None

The Kerberos principal to be used by the server. As stated by the Kerberos SPNEGO specification, it should be USER/${HOSTNAME}@{REALM}. It does not have a default value.

templeton.kerberos.keytab None

The keytab file containing the credentials for the Kerberos principal.

templeton.hadoop.queue.name

Name of the MapReduce queue to which WebHCat map-only jobs are submitted. Can be used to avoid a deadlock where all map slots in the cluster are taken over by Templeton launcher tasks.

Versions: Hive 0.12.0 and later.

templeton.mapper.memory.mb

Memory limit, in megabytes, for the launch mapper of the WebHCat controller job. When submitting a controller job, WebHCat overwrites mapreduce.map.memory.mb with this value. If empty, WebHCat does not set mapreduce.map.memory.mb when submitting the controller job, so the configuration in mapred-site.xml is used.

Versions: Hive 0.14.0 and later.

templeton.frame.options.filter

Adds web server protection against clickjacking using the X-Frame-Options header. The possible values are DENY, SAMEORIGIN, and ALLOW-FROM <uri>.

Versions: Hive 3.0.0 and later.
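A candidate value for templeton.frame.options.filter can be checked against the three documented forms before it is placed in the configuration. The Python sketch below is a hypothetical helper, not part of WebHCat; it assumes the values are written in upper case exactly as listed and that ALLOW-FROM must be followed by a non-empty URI.

```python
def is_valid_frame_option(value):
    """Return True for DENY, SAMEORIGIN, or 'ALLOW-FROM <uri>' with a non-empty URI."""
    value = value.strip()
    if value in ("DENY", "SAMEORIGIN"):
        return True
    prefix = "ALLOW-FROM "
    return value.startswith(prefix) and bool(value[len(prefix):].strip())

# https://example.com is a placeholder URI used only for illustration.
for candidate in ("DENY", "SAMEORIGIN", "ALLOW-FROM https://example.com", "ALLOW-FROM"):
    print(candidate, is_valid_frame_option(candidate))
```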

Default Values

Some of the default values for WebHCat configuration variables depend on the release number. For the default values in the Hive release you are using, see the webhcat-default.xml file. It can be found in the SVN repository at:

  • http://svn.apache.org/repos/asf/hive/branches/branch-<release_number>/hcatalog/webhcat/svr/src/main/config/webhcat-default.xml

where <release_number> is 0.11, 0.12, and so on. Prior to Hive 0.11, WebHCat was in the Apache incubator.
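The URL pattern above can be filled in programmatically. The Python sketch below only performs the string substitution; the function name default_config_url is hypothetical, and the sketch does not verify that a given branch actually exists in the repository.

```python
URL_TEMPLATE = ("http://svn.apache.org/repos/asf/hive/branches/"
                "branch-{release}/hcatalog/webhcat/svr/src/main/config/"
                "webhcat-default.xml")

def default_config_url(release):
    """Build the webhcat-default.xml URL for a release number such as '0.11'."""
    return URL_TEMPLATE.format(release=release)

print(default_config_url("0.11"))
print(default_config_url("0.12"))
```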

Default values prior to Hive 0.11 are listed in the HCatalog 0.5.0 documentation.
