Installing Hive

You can install a stable release of Hive by downloading and unpacking a tarball, or you can download the source code and build Hive using Maven (version 3.6.3 or later).

Hive installation has these requirements:

  • Java 8
  • Hadoop 3.3.6
  • Linux or Mac. Hive is commonly used in production Linux environments, and Mac is a common development environment. The instructions in this document apply to both.

Installing from a Tarball

Start by downloading the most recent stable release of Hive from one of the Apache download mirrors (see Hive Releases).

Next, unpack the tarball. This creates a subdirectory named hive-x.y.z (where x.y.z is the release number):

  $ tar -xzvf hive-x.y.z.tar.gz

Set the environment variable HIVE_HOME to point to the installation directory:

  $ cd hive-x.y.z
  $ export HIVE_HOME=$(pwd)

Finally, add $HIVE_HOME/bin to your PATH:

  $ export PATH=$HIVE_HOME/bin:$PATH
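
The environment steps above can be wrapped in a small shell function. This is a minimal sketch, not something shipped with Hive: the function name hive_env_setup is hypothetical, and it assumes the tarball has already been unpacked into the directory passed as the first argument.

```shell
# Hypothetical helper (not part of Hive): point HIVE_HOME at an unpacked
# release directory and put its bin/ directory on the PATH.
hive_env_setup() {
  # Refuse directories that do not look like an unpacked Hive release.
  if [ ! -d "$1/bin" ]; then
    echo "no bin/ directory under: $1" >&2
    return 1
  fi
  # Resolve to an absolute path so HIVE_HOME survives later 'cd' calls.
  HIVE_HOME="$(cd "$1" && pwd)"
  export HIVE_HOME
  # Prepend bin/ only once, even if the function is called repeatedly.
  case ":$PATH:" in
    *":$HIVE_HOME/bin:"*) ;;
    *) export PATH="$HIVE_HOME/bin:$PATH" ;;
  esac
}
```

For example, after unpacking the tarball you could run hive_env_setup hive-x.y.z from the parent directory. Adding the call to your shell profile would make the setting persistent.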

Installing from Source Code 

Hive is available via Git at https://github.com/apache/hive. You can download it by running the following command.

  $ git clone git@github.com:apache/hive.git

To get a specific release branch, such as branch-4.0, run the following command:

  $ git clone -b branch-4.0 --single-branch git@github.com:apache/hive.git


To build Hive, execute the following command in the base directory of the source tree:

  $ mvn clean install -Pdist,itests,iceberg -DskipTests 

The build creates the subdirectory packaging/target/apache-hive-<release_string>-bin/apache-hive-<release_string>-bin/ (for example, packaging/target/apache-hive-4.0.0-beta-2-SNAPSHOT-bin/apache-hive-4.0.0-beta-2-SNAPSHOT-bin) with the following contents:

  • bin/: directory containing all the shell scripts
  • lib/: directory containing all required jar files
  • conf/: directory with configuration files
  • examples/: directory with sample input and query files

That directory should contain all the files necessary to run Hive. You can run it from there or copy it to a different location, if you prefer.
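
Because the release string changes from build to build, it can be convenient to locate the output directory with a glob rather than typing it. The following is a sketch under that assumption; find_hive_dist is a hypothetical helper name, and it takes the root of the source checkout as its argument.

```shell
# Hypothetical helper: print the distribution directory produced by the
# Maven build, given the root of the Hive source checkout.
find_hive_dist() {
  for d in "$1"/packaging/target/apache-hive-*-bin/apache-hive-*-bin; do
    # An unmatched glob stays literal, so check that the path exists.
    if [ -d "$d" ]; then
      echo "$d"
      return 0
    fi
  done
  echo "no built distribution found under: $1" >&2
  return 1
}
```

From the base directory of the source tree, export HIVE_HOME="$(find_hive_dist .)" would then point HIVE_HOME at the freshly built distribution. If several builds are present, the sketch simply returns the first match.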

In order to run Hive, you must have Hadoop in your path or have defined the environment variable HADOOP_HOME with the Hadoop installation directory.
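
A quick pre-flight check can confirm that one of these two conditions holds before you start Hive. This is only a sketch: hadoop_available is a hypothetical function name, and it assumes the usual layout in which the hadoop launcher lives at $HADOOP_HOME/bin/hadoop.

```shell
# Hypothetical pre-flight check (not part of Hive): succeed if a 'hadoop'
# executable is reachable via PATH or via $HADOOP_HOME/bin.
hadoop_available() {
  if command -v hadoop >/dev/null 2>&1; then
    return 0
  fi
  # Assumption: a standard Hadoop layout with the launcher under bin/.
  if [ -n "${HADOOP_HOME:-}" ] && [ -x "$HADOOP_HOME/bin/hadoop" ]; then
    return 0
  fi
  echo "Hadoop not found: add it to PATH or set HADOOP_HOME" >&2
  return 1
}
```

Running hadoop_available && bin/beeline, for instance, would start Beeline only when Hadoop can be located.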

We also strongly advise creating the HDFS directories /tmp and /user/hive/warehouse (the latter is the default value of hive.metastore.warehouse.dir) and making them group-writable (chmod g+w) before you create any tables in Hive.
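
The directory setup described above might look like the following sketch, which uses the standard hadoop fs shell. The function names run and prepare_hive_hdfs_dirs and the DRY_RUN switch are illustrative additions, as is the -p flag, which is there so that the missing parent directory /user/hive is created too.

```shell
# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the two HDFS directories and make them group-writable.
prepare_hive_hdfs_dirs() {
  for dir in /tmp /user/hive/warehouse; do
    run hadoop fs -mkdir -p "$dir" || return 1
    run hadoop fs -chmod g+w "$dir" || return 1
  done
}
```

Calling the function with DRY_RUN=1 set prints the four hadoop fs commands without touching HDFS, which is a cheap way to review them before running them against a real cluster.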

Next Steps

You can begin using Hive as soon as it is installed, although you will probably want to configure it first.

Beeline CLI

If you built Hive from source, the Hive home directory is packaging/target/apache-hive-<release_string>-bin/apache-hive-<release_string>-bin/.

HiveServer2 has a CLI called Beeline (see Beeline – New Command Line Shell). To use Beeline, execute the following command in the Hive home directory:

  $ bin/beeline

Hive Metastore

Metadata is stored in an embedded Derby database whose disk storage location is determined by the Hive configuration variable named javax.jdo.option.ConnectionURL. By default, this location is ./metastore_db (see conf/hive-default.xml).
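
Because the default location is relative, the metastore ends up in whichever directory Hive is started from. One way to avoid that is to override the URL with an absolute path. The sketch below prints such an override; derby_metastore_property is a hypothetical helper name, and the sketch assumes the override is placed inside the configuration element of conf/hive-site.xml and that the URL keeps the default form jdbc:derby:;databaseName=metastore_db;create=true.

```shell
# Sketch: print a property that pins the embedded Derby metastore to an
# absolute directory (passed as $1) instead of ./metastore_db.
derby_metastore_property() {
  cat <<EOF
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=$1/metastore_db;create=true</value>
</property>
EOF
}
```

For example, derby_metastore_property /var/lib/hive prints a property whose database path is /var/lib/hive/metastore_db. The directory /var/lib/hive is only an illustration, so pick one that the Hive user can write to.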

Using Derby in embedded mode allows at most one user at a time. To configure Derby to run in server mode, see Hive Using Derby in Server Mode.

To configure a database other than Derby for the Hive metastore, see Hive Metastore Administration.

Next Step: Configuring Hive.

HCatalog and WebHCat

HCatalog

Version

HCatalog is installed with Hive, starting with Hive release 0.11.0.

If you install Hive from the binary tarball, the hcat command is available in the hcatalog/bin directory. However, most hcat commands can be issued as hive commands except for "hcat -g" and "hcat -p". Note that the hcat command uses the -p flag for permissions but hive uses it to specify a port number. The HCatalog CLI is documented here and the Hive CLI is documented here.

HCatalog installation is documented here.

WebHCat (Templeton)

Version

WebHCat is installed with Hive, starting with Hive release 0.11.0.

If you install Hive from the binary tarball, the WebHCat server command webhcat_server.sh is in the hcatalog/sbin directory.

WebHCat installation is documented here.
