Hive Metastore Administration
Introduction
All the metadata for Hive tables and partitions is stored in the Hive Metastore. Metadata is persisted using the JPOX ORM solution, so any store that JPOX supports can be used by Hive. Most commercial relational databases and many open source datastores are supported. Any datastore that has a JDBC driver can probably be used.
...
There are three different ways to set up a metastore server, each using a different Hive configuration.
The relevant configuration parameters are shown here. (Non-metastore parameters are described in Configuring Hive and the Language Manual's Hive Configuration Properties.)
These variables were carried over from old documentation without a guarantee that they all still exist (see the
The default configuration sets up an embedded metastore, which is used in unit tests and is described in the next section. More practical options are described in the subsequent sections.
Embedded Metastore
An embedded metastore is mainly used for unit tests. Only one process can connect to the metastore at a time, so it is not really a practical solution but works well for unit tests. Derby is the default database for the embedded metastore.
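As a rough illustration, an embedded Derby metastore corresponds to hive-site.xml properties along the following lines. The property names and values shown are the commonly shipped defaults and are an assumption here, not something stated on this page; check them against your Hive version.

```xml
<!-- Sketch of an embedded Derby metastore configuration (assumed defaults). -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <!-- Embedded Derby creates metastore_db relative to the working directory. -->
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <!-- The embedded driver runs Derby inside the Hive process itself. -->
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
</configuration>
```

Because the database runs inside the Hive process, a second process pointed at the same database directory cannot connect, which is the single-connection limitation described above.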
If you want to run Derby as a network server so the metastore can be accessed from multiple nodes, see Hive Using Derby in Server Mode.
Local Metastore
In a local metastore setup, each Hive client opens a connection to the datastore and makes SQL queries against it. The following config sets up a metastore in a MySQL server. Make sure that the server is accessible from the machines where Hive queries are executed, since this is a local store. Also make sure that the JDBC client library is in the classpath of the Hive client.
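A minimal sketch of such a hive-site.xml for a MySQL-backed local metastore might look like this. The host name, database name, user, password, and warehouse path are hypothetical placeholders, and the driver class assumes the classic MySQL Connector/J; adjust all of them to your environment.

```xml
<!-- Sketch of a local metastore backed by MySQL. All values are placeholders. -->
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <!-- metastore-db.example.com and hive_metastore are hypothetical names. -->
    <value>jdbc:mysql://metastore-db.example.com/hive_metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <!-- Assumes the classic Connector/J driver class is on the Hive classpath. -->
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <!-- Hypothetical database account. -->
    <value>hive_user</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <!-- Placeholder; replace with the real credential. -->
    <value>CHANGE_ME</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <!-- Base directory for table data; this path is an assumption. -->
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
```

Every machine that runs Hive queries needs this same configuration, since each client talks to MySQL directly in this mode.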
Remote Metastore
In a remote metastore setup, all Hive clients make a connection to a metastore server, which in turn queries the datastore (MySQL in this example) for metadata. The metastore server and client communicate using the Thrift protocol. Starting with Hive 0.5.0, you can start a Thrift server by executing the following command:
In versions of Hive earlier than 0.5.0, it's instead necessary to run the Thrift server via direct execution of Java:
If you execute Java directly, then JAVA_HOME, HIVE_HOME, and HADOOP_HOME must be correctly set; CLASSPATH should contain the Hadoop, Hive (lib and auxlib), and Java jars.
Server Configuration Parameters
Client Configuration Parameters
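On the client side, the key setting is the address of the metastore server. The fragment below is a sketch under assumptions: the host name is hypothetical, 9083 is assumed as the metastore port, and the warehouse path is a placeholder.

```xml
<!-- Sketch of a client configuration for a remote metastore. -->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <!-- metastore.example.com is a hypothetical host; 9083 is an assumed port. -->
    <value>thrift://metastore.example.com:9083</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <!-- Should match the value the metastore server uses; path is an assumption. -->
    <value>/user/hive/warehouse</value>
  </property>
</configuration>
```

With this layout the client needs no database credentials or JDBC driver of its own, because only the metastore server connects to the datastore.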
If you are using MySQL as the datastore for metadata, put MySQL client libraries in HIVE_HOME/lib before starting Hive Client or HiveMetastore Server. To change the metastore port, use this