Metastore 3.0 Administration

Version Note

...

The Metastore persists the object definitions to a relational database (RDBMS) via DataNucleus, a Java JDO-based Object Relational Mapping (ORM) layer. See Supported RDBMSs below for a list of the RDBMSs that can be used.

...

For details on using the Metastore without Hive, see Running the Metastore Without Hive below.

General Configuration

The metastore reads its configuration from the file metastore-site.xml. It expects to find this file in $METASTORE_HOME/conf, where $METASTORE_HOME is an environment variable. For backwards compatibility it will also read any hive-site.xml or hivemetastore-site.xml files found in $HIVE_HOME/conf. Configuration options can also be defined on the command line (see Starting and Stopping the Service below).
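As a rough illustration of the file described above: metastore-site.xml follows the Hadoop configuration file layout, with a configuration root element holding one property element per setting. The sketch below sets a single parameter taken from this page; the value is only an example, not a recommendation.

```xml
<?xml version="1.0"?>
<!-- Illustrative sketch of $METASTORE_HOME/conf/metastore-site.xml -->
<configuration>
  <!-- One <property> element per configuration value -->
  <property>
    <name>metastore.stats.autogather</name>
    <!-- Example value only; the documented default is true -->
    <value>false</value>
  </property>
</configuration>
```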

Configuration values specific to running the Metastore with various RDBMSs, embedded or as a service, and without Hive are discussed in the relevant sections. The following configuration values apply to the Metastore regardless of how it is being run. This table covers only commonly customized configuration values. For less commonly changed configuration values, see Less Commonly Changed Configuration Parameters below.

 

| Parameter | Hive 2.0 Parameter | Default Value | Description |
|---|---|---|---|
| metastore.warehouse.dir | hive.metastore.warehouse.dir | | URI of the default location for tables in the default catalog and database. |
| metastore.authorization.storage.checks | hive.metastore.authorization.storage.checks | false | Whether the metastore should do authorization checks against the underlying storage. For example, for a drop-partition it would disallow the drop if the user does not have permission to delete the corresponding directory from the storage. |
| datanucleus.schema.autoCreateAll | datanucleus.schema.autoCreateAll | false | Auto creates the necessary schema in the RDBMS at startup if one does not exist. Set this to false after creating it once. To enable auto create, also set hive.metastore.schema.verification=false. Auto creation is not recommended in production; run the schematool command instead. |
| metastore.schema.verification | hive.metastore.schema.verification | true | Enforce metastore schema version consistency. When set to true: verify that the version information stored in the metastore is compatible with the one from the Hive jars, and disable automatic schema migration. Users are required to manually migrate the schema after a Hive upgrade, which ensures proper metastore schema migration. When set to false: warn if the version information stored in the metastore doesn't match the one from the Hive jars. |
| metastore.hmshandler.retry.attempts | hive.hmshandler.retry.attempts | 10 | The number of times to retry a call to the metastore when there is a connection error. |
| metastore.hmshandler.retry.interval | hive.hmshandler.retry.interval | 2 sec | Time between retry attempts. |
| metastore.log4j.file | hive.log4j.file | none | Log4j configuration file. If unset, the metastore will look for metastore-log4j2.properties in $METASTORE_HOME/conf. |
| metastore.stats.autogather | hive.stats.autogather | true | Whether to automatically gather basic statistics during insert commands. |
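
A few of the commonly customized values above might be set together in metastore-site.xml along the following lines. This is a sketch only: the warehouse URI and its host name are hypothetical placeholders, and the retry count is an arbitrary example rather than a suggested setting.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Hypothetical warehouse location; substitute a URI for your own storage -->
  <property>
    <name>metastore.warehouse.dir</name>
    <value>hdfs://namenode.example.com:8020/user/hive/warehouse</value>
  </property>
  <!-- Keep schema verification on and auto creation off, as advised for production -->
  <property>
    <name>metastore.schema.verification</name>
    <value>true</value>
  </property>
  <property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>false</value>
  </property>
  <!-- Arbitrary example: retry fewer times than the default of 10 -->
  <property>
    <name>metastore.hmshandler.retry.attempts</name>
    <value>5</value>
  </property>
</configuration>
```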

 

 

RDBMS

Option 1: Embedding Derby

Option 2: External RDBMS

Supported RDBMSs

TRY_DIRECT_SQL_DDL and Postgres

Installing, Upgrading, and Checking Metastore Tables in the RDBMS

...

Embedding the Metastore in Your Process

Security Considerations

 

Metastore Server

javax.jdo.option.ConnectionURL

JDBC connection string for the data store which contains metadata

javax.jdo.option.ConnectionDriverName

JDBC Driver class name for the data store which contains metadata
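
For illustration, the two JDBC parameters above could be set as follows for an embedded Derby database. The URL and driver class shown are the values conventionally used with Derby in Hive and are given here as an assumption; an external RDBMS would need its own connection string and driver class.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- JDBC connection string for the data store that contains the metadata -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
  <!-- JDBC driver class name for that data store -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
</configuration>
```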

hive.metastore.uris

THRIFT_URI_SELECTION

Starting and Stopping the Service

Remember to discuss command line options like defining a configuration value

High Availability

Securing the Service

CLIENT_KERBEROS_PRINCIPAL, KERBEROS_*, SSL*, USE_SSL, USE_THRIFT_SASL

 

 

Running the Metastore Without Hive

Less Commonly Changed Configuration Parameters

BATCHED_RETRIEVE_*, CLIENT_CONNECT_RETRY_DELAY, FILTER_HOOK, SERDES_USING_METASTORE_FOR_SCHEMA, SERVER_*_THREADS, THREAD_POOL_SIZE

Security: EXECUTE_SET_UGI

Setting up Caching: CACHED*, CATALOGS_TO_CACHE & AGGREGATE_STATS_CACHE*

Transactions: MAX_OPEN_TXNS, TXNS_*