Hive now records the schema version in the metastore database and verifies that the metastore schema version is compatible with the Hive binaries that are going to access the metastore. Note that the Hive properties to implicitly create or alter the existing schema are disabled by default, so Hive will not attempt to change the metastore schema implicitly. When you execute a Hive query against an old schema, it will fail to access the metastore:

$ build/dist/bin/hive -e "show tables"
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

The log will contain an error stating that the version information was not found:

...
Caused by: MetaException(message:Version information not found in metastore. )
...

To suppress the schema check and allow the metastore to implicitly modify the schema, set the configuration property hive.metastore.schema.verification to false in hive-site.xml.
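
As a minimal sketch, the property could be set as shown here, assuming hive-site.xml uses the usual Hadoop-style configuration/property layout. Merge the property element into your existing configuration element rather than replacing the file.

```xml
<configuration>
  <!-- Turns off the metastore schema version check. -->
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
</configuration>
```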

The Hive distribution now includes an offline tool for Hive metastore schema manipulation. This tool can be used to initialize the metastore schema for the current Hive version. It can also upgrade the schema from an older version to the current one. It tries to find the current schema version from the metastore if it is available. This will be applicable to future upgrades like 0.12.0 to 0.13.0. For upgrades from older releases like 0.7.0 or 0.10.0, you can specify the schema version of the existing metastore as a command line option to the tool.
The schematool figures out the SQL scripts required to initialize or upgrade the schema and then executes those scripts against the backend database. The metastore DB connection information, such as the JDBC URL, JDBC driver and DB credentials, is extracted from the Hive configuration. You can provide alternate DB credentials if needed.
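
To illustrate how a chain of upgrade scripts can be derived from a starting version, here is a rough shell sketch. It is not the tool's actual implementation: the function name list_upgrade_scripts and the VERSIONS variable are hypothetical, and the version list and the upgrade-<from>-to-<to>.<dbType>.sql naming pattern are simply copied from the tool output shown on this page.

```shell
#!/bin/sh
# Illustrative sketch only; this is NOT how schematool is implemented.
# Known schema versions, oldest first (taken from the output on this page).
VERSIONS="0.7.0 0.8.0 0.9.0 0.10.0 0.11.0 0.12.0 0.13.0"

# list_upgrade_scripts <fromVersion> <dbType>
# Prints one script name per upgrade step, starting at <fromVersion>.
# Returns non-zero if <fromVersion> is not in VERSIONS.
list_upgrade_scripts() {
    from="$1"
    db_type="$2"
    prev=""
    started=0
    for v in $VERSIONS; do
        # Once the starting version has been seen, every later
        # version contributes one "previous -> current" step.
        if [ "$started" -eq 1 ]; then
            echo "upgrade-${prev}-to-${v}.${db_type}.sql"
        fi
        if [ "$v" = "$from" ]; then
            started=1
        fi
        prev="$v"
    done
    [ "$started" -eq 1 ]
}
```

Calling `list_upgrade_scripts 0.10.0 derby` with this sketch prints the same three script names that the 0.10.0 upgrade example on this page reports.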

  • schematool Usage
    $ schematool -help
    usage: schemaTool
     -dbType <databaseType>             Metastore database type
     -dryRun                            list SQL scripts (no execute)
     -help                              print this message
     -info                              Show config and schema details
     -initSchema                        Schema initialization
     -initSchemaTo <initTo>             Schema initialization to a version
     -passWord <password>               Override config file password
     -upgradeSchema                     Schema upgrade
     -upgradeSchemaFrom <upgradeFrom>   Schema upgrade from a version
     -userName <user>                   Override config file user name
     -verbose                           only print SQL statements
    
    The dbType is required and can be one of
     derby|mysql|postgres|oracle 
  • Initialize to current schema for a new hive setup
    $ schematool -dbType derby -initSchema
    Metastore connection URL:        jdbc:derby:;databaseName=metastore_db;create=true
    Metastore Connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
    Metastore connection User:       APP
    Starting metastore schema initialization to 0.13.0
    Initialization script hive-schema-0.13.0.derby.sql
    Initialization script completed
    schemaTool completeted
    
  • Get schema information
    $ schematool -dbType derby -info
    Metastore connection URL:        jdbc:derby:;databaseName=metastore_db;create=true
    Metastore Connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
    Metastore connection User:       APP
    Hive distribution version:       0.13.0
    Metastore schema version:        0.13.0
    schemaTool completeted
    
  • Attempt to get schema information with an older metastore
    $ schematool -dbType derby -info
    Metastore connection URL:        jdbc:derby:;databaseName=metastore_db;create=true
    Metastore Connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
    Metastore connection User:       APP
    Hive distribution version:       0.13.0
    org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
    *** schemaTool failed ***
    
    Since older metastores don't store the version information, the tool reports an error when retrieving it.
  • Upgrade schema from a 0.10.0 release by specifying the 'from' version
    $ schematool -dbType derby -upgradeSchemaFrom 0.10.0
    Metastore connection URL:        jdbc:derby:;databaseName=metastore_db;create=true
    Metastore Connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
    Metastore connection User:       APP
    Starting upgrade metastore schema from version 0.10.0 to 0.13.0
    Upgrade script upgrade-0.10.0-to-0.11.0.derby.sql
    Completed upgrade-0.10.0-to-0.11.0.derby.sql
    Upgrade script upgrade-0.11.0-to-0.12.0.derby.sql
    Completed upgrade-0.11.0-to-0.12.0.derby.sql
    Upgrade script upgrade-0.12.0-to-0.13.0.derby.sql
    Completed upgrade-0.12.0-to-0.13.0.derby.sql
    schemaTool completeted
    
  • Upgrade dry run can be used to list the required scripts for the given upgrade.
    $ build/dist/bin/schematool -dbType derby -upgradeSchemaFrom 0.7.0 -dryRun
    Metastore Connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
    Metastore connection User:       APP
    Starting upgrade metastore schema from version 0.7.0 to 0.13.0
    Upgrade script upgrade-0.7.0-to-0.8.0.derby.sql
    Upgrade script upgrade-0.8.0-to-0.9.0.derby.sql
    Upgrade script upgrade-0.9.0-to-0.10.0.derby.sql
    Upgrade script upgrade-0.10.0-to-0.11.0.derby.sql
    Upgrade script upgrade-0.11.0-to-0.12.0.derby.sql
    Upgrade script upgrade-0.12.0-to-0.13.0.derby.sql
    schemaTool completeted
    
    This is useful if you just want to find out all the scripts required for the schema upgrade.
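
The dry run and the real upgrade can be combined into a simple preview-then-apply workflow. The following is a sketch under stated assumptions: the function name upgrade_with_preview and the SCHEMATOOL variable are hypothetical helpers introduced for illustration, and only the -dbType, -upgradeSchemaFrom and -dryRun options shown in the usage output above are used.

```shell
#!/bin/sh
# Sketch: preview an upgrade with -dryRun, then apply it only on confirmation.
# SCHEMATOOL is an illustrative variable (not part of Hive) that names the
# command to run; it defaults to "schematool" on the PATH.
SCHEMATOOL="${SCHEMATOOL:-schematool}"

# upgrade_with_preview <dbType> <fromVersion>
upgrade_with_preview() {
    db_type="$1"
    from_version="$2"

    # Step 1: list the upgrade scripts without executing them.
    "$SCHEMATOOL" -dbType "$db_type" -upgradeSchemaFrom "$from_version" -dryRun || return 1

    # Step 2: ask before touching the metastore database.
    printf 'Apply these scripts? [y/N] '
    read -r answer
    case "$answer" in
        y|Y) "$SCHEMATOOL" -dbType "$db_type" -upgradeSchemaFrom "$from_version" ;;
        *)   echo "Upgrade skipped." ;;
    esac
}
```

For example, `upgrade_with_preview derby 0.10.0` would first list the scripts and then wait for confirmation. Backing up the metastore database before answering yes is a sensible precaution.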