The schematool command invokes the Hive schema tool with these options (the catalog-related options, such as -createCatalog, -moveDatabase, and -moveTable, were added in Hive 3.0.0 with HIVE-19135):

No Format
$ schematool -help
usage: schemaTool
 -alterCatalog <arg>                Alter a catalog, requires
                                    --catalogLocation and/or
                                    --catalogDescription parameter as well
 -catalogDescription <arg>          Description of new catalog
 -catalogLocation <arg>             Location of new catalog, required when
                                    adding a catalog
 -createCatalog <arg>               Create a catalog, requires
                                    --catalogLocation parameter as well
 -createLogsTable <arg>             Create table for Hive
                                    warehouse/compute logs
 -createUser                        Create the Hive user, set hiveUser to
                                    the db admin user and the hive
                                    password to the db admin password with
                                    this
 -dbOpts <databaseOpts>             Backend DB specific options
 -dbType <databaseType>             Metastore database type
 -driver <driver>                   driver name for connection
 -dropAllDatabases                  Drop all Hive databases (with
                                    CASCADE). This will remove all managed
                                    data!
 -dryRun                            list SQL scripts (no execute)
 -fromCatalog <arg>                 Catalog a moving database or table is
                                    coming from.  This is required if you
                                    are moving a database or table.
 -fromDatabase <arg>                Database a moving table is coming
                                    from.  This is required if you are
                                    moving a table.
 -help                              print this message
 -hiveDb <arg>                      Hive database (for use with
                                    createUser)
 -hivePassword <arg>                Hive password (for use with
                                    createUser)
 -hiveUser <arg>                    Hive user (for use with createUser)
 -ifNotExists                       If passed then it is not an error to
                                    create an existing catalog
 -info                              Show config and schema details
 -initOrUpgradeSchema               Initialize or upgrade schema to latest
                                    version
 -initSchema                        Schema initialization
 -initSchemaTo <initTo>             Schema initialization to a version
 -mergeCatalog <arg>                Merge databases from a catalog into
                                    other, Argument is the source catalog
                                    name Requires --toCatalog to indicate
                                    the destination catalog
 -metaDbType <metaDatabaseType>     Used only if upgrading the system
                                    catalog for hive
 -moveDatabase <arg>                Move a database between catalogs.
                                    Argument is the database name.
                                    Requires --fromCatalog and --toCatalog
                                    parameters as well
 -moveTable <arg>                   Move a table to a different database.
                                    Argument is the table name. Requires
                                    --fromCatalog, --toCatalog,
                                    --fromDatabase, and --toDatabase
                                    parameters as well.
 -passWord <password>               Override config file password
 -retentionPeriod <arg>             Specify logs table retention period
 -servers <serverList>              a comma-separated list of servers used
                                    in location validation in the format
                                    of scheme://authority (e.g.
                                    hdfs://localhost:8000)
 -toCatalog <arg>                   Catalog a moving database or table is
                                    going to.  This is required if you are
                                    moving a database or table.
 -toDatabase <arg>                  Database a moving table is going to.
                                    This is required if you are moving a
                                    table.
 -upgradeSchema                     Schema upgrade
 -upgradeSchemaFrom <upgradeFrom>   Schema upgrade from a version
 -url <url>                         connection url to the database
 -userName <user>                   Override config file user name
 -validate                          Validate the database
 -verbose                           only print SQL statements
 -yes                               Don't ask for confirmation when using
                                    -dropAllDatabases.  

The dbType is required and can be one of:

No Format
 derby|mysql|postgres|oracle|mssql|hive

Version: The dbType "mssql" was added in Hive 0.13.1 with HIVE-6862.

Note: dbType=hive can be used only on the Hive sys schema. The other values are metastore database types. When dbType=hive is used, metaDbType must also be set.
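
The rule above (dbType is mandatory, and dbType=hive additionally needs metaDbType) can be checked before schematool is ever called. The following is only a minimal sketch: `schematool_db_args` is a hypothetical helper name, not part of Hive, and it only prints the options it would pass.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hive): validates the dbType/metaDbType
# combination and prints the matching schematool options.
schematool_db_args() {
  db_type="$1"
  meta_db_type="${2:-}"
  case "$db_type" in
    derby|mysql|postgres|oracle|mssql)
      # Plain metastore database types need no extra option.
      printf '%s\n' "-dbType $db_type"
      ;;
    hive)
      # dbType=hive targets the Hive sys schema and needs metaDbType too.
      if [ -z "$meta_db_type" ]; then
        echo "dbType=hive requires metaDbType" >&2
        return 1
      fi
      printf '%s\n' "-dbType hive -metaDbType $meta_db_type"
      ;;
    *)
      echo "unsupported dbType: $db_type" >&2
      return 1
      ;;
  esac
}
```

A call such as `schematool_db_args hive derby` prints `-dbType hive -metaDbType derby`, while `schematool_db_args hive` fails with an error message on stderr.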

Usage Examples

  • Initialize to current schema for a new Hive setup:

    No Format
    $ schematool -dbType derby -initSchema
    Initializing the schema to: 4.0.0-beta-2
    Metastore connection URL:        jdbc:derby:;databaseName=metastore_db;create=true
    Metastore connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
    Metastore connection User:       APP
    Starting metastore schema initialization to 4.0.0-beta-2
    Initialization script hive-schema-4.0.0-beta-2.derby.sql
    Initialization script completed
    schemaTool completed


  • Get schema information:

    No Format
    $ schematool -dbType derby -info
    Metastore connection URL:        jdbc:derby:;databaseName=metastore_db;create=true
    Metastore connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
    Metastore connection User:       APP
    Hive distribution version:       4.0.0-beta-2
    Metastore schema version:        4.0.0-beta-2
    schemaTool completed

  • Attempt to get schema information with an older metastore:

    No Format
    $ schematool -dbType derby -info
    Metastore connection URL:        jdbc:derby:;databaseName=metastore_db;create=true
    Metastore Connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
    Metastore connection User:       APP
    Hive distribution version:       0.13.0
    org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
    *** schemaTool failed ***

    Since the older metastore doesn't store the version information, the tool reports an error retrieving it.

  • Initialize the schema to a given version:

    No Format
    $ schematool -dbType derby -initSchemaTo 3.1.0
    Metastore connection URL:        jdbc:derby:;databaseName=metastore_db;create=true
    Metastore connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
    Metastore connection User:       APP
    Starting metastore schema initialization to 3.1.0
    Initialization script hive-schema-3.1.0.derby.sql



  • Upgrade schema from a 3.1.0 release by specifying the 'from' version:

    No Format
    $ schematool -dbType derby -upgradeSchemaFrom 3.1.0
    Upgrading from the user input version 3.1.0
    Metastore connection URL:        jdbc:derby:;databaseName=metastore_db;create=true
    Metastore connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
    Metastore connection User:       APP
    Starting upgrade metastore schema from version 3.1.0 to 4.0.0-beta-2
    Upgrade script upgrade-3.1.0-to-3.2.0.derby.sql
    Completed upgrade-3.1.0-to-3.2.0.derby.sql
    ...
    Completed upgrade-4.0.0-beta-1-to-4.0.0-beta-2.derby.sql
    schemaTool completed


  • Upgrade dry run can be used to list the required scripts for the given upgrade.

    No Format
    $ build/dist/bin/schematool -dbType derby -upgradeSchemaFrom 3.1.0 -dryRun
    Upgrading from the user input version 3.1.0
    Metastore connection URL:        jdbc:derby:;databaseName=metastore_db;create=true
    Metastore connection Driver :    org.apache.derby.jdbc.EmbeddedDriver
    Metastore connection User:       APP
    Starting upgrade metastore schema from version 3.1.0 to 4.0.0-beta-2
    Upgrade script upgrade-3.1.0-to-3.2.0.derby.sql
    Upgrade script upgrade-3.2.0-to-4.0.0-alpha-1.derby.sql
    Upgrade script upgrade-4.0.0-alpha-1-to-4.0.0-alpha-2.derby.sql
    Upgrade script upgrade-4.0.0-alpha-2-to-4.0.0-beta-1.derby.sql
    Upgrade script upgrade-4.0.0-beta-1-to-4.0.0-beta-2.derby.sql
    schemaTool completed

    This is useful if you just want to find out all the required scripts for the schema upgrade.

  • Initialize the Hive sys schema:

    No Format
    $ ./schematool -dbType hive -metaDbType derby -initSchema  --verbose -url="jdbc:hive2://localhost:10000"

    Note 1: As 

  • Move a database and the tables under it from the default Hive catalog to a custom Spark catalog:

    No Format
    build/dist/bin/schematool -moveDatabase db1 -fromCatalog hive -toCatalog spark
    
    


  • Move a table from the Hive catalog to the Spark catalog:

    No Format
    # Create the desired target database in spark catalog if it doesn't already exist.
    beeline ... -e "create database if not exists newdb";
    schematool -moveDatabase newdb -fromCatalog hive -toCatalog spark
    
    # Now move the table to target db under the spark catalog.
    schematool -moveTable table1 -fromCatalog hive -toCatalog spark  -fromDatabase db1 -toDatabase newdb
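
Two of the examples above lend themselves to small shell wrappers. The sketch below is illustrative only: `list_upgrade_scripts` and `move_table_cmd` are hypothetical helper names, not part of Hive. The first filters dry-run output down to the script file names, assuming the "Upgrade script ..." line format shown above. The second prints (but does not run) a -moveTable command and refuses to build it unless the table name, both catalogs, and both databases are supplied.

```shell
#!/bin/sh
# Hypothetical helper: reads schematool -dryRun output on stdin and prints
# only the upgrade script file names, one per line.
list_upgrade_scripts() {
  sed -n 's/^[[:space:]]*Upgrade script[[:space:]]\{1,\}//p'
}

# Hypothetical helper: prints the schematool command for moving a table.
# All five values are required, mirroring the -moveTable help text.
move_table_cmd() {
  if [ "$#" -ne 5 ]; then
    echo "usage: move_table_cmd TABLE FROM_CATALOG TO_CATALOG FROM_DB TO_DB" >&2
    return 1
  fi
  for arg in "$@"; do
    if [ -z "$arg" ]; then
      echo "move_table_cmd: empty argument" >&2
      return 1
    fi
  done
  printf 'schematool -moveTable %s -fromCatalog %s -toCatalog %s -fromDatabase %s -toDatabase %s\n' \
    "$1" "$2" "$3" "$4" "$5"
}
```

For instance, `schematool -dbType derby -upgradeSchemaFrom 3.1.0 -dryRun | list_upgrade_scripts` would list just the script names, and `move_table_cmd table1 hive spark db1 newdb` prints the same command as the last example so it can be reviewed before being run.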