Comment: Migrated to Confluence 5.3

This document is a work in progress.

Overview

There are presently three ways to issue HCatalog DDL commands:

  1. Command line interface
  2. Templeton REST APIs (upcoming)
  3. HiveMetaStore Client

Presently, Java developers go through the Hive metastore (HMS) client interface to issue HCatalog DDL commands. Though the HMS client interface is publicly accessible, it is not intended for public use. According to the Hive user mailing list, the HMS client is not a public API and is subject to change in the future. It would therefore be a good idea to have Java APIs in HCatalog that protect users from changes made to the Hive metastore client. Under the covers, either the Templeton REST APIs or the Hive metastore client can be used to provide end users with the required data.

Design


New Classes

HCatClient

The HCatClient is an abstract class containing the APIs for all permitted HCatalog DDL commands. The implementation class will be provided as a configuration property, which will be used by the "create" method. In this way, the implementation details will be hidden from users.

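The configuration-driven lookup described above could be sketched roughly as follows. This is a minimal illustration, not the HCatalog implementation: java.util.Properties stands in for the Hadoop Configuration class, and the property key and the class names SketchClient, DefaultSketchClient and FactoryDemo are hypothetical, since this document does not name the real property.

```java
import java.util.Properties;

/** Stand-in for the abstract client; only the factory mechanics are shown. */
abstract class SketchClient {
    // Hypothetical property key (assumption); the design does not name the real one.
    static final String IMPL_KEY = "sketch.client.impl.class";

    /** Instantiates the implementation class named in conf, then initializes it. */
    static SketchClient create(Properties conf) {
        String implName = conf.getProperty(IMPL_KEY, DefaultSketchClient.class.getName());
        try {
            Class<? extends SketchClient> impl =
                    Class.forName(implName).asSubclass(SketchClient.class);
            SketchClient client = impl.getDeclaredConstructor().newInstance();
            client.initialize(conf);
            return client;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create client " + implName, e);
        }
    }

    abstract void initialize(Properties conf);

    abstract String describe();
}

/** One possible implementation; callers never refer to it directly. */
class DefaultSketchClient extends SketchClient {
    private String endpoint;

    @Override
    void initialize(Properties conf) {
        endpoint = conf.getProperty("sketch.endpoint", "local");
    }

    @Override
    String describe() {
        return "DefaultSketchClient@" + endpoint;
    }
}

class FactoryDemo {
    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(SketchClient.IMPL_KEY, DefaultSketchClient.class.getName());
        conf.setProperty("sketch.endpoint", "metastore-host");
        System.out.println(SketchClient.create(conf).describe());
    }
}
```

Because callers only see the abstract type, switching between a metastore-backed and a REST-backed implementation would, under this approach, be a configuration change rather than a code change.
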
Code Block
import java.io.IOException;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.metastore.api.PartitionEventType;

/**
 * The abstract class HCatClient containing APIs for HCatalog DDL commands.
 */
public abstract class HCatClient {

    /**
     * Creates an instance of HCatClient.
     *
     * @param conf An instance of configuration.
     * @return An instance of HCatClient.
     * @throws IOException
     */
    public static HCatClient create(Configuration conf) throws IOException {
        // Obtain details of the implementation class and return an instance.
        HCatClient client = HCatUtil.getHCatClient(conf);
        if (client != null) {
            client.initialize(conf);
        }
        return client;
    }

    abstract void initialize(Configuration conf) throws HCatException;

    /**
     * Gets all existing databases that match the given pattern.
     * The matching occurs as per Java regular expressions.
     *
     * @param pattern Java regular expression pattern.
     * @return The list of database names.
     * @throws HCatException
     */
    public abstract List<String> listDatabaseNamesByPattern(String pattern) throws HCatException;

    /**
     * Gets the database.
     *
     * @param dbName The name of the database.
     * @return An instance of HCatDatabase.
     * @throws HCatException
     */
    public abstract HCatDatabase getDatabase(String dbName) throws HCatException;

    /**
     * Creates the database.
     *
     * @param dbInfo An instance of HCatCreateDBDesc.
     * @throws HCatException
     */
    public abstract void createDatabase(HCatCreateDBDesc dbInfo)
            throws HCatException;

    /**
     * Drops a database.
     *
     * @param dbName The name of the database to delete.
     * @param ifExists Hive returns an error if the database specified does not exist,
     *                 unless ifExists is set to true.
     * @param mode This is set to either "restrict" or "cascade". Restrict will
     *             remove the schema if all the tables are empty. Cascade removes
     *             everything including data and definitions.
     * @throws HCatException
     */
    public abstract void dropDatabase(String dbName, boolean ifExists, String mode) throws HCatException;

    /**
     * Returns all existing tables from the specified database which match the given
     * pattern. The matching occurs as per Java regular expressions.
     *
     * @param dbName The name of the database.
     * @param tablePattern Java regular expression pattern.
     * @return The list of table names.
     * @throws HCatException
     */
    public abstract List<String> listTableNamesByPattern(String dbName, String tablePattern)
            throws HCatException;

    /**
     * Gets the table.
     *
     * @param dbName The name of the database.
     * @param tableName The name of the table.
     * @return An instance of HCatTable.
     * @throws HCatException
     */
    public abstract HCatTable getTable(String dbName, String tableName)
            throws HCatException;

    /**
     * Creates the table.
     *
     * @param createTableDesc An instance of the HCatCreateTableDesc class.
     * @throws HCatException
     */
    public abstract void createTable(HCatCreateTableDesc createTableDesc)
            throws HCatException;

    /**
     * Creates the table like an existing table.
     *
     * @param dbName The name of the database.
     * @param existingTblName The name of the existing table.
     * @param newTableName The name of the new table.
     * @param ifNotExists If true, the error raised when the table already exists is skipped.
     * @param isExternal Set to "true" if the table has to be created at a
     *                   location other than the default.
     * @param location The location for the table.
     * @throws HCatException
     */
    public abstract void createTableLike(String dbName, String existingTblName,
            String newTableName, boolean ifNotExists, boolean isExternal,
            String location) throws HCatException;

    /**
     * Drops a table.
     *
     * @param dbName The name of the database.
     * @param tableName The name of the table.
     * @param ifExists Hive returns an error if the table specified does not exist,
     *                 unless ifExists is set to true.
     * @throws HCatException
     */
    public abstract void dropTable(String dbName, String tableName,
            boolean ifExists) throws HCatException;

    /**
     * Renames a table.
     *
     * @param dbName The name of the database.
     * @param oldName The name of the table to be renamed.
     * @param newName The new name of the table.
     * @throws HCatException
     */
    public abstract void renameTable(String dbName, String oldName, String newName) throws HCatException;

    /**
     * Gets all the partitions.
     *
     * @param dbName The name of the database.
     * @param tblName The name of the table.
     * @return A list of partitions.
     * @throws HCatException
     */
    public abstract List<HCatPartition> getPartitions(String dbName, String tblName)
            throws HCatException;

    /**
     * Gets the partition.
     *
     * @param dbName The database name.
     * @param tableName The table name.
     * @param partitionName The partition name, a comma-separated list of col_name='value'.
     * @return An instance of HCatPartition.
     * @throws HCatException
     */
    public abstract HCatPartition getPartition(String dbName, String tableName,
            String partitionName) throws HCatException;

    /**
     * Adds the partition.
     *
     * @param partInfo An instance of HCatAddPartitionDesc.
     * @throws HCatException
     */
    public abstract void addPartition(HCatAddPartitionDesc partInfo) throws HCatException;

    /**
     * Drops a partition.
     *
     * @param dbName The database name.
     * @param tableName The table name.
     * @param partitionName The partition name, a comma-separated list of col_name='value'.
     * @param ifExists Hive returns an error if the partition specified does not exist,
     *                 unless ifExists is set to true.
     * @throws HCatException
     */
    public abstract void dropPartition(String dbName, String tableName,
            String partitionName, boolean ifExists) throws HCatException;

    /**
     * Lists partitions by filter.
     *
     * @param dbName The database name.
     * @param tblName The table name.
     * @param filter The filter string, for example
     *               "part1 = \"p1_abc\" and part2 <= \"p2_test\"". Filtering can
     *               be done only on string partition keys.
     * @return The list of partitions.
     * @throws HCatException
     */
    public abstract List<HCatPartition> listPartitionsByFilter(String dbName, String tblName,
            String filter) throws HCatException;

    /**
     * Marks a partition for an event.
     *
     * @param dbName The database name.
     * @param tblName The table name.
     * @param partKVs The partition key-value pairs.
     * @param eventType The event type.
     * @throws HCatException
     */
    public abstract void markPartitionForEvent(String dbName, String tblName,
            Map<String, String> partKVs, PartitionEventType eventType)
            throws HCatException;

    /**
     * Checks if a partition is marked for an event.
     *
     * @param dbName The database name.
     * @param tblName The table name.
     * @param partKVs The partition key-value pairs.
     * @param eventType The event type.
     * @return true, if the partition is marked for the event.
     * @throws HCatException
     */
    public abstract boolean isPartitionMarkedForEvent(String dbName, String tblName,
            Map<String, String> partKVs, PartitionEventType eventType)
            throws HCatException;

    /**
     * Gets the delegation token.
     *
     * @param owner The owner.
     * @param renewerKerberosPrincipalName The Kerberos principal name of the renewer.
     * @return The delegation token.
     * @throws HCatException
     */
    public abstract String getDelegationToken(String owner, String renewerKerberosPrincipalName)
            throws HCatException;

    /**
     * Renews the delegation token.
     *
     * @param tokenStrForm The token in string form.
     * @return the long
     * @throws HCatException
     */
    public abstract long renewDelegationToken(String tokenStrForm) throws HCatException;

    /**
     * Cancels the delegation token.
     *
     * @param tokenStrForm The token in string form.
     * @throws HCatException
     */
    public abstract void cancelDelegationToken(String tokenStrForm) throws HCatException;

    /**
     * Closes the HCatalog client.
     *
     * @throws HCatException
     */
    public abstract void close() throws HCatException;

}

HCatCommandDesc

This is an abstract class that helps in validating user input and in building valid command descriptors and queries.

Code Block

/**
 * The class HCatCommandDesc contains methods which help in validating and
 * building command descriptors and queries.
 */
public abstract class HCatCommandDesc {

    public abstract void validateCommandDesc() throws HCatException;
    abstract String buildQuery() throws HCatException;
    abstract boolean isValidationComplete();

}
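
The validate-then-build contract of HCatCommandDesc could look roughly like the sketch below. It is an illustration under stated assumptions rather than the actual class: CommandDescSketch and CreateDbDescSketch are hypothetical names, unchecked exceptions stand in for HCatException, the name check is an arbitrary example rule, and the generated statement is only a simplified "create database" command.

```java
/** Hypothetical stand-in for HCatCommandDesc: validate first, then build the query. */
abstract class CommandDescSketch {
    private boolean validated;

    /** Checks user input; throws if a required field is missing or malformed. */
    final void validateCommandDesc() {
        validate();
        validated = true;
    }

    boolean isValidationComplete() {
        return validated;
    }

    /** Builds the DDL text; refuses to run on a descriptor that was never validated. */
    final String buildQuery() {
        if (!validated) {
            throw new IllegalStateException("descriptor has not been validated");
        }
        return render();
    }

    protected abstract void validate();

    protected abstract String render();
}

/** Hypothetical descriptor for a "create database" command. */
class CreateDbDescSketch extends CommandDescSketch {
    private final String dbName;
    private final boolean ifNotExists;

    CreateDbDescSketch(String dbName, boolean ifNotExists) {
        this.dbName = dbName;
        this.ifNotExists = ifNotExists;
    }

    @Override
    protected void validate() {
        // Example rule (assumption): names are limited to letters, digits and underscores.
        if (dbName == null || !dbName.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("invalid database name: " + dbName);
        }
    }

    @Override
    protected String render() {
        return "CREATE DATABASE " + (ifNotExists ? "IF NOT EXISTS " : "") + dbName;
    }
}

class CommandDescDemo {
    public static void main(String[] args) {
        CreateDbDescSketch desc = new CreateDbDescSketch("db1", true);
        desc.validateCommandDesc();
        System.out.println(desc.buildQuery());
    }
}
```

Keeping validation separate from query building lets a client reject bad input before anything is sent to the metastore.
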
HCatCreateTableDesc

This class is a subclass of HCatCommandDesc and will be used to create a descriptor and validate it for the "create table" command.

HCatCreateDBDesc

This class is a subclass of HCatCommandDesc and will be used to create a descriptor and validate it for the "create database" command.

[Image: createdb.png]

HCatAddPartitionDesc

This class is a subclass of HCatCommandDesc and will be used to create a descriptor and validate it for the "add partition" command.


HCatTable

This class encapsulates the table information returned by the HCatClient implementation class and provides a uniform view to the user.


HCatDatabase

This class encapsulates the database information returned by the HCatClient implementation class and provides a uniform view to the user.


HCatPartition

This class encapsulates the partition information returned by the HCatClient implementation class and provides a uniform view to the user.
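
The partition methods of HCatClient take the partition name as a comma-separated list of col_name='value' pairs. A minimal sketch of how such a string might be turned into key-value pairs is shown below. The class PartitionSpecSketch and its parse method are hypothetical, and the accepted syntax (single-quoted values without embedded quotes, optional whitespace) is an assumption rather than something this design specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PartitionSpecSketch {
    // One col_name='value' pair, with optional surrounding whitespace (assumed syntax).
    private static final Pattern PAIR =
            Pattern.compile("\\s*(\\w+)\\s*=\\s*'([^']*)'\\s*");

    /** Parses "k1='v1',k2='v2'" into an ordered map; throws on malformed input. */
    static Map<String, String> parse(String spec) {
        Map<String, String> parts = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(spec);
        int pos = 0;
        while (true) {
            m.region(pos, spec.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException(
                        "malformed partition spec at offset " + pos + ": " + spec);
            }
            parts.put(m.group(1), m.group(2));
            pos = m.end();
            if (pos == spec.length()) {
                break;
            }
            if (spec.charAt(pos) != ',') {
                throw new IllegalArgumentException(
                        "expected ',' at offset " + pos + ": " + spec);
            }
            pos++; // skip the comma and expect another pair
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(parse("dt='2012-01-01', country='us'"));
    }
}
```
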


Usage

Code Block
 Configuration config = new Configuration();
 // Load the Hive settings from hive-site.xml on the classpath.
 config.addResource("hive-site.xml");
 HCatClient client = HCatClient.create(config);

 // Describe the columns of the new table.
 String db = "db1";
 List<HCatFieldSchema> cols = new ArrayList<HCatFieldSchema>();
 cols.add(new HCatFieldSchema("id", Type.INT, "id column"));
 cols.add(new HCatFieldSchema("value", Type.STRING, "value column"));

 // Build the descriptor and create the table.
 HCatCreateTableDesc tableDesc = HCatCreateTableDesc.create(db, "testtable", cols).fileFormat("rcfile").build();
 client.createTable(tableDesc);
 client.close();
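
The pattern-based listing methods of HCatClient are described as matching per Java regular expressions. Assuming that a name must match the whole expression, the selection could be pictured with the sketch below. NamePatternSketch and filterByPattern are hypothetical helpers that filter an in-memory list; they do not contact a metastore and are not part of the proposed API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class NamePatternSketch {
    /** Returns the names that match the whole regular expression, in input order. */
    static List<String> filterByPattern(List<String> names, String regex) {
        Pattern pattern = Pattern.compile(regex);
        List<String> matched = new ArrayList<>();
        for (String name : names) {
            if (pattern.matcher(name).matches()) {
                matched.add(name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("sales_2011", "sales_2012", "customers");
        System.out.println(filterByPattern(tables, "sales_.*"));
        System.out.println(filterByPattern(tables, ".*"));
    }
}
```

Under Java regular expression rules, the expression that selects every name is ".*"; a bare "*" is not a valid pattern and is rejected by Pattern.compile.
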

Discussion Topics