Overview

There are presently three ways to issue HCatalog DDL commands:

  1. Command line interface
  2. REST APIs (upcoming)
  3. HiveMetaStore Client

Presently, Java developers go through the Hive metastore (HMS) client interface to issue HCatalog DDL commands. Though the HMS client interface is public, it is not intended for public users: according to the Hive user mailing list, the HMS client is not a public API and is subject to change in the future. It is therefore a good idea to have Java APIs in HCatalog that protect users from changes made to the Hive metastore client. Under the covers, either the REST APIs or the Hive metastore client can be used to provide end users with the required data.

Design

– This document is a work in progress.

Overview

Templeton provides a REST-like web API for HCatalog and related Hadoop components. Developers can make HTTP requests to the Templeton web server to execute HCatalog DDL commands. With the REST APIs in place for HCatalog DDL commands, it is desirable to have Java APIs in HCatalog that let end users execute DDL commands without using the CLI.
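
As a rough illustration of what such an HTTP request looks like from Java, the sketch below only builds (and does not send) a request that lists databases. The endpoint path and the user.name and like parameters are assumptions based on the Templeton documentation and should be verified against the deployed version; the host, port, user, and the class name TempletonRequestSketch are placeholders.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class TempletonRequestSketch {

    // Builds the URI for listing databases whose names match a pattern.
    // The path and parameter names are assumptions; nothing is sent here.
    static URI listDatabasesUri(String host, int port, String user, String like) {
        String query = "user.name=" + URLEncoder.encode(user, StandardCharsets.UTF_8)
                + "&like=" + URLEncoder.encode(like, StandardCharsets.UTF_8);
        return URI.create("http://" + host + ":" + port
                + "/templeton/v1/ddl/database?" + query);
    }

    public static void main(String[] args) {
        URI uri = listDatabasesUri("localhost", 50111, "alice", "sales*");
        // A GET request object; sending it would require a running server.
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        System.out.println(request.method() + " " + request.uri());
    }
}
```

A Java API in HCatalog would hide this kind of URL construction behind typed method calls.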

...

New Classes

HCatClient

The HCatClient is an abstract class containing the APIs for all permitted HCatalog DDL commands. The implementation class will be provided as a configuration property, which the "create" method uses to instantiate the client. In this way, the implementation details are hidden from users.

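The configuration-driven factory described above can be sketched as follows. This is a minimal illustration, not the HCatalog implementation: it uses java.util.Properties in place of the Hadoop Configuration, and the names SketchClient, InMemoryClient, and the property key sketch.client.impl are all hypothetical.

```java
import java.util.Properties;

// Hypothetical stand-in for the abstract client class.
abstract class SketchClient {
    // Assumed property key; the design does not name the real one.
    static final String IMPL_KEY = "sketch.client.impl";

    // Reads the implementation class name from the configuration,
    // instantiates it reflectively, and initializes it.
    static SketchClient create(Properties conf) {
        String implName = conf.getProperty(IMPL_KEY);
        if (implName == null) {
            throw new IllegalStateException("Missing property " + IMPL_KEY);
        }
        try {
            SketchClient client = Class.forName(implName)
                    .asSubclass(SketchClient.class)
                    .getDeclaredConstructor()
                    .newInstance();
            client.initialize(conf);
            return client;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load " + implName, e);
        }
    }

    abstract void initialize(Properties conf);

    abstract String describe();
}

// One possible backend; callers never reference this class by name.
class InMemoryClient extends SketchClient {
    private String endpoint;

    @Override
    void initialize(Properties conf) {
        endpoint = conf.getProperty("sketch.endpoint", "local");
    }

    @Override
    String describe() {
        return "in-memory client for " + endpoint;
    }
}

class ClientFactoryDemo {
    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(SketchClient.IMPL_KEY, InMemoryClient.class.getName());
        conf.setProperty("sketch.endpoint", "metastore-1");
        System.out.println(SketchClient.create(conf).describe());
    }
}
```

Switching backends then means changing one property value; calling code stays the same.
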
Code Block

import java.io.IOException;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;

/**
 * The abstract class HCatClient containing APIs for HCatalog DDL commands.
 */
public abstract class HCatClient {

    /**
     * Creates an instance of HCatClient.
     *
     * @param conf An instance of configuration.
     * @return An instance of HCatClient.
     * @throws IOException
     */
    public static HCatClient create(Configuration conf) throws IOException {
        HCatClient client = HCatUtil.getHCatClient(conf);
        if (client != null) {
            client.initialize(conf);
        }
        return client;
    }

    abstract void initialize(Configuration conf) throws HCatException;

    /**
     * Get all existing databases that match the given
     * pattern. The matching occurs as per Java regular expressions.
     *
     * @param pattern Java regular expression pattern.
     * @return list of database names
     * @throws HCatException
     */
    public abstract List<String> listDatabaseNamesByPattern(String pattern) throws HCatException;

    /**
     * Gets the database.
     *
     * @param dbName The name of the database.
     * @return An instance of HCatDatabase.
     * @throws HCatException
     */
    public abstract HCatDatabase getDatabase(String dbName) throws HCatException;

    /**
     * Creates the database.
     *
     * @param dbInfo An instance of HCatCreateDBDesc.
     * @throws HCatException
     */
    public abstract void createDatabase(HCatCreateDBDesc dbInfo)
            throws HCatException;

    /**
     * Drops a database.
     *
     * @param dbName The name of the database to delete.
     * @param ifExists Hive returns an error if the database specified does not exist,
     *                 unless ifExists is set to true.
     * @param mode This is set to either "restrict" or "cascade". Restrict will
     *             remove the schema if all the tables are empty. Cascade removes
     *             everything including data and definitions.
     * @throws HCatException
     */
    public abstract void dropDatabase(String dbName, boolean ifExists, String mode) throws HCatException;

    /**
     * Returns all existing tables from the specified database which match the given
     * pattern. The matching occurs as per Java regular expressions.
     *
     * @param dbName The name of the database.
     * @param tablePattern Java regular expression pattern.
     * @return list of table names
     * @throws HCatException
     */
    public abstract List<String> listTableNamesByPattern(String dbName, String tablePattern)
            throws HCatException;

    /**
     * Gets the table.
     *
     * @param dbName The name of the database.
     * @param tableName The name of the table.
     * @return An instance of HCatTable.
     * @throws HCatException
     */
    public abstract HCatTable getTable(String dbName, String tableName)
            throws HCatException;

    /**
     * Creates the table.
     *
     * @param createTableDesc An instance of HCatCreateTableDesc class.
     * @throws HCatException
     */
    public abstract void createTable(HCatCreateTableDesc createTableDesc)
            throws HCatException;

    /**
     * Creates the table like an existing table.
     *
     * @param dbName The name of the database.
     * @param existingTblName The name of the existing table.
     * @param newTableName The name of the new table.
     * @param ifNotExists If true, the error raised when the table already exists is suppressed.
     * @param isExternal Set to true if the table has to be created at a location
     *                   other than the default.
     * @param location The location for the table.
     * @throws HCatException
     */
    public abstract void createTableLike(String dbName, String existingTblName,
            String newTableName, boolean ifNotExists, boolean isExternal,
            String location) throws HCatException;

    /**
     * Drops a table.
     *
     * @param dbName The name of the database.
     * @param tableName The name of the table.
     * @param ifExists Hive returns an error if the table specified does not exist,
     *                 unless ifExists is set to true.
     * @throws HCatException
     */
    public abstract void dropTable(String dbName, String tableName, boolean ifExists) throws HCatException;

    /**
     * Renames a table.
     *
     * @param dbName The name of the database.
     * @param oldName The name of the table to be renamed.
     * @param newName The new name of the table.
     * @throws HCatException
     */
    public abstract void renameTable(String dbName, String oldName, String newName) throws HCatException;

    /**
     * Gets all the partitions.
     *
     * @param dbName The name of the database.
     * @param tblName The name of the table.
     * @return A list of partitions.
     * @throws HCatException
     */
    public abstract List<HCatPartition> getPartitions(String dbName, String tblName)
            throws HCatException;

    /**
     * Gets the partition.
     *
     * @param dbName The database name.
     * @param tableName The table name.
     * @param partitionName The partition name, a comma-separated list of col_name='value'.
     * @return An instance of HCatPartition.
     * @throws HCatException
     */
    public abstract HCatPartition getPartition(String dbName, String tableName,
            String partitionName) throws HCatException;

    /**
     * Adds the partition.
     *
     * @param partInfo An instance of HCatAddPartitionDesc.
     * @throws HCatException
     */
    public abstract void addPartition(HCatAddPartitionDesc partInfo) throws HCatException;

    /**
     * Drops a partition.
     *
     * @param dbName The database name.
     * @param tableName The table name.
     * @param partitionName The partition name, a comma-separated list of col_name='value'.
     * @param ifExists Hive returns an error if the partition specified does not exist,
     *                 unless ifExists is set to true.
     * @throws HCatException
     */
    public abstract void dropPartition(String dbName, String tableName,
            String partitionName, boolean ifExists) throws HCatException;

    /**
     * List partitions by filter.
     *
     * @param dbName The database name.
     * @param tblName The table name.
     * @param filter The filter string,
     *    for example "part1 = \"p1_abc\" and part2 <= \"p2_test\"". Filtering can
     *    be done only on string partition keys.
     * @return list of partitions
     * @throws HCatException the h cat exception
     */
    public abstract List<HCatPartition> listPartitionsByFilter(String dbName, String tblName,
            String filter) throws HCatException;

    /**
     * Mark partition for event.
     *
     * @param dbName The database name.
     * @param tblName The table name.
     * @param partKVs The partition key-value pairs.
     * @param eventType The event type.
     * @throws HCatException
     */
    public abstract void markPartitionForEvent(String dbName, String tblName,
            Map<String, String> partKVs, PartitionEventType eventType)
            throws HCatException;

    /**
     * Checks whether the partition is marked for an event.
     *
     * @param dbName The database name.
     * @param tblName The table name.
     * @param partKVs The partition key-value pairs.
     * @param eventType The event type.
     * @return true, if the partition is marked for the event
     * @throws HCatException
     */
    public abstract boolean isPartitionMarkedForEvent(String dbName, String tblName,
            Map<String, String> partKVs, PartitionEventType eventType)
            throws HCatException;

    /**
     * Gets the delegation token.
     *
     * @param owner the owner
     * @param renewerKerberosPrincipalName the renewer kerberos principal name
     * @return the delegation token
     * @throws HCatException the h cat exception
     */
    public abstract String getDelegationToken(String owner, String renewerKerberosPrincipalName) throws
        HCatException;

    /**
     * Renew delegation token.
     *
     * @param tokenStrForm the token str form
     * @return the long
     * @throws HCatException the h cat exception
     */
    public abstract long renewDelegationToken(String tokenStrForm) throws HCatException;

    /**
     * Cancel delegation token.
     *
     * @param tokenStrForm the token str form
     * @throws HCatException the h cat exception
     */
    public abstract void cancelDelegationToken(String tokenStrForm) throws HCatException;

    /**
     * Closes the HCatalog client.
     *
     * @throws HCatException
     */
    public abstract void close() throws HCatException;

}

HCatCommandDesc

This is an abstract class that helps in validating user input and in building valid command descriptors and queries.

Code Block

/**
 * The class HCatCommandDesc contains methods which help in validating
 * and building command descriptors and queries.
 */
public abstract class HCatCommandDesc {

    public abstract void validateCommandDesc() throws HCatException;
    abstract String buildQuery() throws HCatException;
    abstract boolean isValidationComplete();

}

...

HCatCreateTableDesc

This class is a subclass of HCatCommandDesc. Users use it to create and validate a descriptor for the "create table" command.
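
To make the descriptor idea concrete, here is a minimal sketch of a descriptor that validates its input before building a query. The class names SketchCommandDesc and SketchCreateTableDesc, the addColumn method, and the generated statement format are illustrative assumptions, not the HCatalog classes themselves.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical base class: validate first, then build the query.
abstract class SketchCommandDesc {
    abstract void validate();

    abstract String buildQuery();
}

// Hypothetical descriptor for a "create table" command.
class SketchCreateTableDesc extends SketchCommandDesc {
    private final String dbName;
    private final String tableName;
    // LinkedHashMap keeps the columns in insertion order.
    private final Map<String, String> columns = new LinkedHashMap<>();

    SketchCreateTableDesc(String dbName, String tableName) {
        this.dbName = dbName;
        this.tableName = tableName;
    }

    SketchCreateTableDesc addColumn(String name, String type) {
        columns.put(name, type);
        return this;
    }

    @Override
    void validate() {
        if (dbName == null || dbName.isEmpty()) {
            throw new IllegalArgumentException("database name is required");
        }
        if (tableName == null || tableName.isEmpty()) {
            throw new IllegalArgumentException("table name is required");
        }
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
    }

    @Override
    String buildQuery() {
        validate(); // never build a query from an invalid descriptor
        StringJoiner cols = new StringJoiner(", ", "(", ")");
        for (Map.Entry<String, String> c : columns.entrySet()) {
            cols.add(c.getKey() + " " + c.getValue());
        }
        return "CREATE TABLE " + dbName + "." + tableName + " " + cols;
    }
}
```

For example, a descriptor for db1.demo_table with the columns id (int) and value (string) would build the statement CREATE TABLE db1.demo_table (id int, value string), while a descriptor without columns is rejected during validation.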

HCatCreateDBDesc

This class is a subclass of HCatCommandDesc. Users use it to create and validate a descriptor for the "create database" command.

[Image: createdb.png]

HCatAddPartitionDesc

This class is a subclass of HCatCommandDesc. Users use it to create and validate a descriptor for the "add partition" command.

HCatTable

This class encapsulates the table information returned by the HCatClient implementation class and provides a uniform view to the user.

HCatDatabase

This class encapsulates the database information returned by the HCatClient implementation class and provides a uniform view to the user.

HCatPartition

This class encapsulates the partition information returned by the HCatClient implementation class and provides a uniform view to the user.
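
The "uniform view" idea shared by HCatTable, HCatDatabase, and HCatPartition could look roughly like the sketch below: one immutable value class with an adapter per backend. SketchTable, MetastoreRecord, and the map keys used for the REST-style response are hypothetical and only serve the illustration.

```java
import java.util.List;
import java.util.Map;

// Hypothetical backend-specific record, e.g. what a metastore client returns.
final class MetastoreRecord {
    final String db;
    final String tbl;
    final String[] cols;

    MetastoreRecord(String db, String tbl, String... cols) {
        this.db = db;
        this.tbl = tbl;
        this.cols = cols;
    }
}

// Hypothetical uniform view of a table, independent of the backend.
final class SketchTable {
    private final String dbName;
    private final String tableName;
    private final List<String> columnNames;

    private SketchTable(String dbName, String tableName, List<String> columnNames) {
        this.dbName = dbName;
        this.tableName = tableName;
        this.columnNames = List.copyOf(columnNames); // defensive, immutable copy
    }

    // Adapter for a REST-style response, modelled here as a map with assumed keys.
    @SuppressWarnings("unchecked")
    static SketchTable fromRestResponse(Map<String, Object> json) {
        return new SketchTable((String) json.get("database"),
                (String) json.get("table"),
                (List<String>) json.get("columns"));
    }

    // Adapter for the metastore-style record.
    static SketchTable fromMetastoreRecord(MetastoreRecord r) {
        return new SketchTable(r.db, r.tbl, List.of(r.cols));
    }

    String getDbName() { return dbName; }

    String getTableName() { return tableName; }

    List<String> getColumnNames() { return columnNames; }

    @Override
    public String toString() {
        return dbName + "." + tableName + columnNames;
    }
}
```

Whichever backend produced the data, the caller sees the same getters and the same string form.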

Usage

Code Block

// Load the Hive configuration and obtain a client.
Configuration config = new Configuration();
config.addResource("hive-site.xml");
HCatClient client = HCatClient.create(config);

// Describe the columns of the new table.
String db = "db1";
ArrayList<HCatFieldSchema> cols = new ArrayList<HCatFieldSchema>();
cols.add(new HCatFieldSchema("id", Type.INT, "id columns"));
cols.add(new HCatFieldSchema("value", Type.STRING, "id columns"));

// Build the descriptor and create the table.
HCatCreateTableDesc tableDesc = HCatCreateTableDesc.create(db, "testtable", cols).fileFormat("rcfile").build();
client.createTable(tableDesc);
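
The chained create(...).fileFormat(...).build() call in the usage example suggests a builder. The following is a minimal sketch of how such a builder might be structured; SketchTableDesc, its describe method, and the "textfile" default are assumptions made for illustration and are not taken from the HCatalog code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable descriptor produced by a builder.
final class SketchTableDesc {
    private final String dbName;
    private final String tableName;
    private final List<String> columns;
    private final String fileFormat;

    private SketchTableDesc(Builder b) {
        this.dbName = b.dbName;
        this.tableName = b.tableName;
        this.columns = Collections.unmodifiableList(new ArrayList<>(b.columns));
        this.fileFormat = b.fileFormat;
    }

    // Entry point: required arguments go here, optional ones on the builder.
    static Builder create(String dbName, String tableName, List<String> columns) {
        return new Builder(dbName, tableName, columns);
    }

    String describe() {
        return dbName + "." + tableName + " " + columns + " stored as " + fileFormat;
    }

    static final class Builder {
        private final String dbName;
        private final String tableName;
        private final List<String> columns;
        private String fileFormat = "textfile"; // assumed default

        private Builder(String dbName, String tableName, List<String> columns) {
            this.dbName = dbName;
            this.tableName = tableName;
            this.columns = columns;
        }

        Builder fileFormat(String format) {
            this.fileFormat = format;
            return this;
        }

        // Validation happens once, when the descriptor is built.
        SketchTableDesc build() {
            if (tableName == null || tableName.isEmpty()) {
                throw new IllegalStateException("table name is required");
            }
            return new SketchTableDesc(this);
        }
    }
}
```

With this shape the descriptor is validated at build() and cannot be modified afterwards, so the client only ever receives complete descriptors.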

Discussion Topics