Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Table of Content Zone
maxLevel4
locationtop
typelist

Describe Database

Info
titleVersion information

As of Hive 0.7.

Code Block
DESCRIBE DATABASE [EXTENDED] db_name;
DESCRIBE SCHEMA [EXTENDED] db_name;     -- (Note: Hive 0.15.0 and later)

DESCRIBE DATABASE shows the name of the database, its comment (if one has been set), and its root location on the filesystem. The uses of SCHEMA and DATABASE are interchangeable – they mean the same thing. DESCRIBE SCHEMA is added in Hive 0.15 (HIVE-8803).

EXTENDED also shows the database properties.

Describe Table/View/Column

There are two formats for the describe table/view/column syntax, depending on whether or not the database is specified.

If the database is not specified, the optional column information is provided after a dot:

Code Block
DESCRIBE [EXTENDED|FORMATTED]  
  table_name[.col_name ( [.field_name] | [.'$elem$'] | [.'$key$'] | [.'$value$'] )* ];
                                        -- (Note: Hive 1.x.x and 0.x.x only. See "Hive 2.0+: New Syntax" below)

If the database is specified, the optional column information is provided after a space:

Code Block
DESCRIBE [EXTENDED|FORMATTED]  
  [db_name.]table_name[ col_name ( [.field_name] | [.'$elem$'] | [.'$key$'] | [.'$value$'] )* ];
                                        -- (Note: Hive 1.x.x and 0.x.x only. See "Hive 2.0+: New Syntax" below)

DESCRIBE shows the list of columns including partition columns for the given table. If the EXTENDED keyword is specified then it will show all the metadata for the table in Thrift serialized form. This is generally only useful for debugging and not for general use. If the FORMATTED keyword is specified, then it will show the metadata in a tabular format.

Note: DESCRIBE EXTENDED shows the number of rows only if statistics were gathered when the data was loaded (see Newly Created Tables), and if the Hive CLI is used instead of a Thrift client or Beeline. HIVE-6285 will address this issue. Although ANALYZE TABLE gathers statistics after the data has been loaded (see Existing Tables), it does not currently provide information about the number of rows.

If a table has a complex column then you can examine the attributes of this column by specifying table_name.complex_col_name (and field_name for an element of a struct, '$elem$' for array element, '$key$' for map key, and '$value$' for map value). You can specify this recursively to explore the complex column type.

For a view, DESCRIBE EXTENDED or FORMATTED can be used to retrieve the view's definition. Two relevant attributes are provided: both the original view definition as specified by the user, and an expanded definition used internally by Hive.

Info
titleVersion information — partition & non-partition columns

In Hive 0.10.0 and earlier, no distinction is made between partition columns and non-partition columns while displaying columns for DESCRIBE TABLE. From Hive 0.12.0 onwards, they are displayed separately.

In Hive 0.13.0 and later, the configuration parameter hive.display.partition.cols.separately lets you use the old behavior, if desired (HIVE-6689). For an example, see the test case in the patch for HIVE-6689.

Info
titleBug fixed in Hive 0.10.0 — database qualifiers

Database qualifiers for table names were introduced in Hive 0.7.0, but they were broken for DESCRIBE until a bug fix in Hive 0.10.0 (HIVE-1977).

Info
titleBug fixed in Hive 0.13.0 — quoted identifiers

Prior to Hive 0.13.0 DESCRIBE did not accept backticks (`) surrounding table identifiers, so DESCRIBE could not be used for tables with names that matched reserved keywords (HIVE-2949 and HIVE-6187). As of 0.13.0, all identifiers specified within backticks are treated literally when the configuration parameter hive.support.quoted.identifiers has its default value of "column" (HIVE-6013). The only exception is that double backticks (``) represent a single backtick character.

Display Column Statistics

Info
titleVersion information

As of Hive 0.14.0; see HIVE-7050 and HIVE-7051. (The FOR COLUMNS option of ANALYZE TABLE is available as of Hive 0.10.0.)

ANALYZE TABLE table_name COMPUTE STATISTICS FOR COLUMNS will compute column statistics for all columns in the specified table (and for all partitions if the table is partitioned). To view the gathered column statistics, the following statements can be used:

Code Block
DESCRIBE FORMATTED [db_name.]table_name column_name;                              -- (Note: Hive 0.14.0 and later)
DESCRIBE FORMATTED [db_name.]table_name column_name PARTITION (partition_spec);   -- (Note: Hive 0.14.0 to 1.x.x)
                                                                                  -- (see "Hive 2.0+: New Syntax" below)

See Statistics in Hive: Existing Tables for more information about the ANALYZE TABLE command.

Describe Partition

There are two formats for the describe partition syntax, depending on whether or not the database is specified.

If the database is not specified, the optional column information is provided after a dot:

Code Block
DESCRIBE [EXTENDED|FORMATTED] table_name[.column_name] PARTITION partition_spec;
                                        -- (Note: Hive 1.x.x and 0.x.x only. See "Hive 2.0+: New Syntax" below)

If the database is specified, the optional column information is provided after a space:

Code Block
DESCRIBE [EXTENDED|FORMATTED] [db_name.]table_name [column_name] PARTITION partition_spec;
                                        -- (Note: Hive 1.x.x and 0.x.x only. See "Hive 2.0+: New Syntax" below)

This statement lists metadata for a given partition. The output is similar to that of DESCRIBE table_name. Presently, the column information associated with a particular partition is not used while preparing plans. As of Hive 1.2 (HIVE-10307), the partition column values specified in partition_spec are type validated, converted and normalized to their column types when hive.typecheck.on.insert is set to true (default). These values can be number literals.

Code Block
languagesql
titleExample:
hive> show partitions part_table;
OK
d=abc


hive> DESCRIBE extended part_table partition (d='abc');
OK
i                       int                                         
d                       string                                      
                 
# Partition Information          
# col_name              data_type               comment             
                 
d                       string                                      
                 
Detailed Partition Information  Partition(values:[abc], dbName:default, tableName:part_table, createTime:1459382234, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:i, type:int, comment:null), FieldSchema(name:d, type:string, comment:null)], location:file:/tmp/warehouse/part_table/d=abc, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), parameters:{numFiles=1, COLUMN_STATS_ACCURATE=true, transient_lastDdlTime=1459382234, numRows=1, totalSize=2, rawDataSize=1})   
Time taken: 0.325 seconds, Fetched: 9 row(s)


hive> DESCRIBE formatted part_table partition (d='abc');
OK
# col_name              data_type               comment             
                 
i                       int                                         
                 
# Partition Information          
# col_name              data_type               comment             
                 
d                       string                                      
                 
# Detailed Partition Information                 
Partition Value:        [abc]                    
Database:               default                  
Table:                  part_table               
CreateTime:             Wed Mar 30 16:57:14 PDT 2016     
LastAccessTime:         UNKNOWN                  
Protect Mode:           None                     
Location:               file:/tmp/warehouse/part_table/d=abc     
Partition Parameters:            
        COLUMN_STATS_ACCURATE   true                
        numFiles                1                   
        numRows                 1                   
        rawDataSize             1                   
        totalSize               2                   
        transient_lastDdlTime   1459382234          
                 
# Storage Information            
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe       
InputFormat:            org.apache.hadoop.mapred.TextInputFormat         
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat       
Compressed:             No                       
Num Buckets:            -1                       
Bucket Columns:         []                       
Sort Columns:           []                       
Storage Desc Params:             
        serialization.format    1                   
Time taken: 0.334 seconds, Fetched: 35 row(s)


Hive 2.0+: Syntax Change

Info
titleHive 2.0+: New syntax
In Hive 2.0 release onward, the describe table command has a syntax change which is backward incompatible. See HIVE-12184 for details.
Code Block
DESCRIBE [EXTENDED | FORMATTED]
    [db_name.]table_name [PARTITION partition_spec] [col_name ( [.field_name] | [.'$elem$'] | [.'$key$'] | [.'$value$'] )* ];

Warning: The new syntax could break current scripts.

  • It no longer accepts DOT separated table_name and column_name. They would have to be SPACE-separated. DB and TABLENAME are DOT-separated. column_name can still contain DOTs for complex datatypes.
  • Optional partition_spec has to appear after the table_name but prior to the optional column_name. In the previous syntax, column_name appears in between table_name and partition_spec.
Code Block
languagesql
titleExamples:
DESCRIBE FORMATTED default.src_table PARTITION (part_col = 100) columnA;
DESCRIBE default.src_thrift lintString.$elem$.myint;

Abort

Table of Content Zone
maxLevel4
locationtop
typelist

Abort Transactions

Info
titleVersion information

As of Hive 2.1.

Code Block
ABORT TRANSACTIONS TransactionID [, ...];

ABORT TRANSACTIONS cleans up the specified transaction ids from the Hive metastore so that users do not need to interact with metastore directly in order to remove the dangling or failed transactions. ABORT TRANSACTIONS is added in Hive 2.1 (HIVE-12634).

Code Block
languagesql
titleExamples:
ABORT TRANSACTIONS 0000007, 0000008, 0000010, 0000015;

This command can be used together with SHOW TRANSACTIONS. The latter can help figure out the candidate transaction ids to be cleaned up.

HCatalog and WebHCat DDL

For information about DDL in HCatalog and WebHCat, see:

...