Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • CREATE DATABASE/SCHEMA, TABLE, VIEW, FUNCTION, INDEX
  • DROP DATABASE/SCHEMA, TABLE, VIEW, INDEX
  • TRUNCATE TABLE
  • ALTER DATABASE/SCHEMA, TABLE, VIEW
  • MSCK REPAIR TABLE (or ALTER TABLE RECOVER PARTITIONS)
  • SHOW DATABASES/SCHEMAS, TABLES, TBLPROPERTIES, VIEWS, PARTITIONS, FUNCTIONS, INDEX[ES], COLUMNS, CREATE TABLE
  • DESCRIBE DATABASE/SCHEMA, table_name, view_name, materialized_view_name

PARTITION statements are usually options of TABLE statements, except for SHOW PARTITIONS.

...

Table of Content Zone
locationtop
typelist


Info
titleVersion information

View support is only available in Hive 0.6 and later.

Create View

Code Block
CREATE VIEW [IF NOT EXISTS] [db_name.]view_name [(column_name [COMMENT column_comment], ...) ]
  [COMMENT view_comment]
  [TBLPROPERTIES (property_name = property_value, ...)]
  AS SELECT ...;

CREATE VIEW creates a view with the given name. An error is thrown if a table or view with the same name already exists. You can use IF NOT EXISTS to skip the error.

If no column names are supplied, the names of the view's columns will be derived automatically from the defining SELECT expression. (If the SELECT contains unaliased scalar expressions such as x+y, the resulting view column names will be generated in the form _C0, _C1, etc.) When renaming columns, column comments can also optionally be supplied. (Comments are not automatically inherited from underlying columns.)

A CREATE VIEW statement will fail if the view's defining SELECT expression is invalid.

Note that a view is a purely logical object with no associated storage (except for materialized views, which Hive supports starting in release 2.3.0. ) When a query references a view, the view's definition is evaluated in order to produce a set of rows for further processing by the query. (This is a conceptual description; in fact, as part of query optimization, Hive may combine the view's definition with the query's, e.g. pushing filters from the query down into the view.)

A view's schema is frozen at the time the view is created; subsequent changes to underlying tables (e.g. adding a column) will not be reflected in the view's schema. If an underlying table is dropped or changed in an incompatible fashion, subsequent attempts to query the invalid view will fail.

Views are read-only and may not be used as the target of LOAD/INSERT/ALTER. For changing metadata, see ALTER VIEW.

A view may contain ORDER BY and LIMIT clauses. If a referencing query also contains these clauses, the query-level clauses are evaluated after the view clauses (and after any other operations in the query). For example, if a view specifies LIMIT 5, and a referencing query is executed as (select * from v LIMIT 10), then at most 5 rows will be returned.

Starting with Hive 0.13.0, the view's select statement can include one or more common table expressions (CTEs) as shown in the SELECT syntax. For examples of CTEs in CREATE VIEW statements, see Common Table Expression.

Code Block
titleExample:
CREATE VIEW onion_referrers(url COMMENT 'URL of Referring page')
  COMMENT 'Referrers to The Onion website'
  AS
  SELECT DISTINCT referrer_url
  FROM page_view
  WHERE page_url='http://www.theonion.com';

Use SHOW CREATE TABLE to display the CREATE VIEW statement that created a view. As of Hive 2.2.0, SHOW VIEWS displays a list of views in a database.

Info
titleVersion Information

Originally, the file format for views was hard coded as SequenceFile. Hive 2.1.0 (HIVE-13736) made views follow the same defaults as tables and indexes using the hive.default.fileformat and hive.default.fileformat.managed properties.

Drop View

Code Block
DROP VIEW [IF EXISTS] [db_name.]view_name;

DROP VIEW removes metadata for the specified view. (It is illegal to use DROP TABLE on a view.)

When dropping a view referenced by other views, no warning is given (the dependent views are left dangling as invalid and must be dropped or recreated by the user).

In Hive 0.7.0 or later, DROP returns an error if the view doesn't exist, unless IF EXISTS is specified or the configuration variable hive.exec.drop.ignorenonexistent is set to true.

Code Block
titleExample:
DROP VIEW onion_referrers;

Alter View Properties

Code Block
ALTER VIEW [db_name.]view_name SET TBLPROPERTIES table_properties;

table_properties:
  : (property_name = property_value, property_name = property_value, ...)

As with ALTER TABLE, you can use this statement to add your own metadata to a view.

Alter View As Select

Info
titleVersion information

As of Hive 0.11.


Code Block
ALTER VIEW [db_name.]view_name AS select_statement;

Alter View As Select changes the definition of a view, which must exist. The syntax is similar to that for CREATE VIEW and the effect is the same as for CREATE OR REPLACE VIEW.

Note: The view must already exist, and if the view has partitions, it could not be replaced by Alter View As Select.

Create/Drop/

...

Alter Materialized View

This section provides an introduction to Hive materialized views syntax. More information about materialized view support and usage in Hive can be found here.

Table of Content Zone
locationtop
typelist


Info
titleVersion information

As of Hive 0.7.

Indexing Is Removed since 3.0! See Indexes design document

This section provides a brief introduction to Hive indexes, which are documented more fully here:

Materialized view support is only available in Hive 3.0 and later.

Create Materialized View

Code Block
CREATE MATERIALIZED VIEW [IF NOT EXISTS] [db_name.]materialized_view_name
  [DISABLE REWRITE]
  [COMMENT materialized_view_comment]
  [PARTITIONED ON (col_name, ...)]
  [CLUSTERED ON (col_name, ...) | DISTRIBUTED ON (col_name, ...) SORTED ON (col_name, ...)]
  [
    [ROW FORMAT row_format]
    [STORED AS file_format]
      | STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
  ]
  [LOCATION hdfs_path]
  [TBLPROPERTIES (property_name=property_value, ...)]
AS SELECT ...;

CREATE MATERIALIZED VIEW creates a view with the given name. An error is thrown if a table, view or materialized view with the same name already exists. You can use IF NOT EXISTS to skip the error.

The names of the materialized view's columns will be derived automatically from the defining SELECT expression.

A CREATE MATERIALIZED VIEW statement will fail if the view's defining SELECT expression is invalid.

Drop Materialized View

Code Block
DROP MATERIALIZED VIEW [db_name.]materialized_view_name;

DROP MATERIALIZED VIEW removes metadata and data for this materialized view.

Alter Materialized View

Once a materialized view has been created, the optimizer will be able to exploit its definition semantics to automatically rewrite incoming queries using materialized views, and hence, accelerate query execution. 

Users can selectively enable/disable materialized views for rewriting. Recall that, by default, materialized views are enabled for rewriting at creation time. To alter that behavior, the following statement can be used:

Code Block
ALTER MATERIALIZED VIEW [db_name.]materialized_view_name ENABLE|DISABLE REWRITE;


Create/Drop/Alter Index

Info
titleVersion information

As of Hive 0.7.

Indexing Is Removed since 3.0! See Indexes design document

This section provides a brief introduction to Hive indexes, which are documented more fully here:

In Hive 0.12.0 and earlier releases, the index name is case-sensitive for CREATE In Hive 0.12.0 and earlier releases, the index name is case-sensitive for CREATE INDEX and DROP INDEX statements. However, ALTER INDEX requires an index name that was created with lowercase letters (see HIVE-2752). This bug is fixed in Hive 0.13.0 by making index names case-insensitive for all HiveQL statements. For releases prior to 0.13.0, the best practice is to use lowercase letters for all index names.

...

Table of Content Zone
maxLevel5
locationtop
typelist

These statements provide a way to query the Hive metastore for existing data and metadata accessible to this Hive system.

Show Databases

Code Block
SHOW (DATABASES|SCHEMAS) [LIKE 'identifier_with_wildcards'];

SHOW DATABASES or SHOW SCHEMAS lists all of the databases defined in the metastore. The uses of SCHEMAS and DATABASES are interchangeable – they mean the same thing.

The optional LIKE clause allows the list of databases to be filtered using a regular expression. Wildcards in the regular expression can only be '*' for any character(s) or '|' for a choice. Examples are 'employees', 'emp*', 'emp*|*ees', all of which will match the database named 'employees'.

Anchor
Show Tables/Partitions/Indexes
Show Tables/Partitions/Indexes

Show Tables/Views/Materialized Views/Partitions/Indexes

Show Tables

Code Block
SHOW TABLES [IN database_name] ['identifier_with_wildcards'];

SHOW TABLES lists all the base tables and views in the current database (or the one explicitly named using the IN clause) with names matching the optional regular expression. Wildcards in the regular expression can only be '*' for any character(s) or '|' for a choice. Examples are 'page_view', 'page_v*', '*view|page*', all which will match the 'page_view' table. Matching tables are listed in alphabetical order. It is not an error if there are no matching tables found in metastore. If no regular expression is given then all tables in the selected database are listed.

Show Views

Info
titleVersion information

Introduced in Hive 2.2.0 via HIVE-14558.


Code Block
SHOW VIEWS [IN/FROM database_name] [LIKE 'pattern_with_wildcards'];

SHOW VIEWS lists all the views in the current database (or the one explicitly named using the IN or FROM clause) with names matching the optional regular expression. Wildcards in the regular expression can only be '*' for any character(s) or '|' for a choice. Examples are 'page_view', 'page_v*', '*view|page*', all which will match the 'page_view' view. Matching views are listed in alphabetical order. It is not an error if no matching views are found in metastore. If no regular expression is given then all views in the selected database are listed.

Code Block
languagesql
titleExamples
SHOW VIEWS;                                -- show all views in the current database
SHOW VIEWS 'test_*';                       -- show all views that start with "test_"
SHOW VIEWS '*view2';                       -- show all views that end in "view2"
SHOW VIEWS LIKE 'test_view1|test_view2';
SHOW VIEWS LIKE 'test_view1|test_view2';   -- show views named either "test_view1" or "test_view2"
SHOW VIEWS FROM test1;                     -- show views namedfrom either "test_view1" or "test_view2"database test1
SHOW VIEWS FROMIN test1;                       -- show views from database test1 (FROM and IN are same) 
SHOW VIEWS IN test1 "test_*";              -- show views from database test2 that start with "test_"

Show Materialized Views

Code Block
SHOW MATERIALIZED VIEWS [IN/FROM database_name]  -- show views from database test1 (FROM and IN are same) 
SHOW VIEWS IN test1 "test_*";              -- show views from database test2 that start with "test_"[LIKE 'pattern_with_wildcards’];

SHOW MATERIALIZED VIEWS lists all the views in the current database (or the one explicitly named using the IN or FROM clause) with names matching the optional regular expression. It also shows additional information about the materialized view, e.g., whether rewriting is enabled, and the refresh mode for the materialized view. Wildcards in the regular expression can only be '*' for any character(s) or '|' for a choice. If no regular expression is given then all materialized views in the selected database are listed.

Show Partitions

Code Block
SHOW PARTITIONS table_name;

SHOW PARTITIONS lists all the existing partitions for a given base table. Partitions are listed in alphabetical order.

Info
titleVersion information

As of Hive 0.6, SHOW PARTITIONS can filter the list of partitions as shown below.

It is also possible to specify parts of a partition specification to filter the resulting list.

Code Block
titleExamples:
SHOW PARTITIONS table_name PARTITION(ds='2010-03-03');            -- (Note: Hive 0.6 and later)
SHOW PARTITIONS table_name PARTITION(hr='12');                    -- (Note: Hive 0.6 and later)
SHOW PARTITIONS table_name PARTITION(ds='2010-03-03', hr='12');   -- (Note: Hive 0.6 and later)


Info
titleVersion information

Starting with Hive 0.13.0, SHOW PARTITIONS can specify a database (HIVE-5912).


Code Block
SHOW PARTITIONS [db_name.]table_name [PARTITION(partition_spec)];   -- (Note: Hive 0.13.0 and later)


Code Block
titleExample:
SHOW PARTITIONS databaseFoo.tableBar PARTITION(ds='2010-03-03', hr='12');   -- (Note: Hive 0.13.0 and later)

Show Table/Partition Extended

Code Block
SHOW TABLE EXTENDED [IN|FROM database_name] LIKE 'identifier_with_wildcards' [PARTITION(partition_spec)];

SHOW TABLE EXTENDED will list information for all tables matching the given regular expression. Users cannot use regular expression for table name if a partition specification is present. This command's output includes basic table information and file system information like totalNumberFiles, totalFileSize, maxFileSize, minFileSize,lastAccessTime, and lastUpdateTime. If partition is present, it will output the given partition's file system information instead of table's file system information.


Code Block
languagesql
titleExample
hive> show table extended like part_table;
OK
tableName:part_table
owner:thejas
location:file:/tmp/warehouse/part_table
inputformat:org.apache.hadoop.mapred.TextInputFormat
outputformat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
columns:struct columns { i32 i}
partitioned:true
partitionColumns:struct partition_columns { string d}
totalNumberFiles:1
totalFileSize:2
maxFileSize:2
minFileSize:2
lastAccessTime:0
lastUpdateTime:1459382233000

Show Table Properties

Info
titleVersion information

As of Hive 0.10.0.


Code Block
SHOW TBLPROPERTIES tblname;
SHOW TBLPROPERTIES tblname("foo");

The first form lists all of the table properties for the table in question, one per row separated by tabs. The second form of the command prints only the value for the property that's being asked for.

For more information, see the TBLPROPERTIES clause in Create Table above.

Show Create Table

Info
titleVersion information

As of Hive 0.10.


Code Block
SHOW CREATE TABLE ([db_name.]table_name|view_name);

SHOW CREATE TABLE shows the CREATE TABLE statement that creates a given table, or the CREATE VIEW statement that creates a given view.

Show Indexes

Info
titleVersion information

As of Hive 0.7.

Indexing Is Removed since 3.0! See Indexes design document


Code Block
SHOW [FORMATTED] (INDEX|INDEXES) ON table_with_index [(FROM|IN) db_name];

SHOW INDEXES shows all of the indexes on a certain column, as well as information about them: index name, table name, names of the columns used as keys, index table name, index type, and comment. If the FORMATTED keyword is used, then column titles are printed for each column.

Show Columns

Info
titleVersion information

As of Hive 0.10.


Code Block
SHOW COLUMNS (FROM|IN) table_name [(FROM|IN) db_name];

SHOW COLUMNS shows all the columns in a table including partition columns.


Info
titleVersion information
SHOW COLUMNS (FROM|IN) table_name [(FROM|IN) db_name]  [ LIKE 'pattern_with_wildcards'];

Added in Hive 3.0 by HIVE-18373.

SHOW COLUMNS lists all the columns in the table with names matching the optional regular expression. Wildcards in the regular expression can only be '*' for any character(s) or '|' for a choice. Examples are 'cola', 'col*', '*a|col*', all which will match the 'cola' column. Matching columns are listed in alphabetical order. It is not an error if no matching columns are found in table. If no regular expression is given then all columns in the selected table are listed.


Code Block
languagesql
titleExamples
-- SHOW COLUMNS
CREATE DATABASE test_db;
USE test_db;
CREATE TABLE foo(col1 INT, col2 INT, col3 INT, cola INT, colb INT, colc INT, a INT, b INT, c INT);
 
-- SHOW COLUMNS basic syntax
SHOW COLUMNS FROM foo;                            -- show all column in foo
SHOW COLUMNS FROM foo "*";                        -- show all column in foo
SHOW COLUMNS IN foo "col*";                       -- show columns in foo starting with "col"                 OUTPUT col1,col2,col3,cola,colb,colc
SHOW COLUMNS FROM foo '*c';                       -- show columns in foo ending with "c"                     OUTPUT c,colc
SHOW COLUMNS FROM foo LIKE "col1|cola";           -- show columns in foo either col1 or cola                 OUTPUT col1,cola
SHOW COLUMNS FROM foo FROM test_db LIKE 'col*';   -- show columns in foo starting with "col"                 OUTPUT col1,col2,col3,cola,colb,colc
SHOW COLUMNS IN foo IN test_db LIKE 'col*';       -- show columns in foo starting with "col" (FROM/IN same)  OUTPUT col1,col2,col3,cola,colb,colc
 
-- Non existing column pattern resulting in no match
SHOW COLUMNS IN foo "nomatch*";
SHOW COLUMNS IN foo "col+";                       -- + wildcard not supported
SHOW COLUMNS IN foo "nomatch";


Show Functions

Code Block
SHOW FUNCTIONS [LIKE "<pattern>"];

SHOW FUNCTIONS lists all the user defined and builtin functions, filtered by the the regular expression if specified with LIKE.

Show Granted Roles and Privileges

Hive deprecated authorization mode / Legacy Mode has information about these SHOW statements:

In Hive 0.13.0 and later releases, SQL standard based authorization has these SHOW statements:

Show Locks

Code Block
languagetext
SHOW LOCKS <table_name>;
SHOW LOCKS <table_name> EXTENDED;
SHOW LOCKS <table_name> PARTITION (<partition_spec>);
SHOW LOCKS <table_name> PARTITION (<partition_spec>) EXTENDED;
SHOW LOCKS (DATABASE|SCHEMA) database_name;     -- (Note: Hive 0.13.0 and later; SCHEMA added in Hive 0.14.0)

SHOW LOCKS displays the locks on a table or partition. See Hive Concurrency Model for information about locks.

SHOW LOCKS (DATABASE|SCHEMA) is supported from Hive 0.13 for DATABASE (see HIVE-2093) and Hive 0.14 for SCHEMA (see HIVE-6601). SCHEMA and DATABASE are interchangeable – they mean the same thing.

When Hive transactions are being used, SHOW LOCKS returns this information (see HIVE-6460):

  • database name
  • table name
  • partition name (if the table is partitioned)
  • the state the lock is in, which can be:
    • "acquired" – the requestor holds the lock
    • "waiting" – the requestor is waiting for the lock
    • "aborted" – the lock has timed out but has not yet been cleaned up
  • Id of the lock blocking this one, if this lock is in "waiting" state
  • the type of lock, which can be:
    • "exclusive" – no one else can hold the lock at the same time (obtained mostly by DDL operations such as drop table)
    • "shared_read" – any number of other shared_read locks can lock the same resource at the same time (obtained by reads; confusingly, an insert operation also obtains a shared_read lock)
    • "shared_write" – any number of shared_read locks can lock the same resource at the same time, but no other shared_write locks are allowed (obtained by update and delete)
  • ID of the transaction this lock is associated with, if there is one
  • last time the holder of this lock sent a heartbeat indicating it was still alive
  • the time the lock was acquired, if it has been acquired
  • Hive user who requested the lock
  • host the user is running on
  • agent info a string that helps identify the entity that issued the lock request. For a SQL client this is the query ID, for streaming client it may be Storm bolt ID for example.

Show Conf

Info
titleVersion information

As of Hive 0.14.0.


Code Block
languagetext
SHOW CONF <configuration_name>;

SHOW CONF returns a description of the specified configuration property.

  • default value
  • required type
  • description

Note that SHOW CONF does not show the current value of a configuration property. For current property settings, use the "set" command in the CLI or a HiveQL script (see Commands) or in Beeline (see Beeline Hive Commands).

Show Transactions

Info
titleVersion information

As of Hive 0.13.0 (see Hive Transactions).


Code Block
languagetext
SHOW TRANSACTIONS;

SHOW TRANSACTIONS is for use by administrators when Hive transactions are being used. It returns a list of all currently open and aborted transactions in the system, including this information:

  • transaction ID
  • transaction state
  • user who started the transaction
  • machine where the transaction was started
  • timestamp when the transaction was started (as of Hive 2.2.0)
  • timestamp for last heartbeat (as of Hive 2.2.0 )

Show Compactions

Info
titleVersion information

As of Hive 0.13.0 (see Hive Transactions).


Code Block
languagetext
SHOW COMPACTIONS;

SHOW COMPACTIONS returns a list of all tables and partitions currently being compacted or scheduled for compaction when Hive transactions are being used, including this information:

  • "CompactionId" - unique internal id (As of Hive 3.0)
  • "Database" - Hive database name
  • "Table" - table name
  • "Partition" - partition name (if the table is partitioned)
  • "Type" - whether it is a major or minor compaction
  • "State" - the state the compaction is in, which can be:
    • "initiated" – waiting in the queue to be compacted
    • "working" – being compacted
    • "ready for cleaning" – the compaction has been done and the old files are scheduled to be cleaned
    • "failed"the job failed. The metastore log will have more detail.
    • "succeeded"A-ok
    • "attempted"initiator attempted to schedule a compaction but failed. The metastore log will have more information.
  • "Worker" - thread ID of the worker thread doing the compaction (only if in working state)
  • "Start Time" - the time at which the compaction started (only if in working or ready for cleaning state)
  • "Duration(ms)" - time this compaction took (As of Hive 2.2 )
  • "HadoopJobId" - Id of the submitted Hadoop job (As of Hive 2.2)

Compactions are initiated automatically, but can also be initiated manually with an ALTER TABLE COMPACT statement.

...

Table of Content Zone
maxLevel4
locationtop
typelist

Describe Database

Info
titleVersion information

As of Hive 0.7.


Code Block
DESCRIBE DATABASE [EXTENDED] db_name;
DESCRIBE SCHEMA [EXTENDED] db_name;     -- (Note: Hive 1.1.0 and later)

DESCRIBE DATABASE shows the name of the database, its comment (if one has been set), and its root location on the filesystem. The uses of SCHEMA and DATABASE are interchangeable – they mean the same thing. DESCRIBE SCHEMA is added in Hive 1.1.0 (HIVE-8803).

EXTENDED also shows the database properties.

Describe Table/View/Materialized View/Column

There are two formats for the describe table/view/materialized view/column syntax, depending on whether or not the database is specified.

If the database is not specified, the optional column information is provided after a dot:

Code Block
DESCRIBE [EXTENDED|FORMATTED]  
  table_name[.col_name ( [.field_name] | [.'$elem$'] | [.'$key$'] | [.'$value$'] )* ];
                                        -- (Note: Hive 1.x.x and 0.x.x only. See "Hive 2.0+: New Syntax" below)

If the database is specified, the optional column information is provided after a space:

Code Block
DESCRIBE [EXTENDED|FORMATTED]  
  [db_name.]table_name[ col_name ( [.field_name] | [.'$elem$'] | [.'$key$'] | [.'$value$'] )* ];
                                        -- (Note: Hive 1.x.x and 0.x.x only. See "Hive 2.0+: New Syntax" below)

DESCRIBE shows the list of columns including partition columns for the given table. If the EXTENDED keyword is specified then it will show all the metadata for the table in Thrift serialized form. This is generally only useful for debugging and not for general use. If the FORMATTED keyword is specified, then it will show the metadata in a tabular format.

Note: DESCRIBE EXTENDED shows the number of rows only if statistics were gathered when the data was loaded (see Newly Created Tables), and if the Hive CLI is used instead of a Thrift client or Beeline. HIVE-6285 will address this issue. Although ANALYZE TABLE gathers statistics after the data has been loaded (see Existing Tables), it does not currently provide information about the number of rows.

If a table has a complex column then you can examine the attributes of this column by specifying table_name.complex_col_name (and field_name for an element of a struct, '$elem$' for array element, '$key$' for map key, and '$value$' for map value). You can specify this recursively to explore the complex column type.

For a view, DESCRIBE EXTENDED or FORMATTED can be used to retrieve the view's definition. Two relevant attributes are provided: both the original view definition as specified by the user, and an expanded definition used internally by Hive.

For materialized views, DESCRIBE EXTENDED or FORMATTED provides additional information on whether rewriting is enabled and whether the given materialized view is considered to be up-to-date for automatic rewriting with respect to the data in the source tables that it uses.

Info
titleVersion information — partition & non-partition columns

In Hive 0.10.0 and earlier, no distinction is made between partition columns and non-partition columns while displaying columns for DESCRIBE TABLE. From Hive 0.12.0 onwards, they are displayed separately.

In Hive 0.13.0 and later, the configuration parameter hive.display.partition.cols.separately lets you use the old behavior, if desired (HIVE-6689). For an example, see the test case in the patch for HIVE-6689.


Info
titleBug fixed in Hive 0.10.0 — database qualifiers

Database qualifiers for table names were introduced in Hive 0.7.0, but they were broken for DESCRIBE until a bug fix in Hive 0.10.0 (HIVE-1977).


Info
titleBug fixed in Hive 0.13.0 — quoted identifiers

Prior to Hive 0.13.0 DESCRIBE did not accept backticks (`) surrounding table identifiers, so DESCRIBE could not be used for tables with names that matched reserved keywords (HIVE-2949 and HIVE-6187). As of 0.13.0, all identifiers specified within backticks are treated literally when the configuration parameter hive.support.quoted.identifiers has its default value of "column" (HIVE-6013). The only exception is that double backticks (``) represent a single backtick character.

Display Column Statistics

Info
titleVersion information

As of Hive 0.14.0; see HIVE-7050 and HIVE-7051. (The FOR COLUMNS option of ANALYZE TABLE is available as of Hive 0.10.0.)

ANALYZE TABLE table_name COMPUTE STATISTICS FOR COLUMNS will compute column statistics for all columns in the specified table (and for all partitions if the table is partitioned). To view the gathered column statistics, the following statements can be used:

Code Block
DESCRIBE FORMATTED [db_name.]table_name column_name;                              -- (Note: Hive 0.14.0 and later)
DESCRIBE FORMATTED [db_name.]table_name column_name PARTITION (partition_spec);   -- (Note: Hive 0.14.0 to 1.x.x)
                                                                                  -- (see "Hive 2.0+: New Syntax" below)

See Statistics in Hive: Existing Tables for more information about the ANALYZE TABLE command.

Describe Partition

There are two formats for the describe partition syntax, depending on whether or not the database is specified.

If the database is not specified, the optional column information is provided after a dot:

Code Block
DESCRIBE [EXTENDED|FORMATTED] table_name[.column_name] PARTITION partition_spec;
                                        -- (Note: Hive 1.x.x and 0.x.x only. See "Hive 2.0+: New Syntax" below)

If the database is specified, the optional column information is provided after a space:

Code Block
DESCRIBE [EXTENDED|FORMATTED] [db_name.]table_name [column_name] PARTITION partition_spec;
                                        -- (Note: Hive 1.x.x and 0.x.x only. See "Hive 2.0+: New Syntax" below)

This statement lists metadata for a given partition. The output is similar to that of DESCRIBE table_name. Presently, the column information associated with a particular partition is not used while preparing plans. As of Hive 1.2 (HIVE-10307), the partition column values specified in partition_spec are type validated, converted and normalized to their column types when hive.typecheck.on.insert is set to true (default). These values can be number literals.

Code Block
languagesql
titleExample:
hive> show partitions part_table;
OK
d=abc


hive> DESCRIBE extended part_table partition (d='abc');
OK
i                       int                                         
d                       string                                      
                 
# Partition Information          
# col_name              data_type               comment             
                 
d                       string                                      
                 
Detailed Partition Information  Partition(values:[abc], dbName:default, tableName:part_table, createTime:1459382234, lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:i, type:int, comment:null), FieldSchema(name:d, type:string, comment:null)], location:file:/tmp/warehouse/part_table/d=abc, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), parameters:{numFiles=1, COLUMN_STATS_ACCURATE=true, transient_lastDdlTime=1459382234, numRows=1, totalSize=2, rawDataSize=1})   
Time taken: 0.325 seconds, Fetched: 9 row(s)


hive> DESCRIBE formatted part_table partition (d='abc');
OK
# col_name              data_type               comment             
                 
i                       int                                         
                 
# Partition Information          
# col_name              data_type               comment             
                 
d                       string                                      
                 
# Detailed Partition Information                 
Partition Value:        [abc]                    
Database:               default                  
Table:                  part_table               
CreateTime:             Wed Mar 30 16:57:14 PDT 2016     
LastAccessTime:         UNKNOWN                  
Protect Mode:           None                     
Location:               file:/tmp/warehouse/part_table/d=abc     
Partition Parameters:            
        COLUMN_STATS_ACCURATE   true                
        numFiles                1                   
        numRows                 1                   
        rawDataSize             1                   
        totalSize               2                   
        transient_lastDdlTime   1459382234          
                 
# Storage Information            
SerDe Library:          org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe       
InputFormat:            org.apache.hadoop.mapred.TextInputFormat         
OutputFormat:           org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat       
Compressed:             No                       
Num Buckets:            -1                       
Bucket Columns:         []                       
Sort Columns:           []                       
Storage Desc Params:             
        serialization.format    1                   
Time taken: 0.334 seconds, Fetched: 35 row(s)


Hive 2.0+: Syntax Change

Info
titleHive 2.0+: New syntax
In Hive 2.0 release onward, the describe table command has a syntax change which is backward incompatible. See HIVE-12184 for details.


Code Block
DESCRIBE [EXTENDED | FORMATTED]
    [db_name.]table_name [PARTITION partition_spec] [col_name ( [.field_name] | [.'$elem$'] | [.'$key$'] | [.'$value$'] )* ];

Warning: The new syntax could break current scripts.

  • It no longer accepts DOT separated table_name and column_name. They would have to be SPACE-separated. DB and TABLENAME are DOT-separated. column_name can still contain DOTs for complex datatypes.
  • Optional partition_spec has to appear after the table_name but prior to the optional column_name. In the previous syntax, column_name appears in between table_name and partition_spec.
Code Block
languagesql
titleExamples:
DESCRIBE FORMATTED default.src_table PARTITION (part_col = 100) columnA;
DESCRIBE default.src_thrift lintString.$elem$.myint;


...