...
Code Block |
---|
ANALYZE TABLE tablename [PARTITION(partcol1[=val1], partcol2[=val2], ...)] COMPUTE STATISTICS [FOR COLUMNS] [NOSCAN]; -- (Note: FOR COLUMNS is available in Hive 0.10.0 and later.) |
When issuing this command, the user may or may not specify partition specs. If no partition specs are specified, statistics are gathered for the table as well as all of its partitions (if any). If certain partition specs are specified, statistics are gathered only for those partitions. When computing statistics across all partitions, the partition columns still need to be listed in the PARTITION clause.
When the optional parameter NOSCAN is specified, the command does not scan the files, so it is expected to be fast. Instead of all statistics, it gathers only the following:
- Number of files
- Physical size in bytes
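
As a minimal sketch of the NOSCAN variant, assume a hypothetical table `page_views` partitioned by `ds` and `hr`. These names are illustrative only and do not come from this page.

```sql
-- Hypothetical table and partition names, for illustration only.
-- NOSCAN skips the file scan, so only the number of files and the
-- physical size in bytes are gathered for the matching partitions.
ANALYZE TABLE page_views PARTITION(ds='2008-04-09', hr) COMPUTE STATISTICS NOSCAN;
```

Here `ds` is given a value while `hr` is only listed, so the statement should cover every `hr` partition under that `ds` value.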
Info | ||
---|---|---|
| ||
As of Hive 0.10.0, the optional parameter FOR COLUMNS computes column statistics for all columns in the specified table (and for all partitions if the table is partitioned). To display these statistics, use DESCRIBE FORMATTED [db_name.]table_name.column_name [PARTITION (partition_spec)]. |
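
A short sketch of gathering and then displaying column statistics is shown here. It assumes a hypothetical unpartitioned table `users` with a column `user_id`, neither of which comes from this page. The DESCRIBE form follows the syntax quoted above, and the exact form accepted may differ between Hive versions.

```sql
-- Hypothetical table and column names, for illustration only.
-- Compute column statistics for all columns of the table (Hive 0.10.0 and later).
ANALYZE TABLE users COMPUTE STATISTICS FOR COLUMNS;

-- Display the statistics gathered for a single column.
DESCRIBE FORMATTED users.user_id;
```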
Examples
Suppose table Table1 has 4 partitions with the following specs:
...