...
Alter table statements enable you to change the structure of an existing table. You can add columns or partitions, change the SerDe, add table and SerDe properties, or rename the table itself. Similarly, alter table partition statements allow you to change the properties of a specific partition in the named table.
Alter Table
Rename Table
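A minimal sketch of a rename, assuming a hypothetical managed table named page_view (both table names are illustrative, not taken from the original text):

```sql
-- Rename a table; page_view and page_view_2008 are hypothetical names.
ALTER TABLE page_view RENAME TO page_view_2008;
```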
This statement lets you change the name of a table to a different name. As of version 0.6, a rename on a managed table moves its HDFS location as well. (Older Hive versions just renamed the table in the metastore without moving the HDFS location.)
Alter Table Properties
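As an illustration of setting table properties, the sketch below attaches a user-defined property to a hypothetical table. The property name 'notes' and its value are assumptions chosen for the example.

```sql
-- Attach a user-defined property; the key and value are illustrative.
ALTER TABLE page_view SET TBLPROPERTIES ('notes' = 'reloaded from raw logs');

-- Inspect the table metadata, including its properties.
DESCRIBE EXTENDED page_view;
```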
You can use this statement to add your own metadata to tables. Currently the last_modified_user and last_modified_time properties are automatically added and managed by Hive. Users can add their own properties to this list. You can run DESCRIBE EXTENDED on the table to get this information.
Alter Table Comment
To change the comment of a table, you have to change the comment property in TBLPROPERTIES.
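A short sketch of changing a table comment through its properties; the table name and comment text are hypothetical:

```sql
-- The table comment is stored as the 'comment' table property.
ALTER TABLE page_view SET TBLPROPERTIES ('comment' = 'Page views, one row per request');
```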
Add SerDe Properties
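The two statement forms discussed in this section can be sketched as follows. The table name is hypothetical, and LazySimpleSerDe with the 'field.delim' property is used only as a familiar example of a SerDe and one of its properties.

```sql
-- Change the table's SerDe and set a SerDe property in one statement.
ALTER TABLE page_view
  SET SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
  WITH SERDEPROPERTIES ('field.delim' = ',');

-- Add or change a SerDe property without changing the SerDe itself.
ALTER TABLE page_view SET SERDEPROPERTIES ('field.delim' = '\t');
```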
These statements enable you to change a table's SerDe or add user-defined metadata to the table's SerDe object. The SerDe properties are passed to the table's SerDe when Hive initializes it to serialize and deserialize data, so users can store any information required for their custom SerDe here. Refer to the SerDe documentation and Hive SerDe in the Developer Guide for more information, and see Row Format, Storage Format, and SerDe above for details about setting a table's SerDe and SERDEPROPERTIES in a CREATE TABLE statement. Note that both property_name and property_value must be quoted.
Alter Table Storage Properties
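A sketch of changing bucketing and sort metadata, assuming a hypothetical page_view table with userid and viewtime columns. The bucket count is arbitrary.

```sql
-- Changes the declared bucketing and sort order in the metastore only;
-- existing files are not rewritten.
ALTER TABLE page_view
  CLUSTERED BY (userid) SORTED BY (viewtime) INTO 32 BUCKETS;
```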
These statements change the table's physical storage properties. NOTE: These commands will only modify Hive's metadata, and will NOT reorganize or reformat existing data. Users should make sure the actual data layout conforms with the metadata definition.
Alter Table Skewed or Stored as Directories
A table's SKEWED and STORED AS DIRECTORIES options can be changed with ALTER TABLE statements. See Skewed Tables above for the corresponding CREATE TABLE syntax.
Alter Table Skewed
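As a sketch, the statement below marks a hypothetical table as skewed on one column and enables list bucketing. The table name, column name, and skewed values are assumptions.

```sql
-- Declare 'us' and 'in' as skewed values of the country column and
-- store them in their own subdirectories (list bucketing).
ALTER TABLE skewed_events
  SKEWED BY (country) ON ('us', 'in')
  STORED AS DIRECTORIES;
```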
The STORED AS DIRECTORIES option determines whether a skewed table uses the list bucketing feature, which creates subdirectories for skewed values.
Alter Table Not Skewed
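A one-line sketch of the NOT SKEWED form, using the same hypothetical table name:

```sql
-- Make the table non-skewed; this also turns off list bucketing.
ALTER TABLE skewed_events NOT SKEWED;
```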
The NOT SKEWED option makes the table non-skewed and turns off the list bucketing feature (since a list-bucketing table is always skewed). This affects partitions created after the ALTER statement, but has no effect on partitions created before the ALTER statement.
Alter Table Not Stored as Directories
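A one-line sketch of this form, again with a hypothetical table name:

```sql
-- Turn off list bucketing while keeping the table's skew definition.
ALTER TABLE skewed_events NOT STORED AS DIRECTORIES;
```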
This turns off the list bucketing feature, although the table remains skewed.
Alter Table Set Skewed Location
This changes the location map for list bucketing.
Additional Alter Table Statements
See Alter Either Table or Partition below for more DDL statements that alter tables.
Alter Partition
Partitions can be added, renamed, exchanged (moved), dropped, or (un)archived by using the PARTITION clause in an ALTER TABLE statement, as described below. To make the metastore aware of partitions that were added directly to HDFS, you can use the metastore check command (MSCK) or, on Amazon EMR, the RECOVER PARTITIONS option of ALTER TABLE. See Alter Either Table or Partition below for more ways to alter partitions.
Add Partitions
You can use ALTER TABLE ADD PARTITION to add partitions to a table. Partition values should be quoted only if they are strings. The location must be a directory inside of which data files reside. (ADD PARTITION changes the table metadata, but does not load data. If the data does not exist in the partition's location, queries will not return any results.) An error is thrown if the partition_spec for the table already exists. You can use IF NOT EXISTS to skip the error.
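A minimal sketch of adding one partition, assuming a hypothetical page_view table partitioned by dt and country. The location path is a placeholder.

```sql
-- Register one partition; IF NOT EXISTS skips the error if it is
-- already present. String partition values are quoted.
ALTER TABLE page_view ADD IF NOT EXISTS
  PARTITION (dt='2008-08-08', country='us')
  LOCATION '/path/to/us/part080808';
```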
In Hive 0.7, a single ALTER TABLE statement that adds more than one partition fails silently and without error: all queries go only to the first partition added (dt='2008-08-08' in the original example), no matter which partition you specify.
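The kind of statement this passage refers to can be sketched as follows. The page_view table, its partition columns, and the paths are assumptions made for illustration.

```sql
-- Two partitions added in one statement. This misbehaves silently in
-- Hive 0.7 and works as expected in Hive 0.8 and later.
ALTER TABLE page_view ADD
  PARTITION (dt='2008-08-08', country='us') LOCATION '/path/to/us/part080808'
  PARTITION (dt='2008-08-09', country='us') LOCATION '/path/to/us/part080809';
```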
In Hive 0.8 and later, you can add multiple partitions in a single ALTER TABLE statement as shown in the previous example. In Hive 0.7, if you want to add many partitions you should use the following form:
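A sketch of the one-partition-per-statement form, using the same hypothetical table and placeholder paths:

```sql
-- One ALTER TABLE statement per partition.
ALTER TABLE page_view ADD PARTITION (dt='2008-08-08', country='us')
  LOCATION '/path/to/us/part080808';
ALTER TABLE page_view ADD PARTITION (dt='2008-08-09', country='us')
  LOCATION '/path/to/us/part080809';
```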
Dynamic Partitions
Partitions can be added to a table dynamically, using a Hive INSERT statement (or a Pig STORE statement). See the Hive documentation on dynamic partitions for details and examples.
Rename Partition
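A sketch of renaming a partition, assuming a hypothetical page_view table partitioned by dt and country:

```sql
-- Change the dt value of an existing partition.
ALTER TABLE page_view PARTITION (dt='2008-08-08', country='us')
  RENAME TO PARTITION (dt='2008-08-09', country='us');
```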
This statement lets you change the value of a partition column. One use case is normalizing a legacy partition column value so that it conforms to its type. In this case, type conversion and normalization are not applied to the column values in the old partition_spec, even when the property hive.typecheck.on.insert is set to true (the default), which allows you to specify any legacy data as a string in the old partition_spec.
Exchange Partition
Partitions can be exchanged (moved) between tables.
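As a sketch, the statement below moves one partition between two hypothetical tables with identical schemas. The table named after ALTER TABLE is the destination; the table named after WITH TABLE is the source.

```sql
-- Move the partition from page_view (source) into
-- page_view_archive (destination).
ALTER TABLE page_view_archive
  EXCHANGE PARTITION (dt='2008-08-08', country='us')
  WITH TABLE page_view;
```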
This statement lets you move the data in a partition from one table to another table that has the same schema and does not already have that partition.
Recover Partitions (MSCK REPAIR TABLE)
Hive stores a list of partitions for each table in its metastore. If, however, new partitions are added directly to HDFS rather than through Hive, the metastore is not aware of them. Users can run a metastore check command with the repair table option, MSCK REPAIR TABLE, which adds metadata to the Hive metastore for partitions that do not already have it. In other words, it adds any partitions that exist on HDFS but not in the metastore. See HIVE-874 for more details. On Amazon Elastic MapReduce (EMR)'s version of Hive, the equivalent is the RECOVER PARTITIONS option of ALTER TABLE.
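Both commands can be sketched as follows; page_view is a hypothetical table name, and the second statement applies only to the Amazon EMR version of Hive.

```sql
-- Apache Hive: add metastore entries for partition directories
-- that exist on HDFS but are missing from the metastore.
MSCK REPAIR TABLE page_view;

-- Amazon EMR's version of Hive: the equivalent command.
ALTER TABLE page_view RECOVER PARTITIONS;
```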
Starting with Hive 1.3, MSCK will throw exceptions if directories with disallowed characters in partition values are found on HDFS. Use the hive.msck.path.validation setting on the client to alter this behavior: "skip" simply skips the directories, while "ignore" tries to create the partitions anyway (the old behavior), which may or may not work.
Drop Partitions
You can use ALTER TABLE DROP PARTITION to drop a partition for a table. This removes the data and metadata for this partition. The data is actually moved to the .Trash/Current directory if Trash is configured, unless PURGE is specified, but the metadata is completely lost (see Drop Table above).
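A minimal sketch of dropping a partition, with a hypothetical table and partition values:

```sql
-- Drop one partition; IF EXISTS avoids an error if it is absent.
ALTER TABLE page_view DROP IF EXISTS
  PARTITION (dt='2008-08-08', country='us');
```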
For tables that are protected by NO_DROP CASCADE, you can use the predicate IGNORE PROTECTION to drop a specified partition or set of partitions (for example, when splitting a table between two Hadoop clusters):
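A sketch of such a statement, assuming a hypothetical page_view table protected with NO_DROP CASCADE:

```sql
-- Drop a partition even though the table is protected.
ALTER TABLE page_view DROP IF EXISTS
  PARTITION (dt='2008-08-08', country='us')
  IGNORE PROTECTION;
```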
A DROP PARTITION command with IGNORE PROTECTION drops that partition regardless of its protection status.
If PURGE is specified, the partition data does not go to the .Trash/Current directory and so cannot be retrieved in the event of a mistaken DROP:
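A sketch of the PURGE form, again with hypothetical names:

```sql
-- Delete the partition data immediately, bypassing .Trash/Current.
ALTER TABLE page_view DROP IF EXISTS
  PARTITION (dt='2008-08-08', country='us')
  PURGE;
```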
The purge option can also be specified with the table property auto.purge (see Create Table above). In Hive 0.7.0 or later, DROP returns an error if the partition doesn't exist, unless IF EXISTS is specified or the configuration variable hive.exec.drop.ignorenonexistent is set to true.
(Un)Archive Partition
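The two statements in this section can be sketched as follows, assuming a hypothetical page_view table. Any configuration needed to enable archiving is covered in the archiving documentation and is not shown here.

```sql
-- Move the partition's files into a Hadoop Archive (HAR).
ALTER TABLE page_view ARCHIVE PARTITION (dt='2008-08-08', country='us');

-- Restore the partition's files from the archive.
ALTER TABLE page_view UNARCHIVE PARTITION (dt='2008-08-08', country='us');
```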
Archiving is a feature that moves a partition's files into a Hadoop Archive (HAR). Note that only the file count will be reduced; HAR does not provide any compression. See LanguageManual Archiving for more information.
Alter Either Table or Partition
Alter Table/Partition File Format
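A sketch of changing the file format at the table level and at the partition level. The table name, partition values, and the choice of ORC and RCFILE are illustrative.

```sql
-- Change the declared file format of the whole table.
ALTER TABLE page_view SET FILEFORMAT ORC;

-- Change the declared file format of a single partition.
ALTER TABLE page_view PARTITION (dt='2008-08-08', country='us')
  SET FILEFORMAT RCFILE;
```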
This statement changes the table's (or partition's) file format. For available file_format options, see the section above on CREATE TABLE.
Alter Table/Partition Location
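A sketch of changing a location, with a hypothetical table and placeholder HDFS URIs:

```sql
-- Point the table at a new location.
ALTER TABLE page_view
  SET LOCATION 'hdfs://namenode/path/to/page_view';

-- Point a single partition at a new location.
ALTER TABLE page_view PARTITION (dt='2008-08-08', country='us')
  SET LOCATION 'hdfs://namenode/path/to/us/part080808';
```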
Alter Table/Partition Touch
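A sketch of TOUCH on a table and on one partition; the names are hypothetical:

```sql
-- Re-write the table's metadata so that pre/post execute hooks fire.
ALTER TABLE page_view TOUCH;

-- The same for a single, already existing partition.
ALTER TABLE page_view TOUCH PARTITION (dt='2008-08-08', country='us');
```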
TOUCH reads the metadata and writes it back. This has the effect of causing the pre/post execute hooks to fire. An example use case is a hook that logs all the tables and partitions that were modified, combined with an external script that alters the files on HDFS directly. Since the script modifies files outside of Hive, the modification wouldn't be logged by the hook. The external script could call TOUCH to fire the hook and mark the table or partition as modified. TOUCH may also be useful later if reliable last-modified times are incorporated, since it would update that time as well. Note that TOUCH doesn't create a table or partition if it doesn't already exist. (See Create Table.)
Alter Table/Partition Protections
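The protection modes discussed in this section can be sketched as follows, using a hypothetical page_view table:

```sql
-- Prevent the table from being dropped.
ALTER TABLE page_view ENABLE NO_DROP;

-- Also prevent its partitions from being dropped.
ALTER TABLE page_view ENABLE NO_DROP CASCADE;

-- Take one partition offline: its data cannot be queried,
-- but its metadata stays accessible.
ALTER TABLE page_view PARTITION (dt='2008-08-08', country='us')
  ENABLE OFFLINE;

-- Remove a protection again.
ALTER TABLE page_view DISABLE NO_DROP;
```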
Protection on data can be set at either the table or partition level. Enabling NO_DROP prevents a table from being dropped. Enabling OFFLINE prevents the data in a table or partition from being queried, but the metadata can still be accessed. If any partition in a table has NO_DROP enabled, the table cannot be dropped either. Conversely, if a table has NO_DROP enabled then partitions may be dropped, but with NO_DROP CASCADE partitions cannot be dropped either unless the drop partition command specifies IGNORE PROTECTION.
Alter Table/Partition Compact
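A sketch of requesting a compaction manually, assuming a hypothetical transactional table named page_view:

```sql
-- Enqueue a major compaction for one partition; the statement
-- returns once the request is queued.
ALTER TABLE page_view PARTITION (dt='2008-08-08', country='us')
  COMPACT 'major';

-- Watch the progress of queued and running compactions.
SHOW COMPACTIONS;
```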
In general you do not need to request compactions when Hive transactions are being used, because the system will detect the need for them and initiate the compaction. However, if compaction is turned off for a table or you want to compact the table at a time the system would not choose to, ALTER TABLE can initiate the compaction. The statement will enqueue a request for compaction and return. To watch the progress of the compaction, use SHOW COMPACTIONS. The compaction_type can be MAJOR or MINOR. See the Basic Design section in Hive Transactions for more information.
Alter Table/Partition Concatenate
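A sketch of the concatenate command, assuming a hypothetical page_view table stored as RCFile or ORC:

```sql
-- Merge the small files of one partition into larger files.
ALTER TABLE page_view PARTITION (dt='2008-08-08', country='us')
  CONCATENATE;
```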
If the table or partition contains many small RCFiles or ORC files, then the above command will merge them into larger files. In the case of RCFile the merge happens at block level, whereas for ORC files the merge happens at stripe level, thereby avoiding the overhead of decompressing and decoding the data.
Alter Column
Rules for Column Names
Column names are case insensitive.
Change Column Name/Type/Position/Comment
This command will allow users to change a column's name, data type, comment, or position, or an arbitrary combination of them. The PARTITION clause is available in Hive 0.14.0 and later; see Upgrading Pre-Hive 0.13.0 Decimal Columns for usage. A patch for Hive 0.13 is also available (see HIVE-7971). The CASCADE|RESTRICT clause is available in Hive 0.15.0. ALTER TABLE CHANGE COLUMN with CASCADE command changes the columns of a table's metadata, and cascades the same change to all the partition metadata. RESTRICT is the default, limiting column change only to table metadata.
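A sketch of renaming, retyping, and repositioning columns. The table test_change and its columns are hypothetical.

```sql
CREATE TABLE test_change (a INT, b INT, c INT);

-- Rename column a to a1, keeping its type.
ALTER TABLE test_change CHANGE a a1 INT;

-- Rename a1 to a2, change its type to STRING, and move it after b.
ALTER TABLE test_change CHANGE a1 a2 STRING AFTER b;

-- Rename c to c1 and make it the first column.
ALTER TABLE test_change CHANGE c c1 INT FIRST;
```

Like the storage statements earlier in this document, these change metadata only, so the data files must already match the new layout.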
Add/Replace Columns
ADD COLUMNS lets you add new columns to the end of the existing columns but before the partition columns. This is supported for Avro backed tables as well, for Hive 0.14 and later.
REPLACE COLUMNS removes all existing columns and adds the new set of columns. This can be done only for tables with a native SerDe (DynamicSerDe, MetadataTypedColumnsetSerDe, LazySimpleSerDe and ColumnarSerDe). Refer to Hive SerDe for more information. REPLACE COLUMNS can also be used to drop columns, by listing only the columns you want to keep.
The PARTITION clause is available in Hive 0.14.0 and later; see Upgrading Pre-Hive 0.13.0 Decimal Columns for usage.
The CASCADE|RESTRICT clause is available in Hive 0.15.0. ALTER TABLE ADD|REPLACE COLUMNS with the CASCADE option changes the columns of a table's metadata and cascades the same change to all the partition metadata. RESTRICT is the default, limiting column changes to table metadata only.
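A sketch of both forms, using a hypothetical table named test_columns:

```sql
CREATE TABLE test_columns (a INT, b INT, c INT);

-- Append a new column after the existing non-partition columns.
ALTER TABLE test_columns ADD COLUMNS (d STRING COMMENT 'added later');

-- Replace the column list; c and d are dropped from the schema.
ALTER TABLE test_columns REPLACE COLUMNS (a INT, b INT);

-- Add a column and cascade the change to all partition metadata
-- (CASCADE needs a Hive version that supports the clause).
ALTER TABLE test_columns ADD COLUMNS (e STRING) CASCADE;
```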
Partial Partition Specification
As of Hive 0.14 (HIVE-8411), users are able to provide a partial partition spec for certain of the alter column statements above, similar to dynamic partitioning. So rather than having to issue an alter column statement for each partition that needs to be changed, you can change many existing partitions at once using a single ALTER statement with a partial partition specification.
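The contrast can be sketched as follows, assuming a hypothetical table foo partitioned by ds and hr with a decimal column named dec_column_name:

```sql
-- Required before partial partition specs can be used with ALTER.
SET hive.exec.dynamic.partition = true;

-- One statement per partition:
ALTER TABLE foo PARTITION (ds='2008-04-08', hr=11)
  CHANGE COLUMN dec_column_name dec_column_name DECIMAL(38,18);
ALTER TABLE foo PARTITION (ds='2008-04-08', hr=12)
  CHANGE COLUMN dec_column_name dec_column_name DECIMAL(38,18);

-- A single statement with a partial partition spec, which applies
-- the change to all existing partitions matching (ds, hr):
ALTER TABLE foo PARTITION (ds, hr)
  CHANGE COLUMN dec_column_name dec_column_name DECIMAL(38,18);
```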
Similar to dynamic partitioning, hive.exec.dynamic.partition must be set to true to enable use of partial partition specs during ALTER PARTITION. This is supported for the following operations:
...