Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: fixed links

...

Use STORED AS TEXTFILE if the data needs to be stored as plain text files. Use STORED AS SEQUENCEFILE if the data needs to be compressed. Please read more about Hive- CompressedStorage if you are planning to keep data compressed in your Hive tables. Use INPUTFORMAT and OUTPUTFORMAT to specify the name of a corresponding InputFormat and OutputFormat class as a string literal, e.g. 'org.apache.hadoop.hive.contrib.fileformat.base64.Base64TextInputFormat'.

Use STORED BY to create a non-native table, for example in HBase. See Hive- StorageHandlers for more information on this option.

...

There are two parts in CTAS, the SELECT part can be any SELECT statement supported by HiveQL. The CREATE part of the CTAS takes the resulting schema from the SELECT part and creates the target table with other table properties such as the SerDe and storage format. The only restrictions in CTAS is that the target table cannot be a partitioned table (nor can it be an external table).

...

The CLUSTERED BY and SORTED BY creation commands do not affect how data is inserted into a table – only how it is read. This means that users must be careful to insert data correctly by specifying the number of reducers to be equal to the number of buckets, and using CLUSTER BY and SORT BY commands in their query.

There is also an example of creating and populating bucketed tables.

Drop Table

Code Block
DROP TABLE [IF EXISTS] table_name

...

Archiving is a feature to moves a partition's files into a Hadoop Archive (HAR). Note that only the file count will be reduced; HAR does not provide any compression. See Hive-LanguageManual - Archiving for more information

Create/Drop View

...