...
- Creating an Iceberg table using normal create command
Code Block | ||
---|---|---|
| ||
CREATE TABLE TBL_ICE (ID INT) STORED BY ICEBERG; |
...
- Creating an Iceberg Table using CTAS
Code Block | ||
---|---|---|
| ||
CREATE TABLE TABLE_CTAS AS SELECT * FROM SRCTABLE STORED BY ICEBERG; |
...
- Creating an Iceberg Table using CTLT
Code Block | ||
---|---|---|
| ||
CREATE TABLE_CTLT LIKE SRCTABLE STORED BY ICEBERG; |
...
The iceberg tables support both v1 & v2 tables, the tables by default are created as v1 table unless explicitly specified. To create a v2 table ('format-version'='2') can be specified as table property while creating the table
Example:
Code Block | ||
---|---|---|
| ||
CREATE TABLE V2_TABLE (ID INT) STORED BY ICEBERG TBLPROPERTIES ('format-version'='2'); |
...
The iceberg table currently supports three file formats: PARQUET, ORC & AVRO. The default file format is Parquet. The file format can be explicitily provided by using STORED AS <Format> while creating the table
Example-1:
Code Block | ||
---|---|---|
| ||
CREATE TABLE ORC_TABLE (ID INT) STORED BY ICEBERG STORED AS ORC; |
The above creates a v1 iceberg table named 'ORC_TABLE' of ORC file format.
Example-2:
Code Block | ||
---|---|---|
| ||
CREATE TABLE V2_ORC_TABLE (ID INT) STORED BY ICEBERG STORED AS ORC TBLPROPERTIES ('format-version'='2'); |
...
Any Hive external table can be converted into an iceberg tables, without actually rewriting the data files again. We can use ALTER TABLE <TABLE NAME> CONVERT TO ICEBERG [TBLPROPERTIES] to convert any existing external table to an iceberg table.
Code Block | ||
---|---|---|
| ||
ALTER TABLE TABLE1 CONVERT TO ICEBERG TBLPROPERTIES ('format-version'='2'); |
...
Iceberg tables support all query statements similar to any other hive table.
Example:
Code Block | ||
---|---|---|
| ||
SELECT * FROM TBL_ICE WHERE ID > 5; |
...
Iceberg tables supports all data ingestion methods supported with hive
- Insert-Into
Code Block | ||
---|---|---|
| ||
INSERT INTO TBL_ICE VALUES (1),(2),(3),(4); |
...
- Insert-Overwrite
Code Block | ||
---|---|---|
| ||
INSERT OVERWRITE TBL_ICE SELECT * FROM TABLE1; |
- Delete
Code Block | ||
---|---|---|
| ||
DELETE FROM TBL_ICE WHERE ID=5; |
- Update
Code Block | ||
---|---|---|
| ||
UPDATE TBL_ICE WHERE ID=8 SET ID=2; |
- LOAD DATA
Code Block | ||
---|---|---|
| ||
LOAD DATA LOCAL INPATH '/data/files/doctors.avro' OVERWRITE INTO TABLE ice_avro; |
If the config “hive.load.data.use.native.api” is set to “false”, hive will rather than using the Append api, will launch a Tez job to do the LOAD.
Metadata tables:
Hive supports the following metadata tables for Iceberg tables
...
Iceberg tables supports branches & tags, the details around the feature can be read here
Snapshot Management: todo
- Expire Snapshot
- Rollback Snapshot
- Set Current Snapshot
...