All proposed and accepted APE proposals should be put under this page. They should also follow the template below:
Status: Proposed
dev@asterixdb.a.o thread | (permalink to discuss thread)
---|---
JIRA issue(s) |
Release target | (release number)
Members
- Hari Kishore Chaparala
- Ian Maxon
- Mike Carey
...
- A table format specification
- A set of APIs and libraries for engines to interact with tables following that specification
- Iceberg tables are stored in 3 layers:
- Catalog – A collection of tables that provides a consistent, standardized way to manage them. The catalog stores pointers to tables’ metadata file locations and supports atomic operations to update those pointers. Some implementations: Hive, Nessie, and Hadoop.
- Metadata –
- Metadata file: contains table schema, snapshot information, partition spec, table stats, etc.
- Manifest List: list of manifest files that make up a snapshot.
- Manifest files: track data files and stats about each file.
- Data – Table data is stored in multiple data files in Parquet, Avro, or ORC format. This is called the data layer.
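For a file system table, these layers map onto a directory layout roughly like the sketch below (file names are illustrative; exact naming varies by Iceberg version):

my-table/
  metadata/
    v2.metadata.json         <- metadata file (schema, snapshots, partition spec)
    snap-4890828...avro      <- manifest list for one snapshot
    5c8bd6a2...-m0.avro      <- manifest file tracking data files and stats
  data/
    00000-0-....parquet      <- data files (Parquet here)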
Planned Iceberg support:
Feature | AsterixDB Support | Iceberg availability | Comments
---|---|---|---
Iceberg format version | Version 1 | Iceberg currently has 2 format versions: 1 & 2. | Format version 2 adds row-level deletes tracked in delete files. Currently, AsterixDB parallelizes external data file reading using Hyracks workers across all nodes. Delete files can be large, so the ideal flow is two stages: process the delete files in parallel first, then process the data files in parallel with a filter on the collated delete-file data. This two-step process requires changes to the query execution flow and is hence skipped for the first cut.
Datafile format | Parquet | The Iceberg spec supports Parquet, Avro, and ORC. | We currently only support Parquet file processing.
Table types | File system tables | File system tables and metastore tables. | Initially we plan to support file system tables, but it is possible to add support for metastore tables using the interfaces provided by the Iceberg SDK. Ideally, though, we would want AsterixDB itself to act as the metastore in the long run.
File systems | S3 and HDFS | Localfs, HDFS, S3, and any file system supported by the Hadoop library. | AsterixDB supports Parquet file processing on S3 and HDFS.
CRUD operations | Read | Iceberg allows all CRUD operations, including schema and partition evolution. | Complete CRUD support could be achieved with the Iceberg SDK, which would let us take control of updating the Iceberg metadata files. The current use case, however, is Iceberg tables already managed by other engines, with AsterixDB used as a read-only query engine, so we are restricting ourselves to read support. Ideally, we would want AsterixDB to act as a metastore for Iceberg tables and perform all CRUD operations using SQL++; that can be picked up as part of the initiative to support updates on external data. AsterixDB is currently read-only on all external datasets.
Time travel | Only the latest snapshot is queried | Iceberg allows time-travel queries. | Every query reads the latest Iceberg snapshot. Supporting time travel would require new SQL++ keywords, e.g. SELECT * FROM XYZ AS OF 2023-01-01.
Proposed changes
Interface:
...
CREATE EXTERNAL DATASET IcebergDataset(IcebergTableType) USING S3 (
("accessKeyId"="-----"),
("secretAccessKey"="--------"),
("region"="us-east-1"),
("serviceEndpoint"="https://s3.amazonaws.com/"),
("container"="my-bucket"),
("definition"="my-table"),
("table-format"="apache-iceberg"),
("format"="parquet")
);
CREATE EXTERNAL DATASET IcebergDataset(IcebergTableType) USING hdfs ...
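A filled-in HDFS variant might look like the following sketch; the hdfs, path, and input-format parameters mirror the existing HDFS adapter options and are assumptions here, while table-format is the new parameter:

CREATE EXTERNAL DATASET IcebergDataset(IcebergTableType) USING hdfs (
("hdfs"="hdfs://localhost:8020"),
("path"="/warehouse/my-table"),
("input-format"="parquet-input-format"),
("table-format"="apache-iceberg"),
("format"="parquet")
);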
Changes
During the compilation phase of queries over Iceberg external datasets, an adapter is created and configured with a path parameter (a comma-separated string of data file locations). We modify this path parameter so that it points at the data files of the latest Iceberg table snapshot. This happens in the ExternalDataUtils.prepare call, which is common to all external data adapters.
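As a minimal sketch of that step, assuming a file system table and the Iceberg HadoopTables API (the class below and its names are illustrative, not the actual AsterixDB code):

import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.FileScanTask;
import org.apache.iceberg.Table;
import org.apache.iceberg.hadoop.HadoopTables;
import org.apache.iceberg.io.CloseableIterable;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public final class IcebergPathResolver {
    /** Collects the data file locations of the table's current (latest) snapshot. */
    public static List<String> latestSnapshotFiles(Configuration conf, String tableLocation)
            throws IOException {
        // Load a file system table directly from its location (no metastore catalog).
        Table table = new HadoopTables(conf).load(tableLocation);
        List<String> paths = new ArrayList<>();
        // A scan over the current snapshot yields one FileScanTask per data file.
        try (CloseableIterable<FileScanTask> tasks = table.newScan().planFiles()) {
            for (FileScanTask task : tasks) {
                paths.add(task.file().path().toString());
            }
        }
        return paths;
    }
}

The resulting list can then be joined with commas (String.join(",", paths)) and substituted for the adapter's path parameter.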
...
Add the dependency below to the asterix-external-data module (compile scope) and the asterix-app module (test scope); the version is shown as a property placeholder:

<!-- https://mvnrepository.com/artifact/org.apache.iceberg/iceberg-core -->
<dependency>
    <groupId>org.apache.iceberg</groupId>
    <artifactId>iceberg-core</artifactId>
    <version>${iceberg.version}</version>
</dependency>
Compatibility/Migration
The proposed changes don’t modify any existing flow or behavior and can be seen as an extension.
...
Iceberg-core uses a newer version of Apache Avro that conflicts with kite-sdk, a test-scope dependency we currently use to infer Avro schemas from JSON data files. Kite-sdk was last updated in 2015, so we plan to remove it as an AsterixDB dependency and build our own schema-inference utility, with kite-sdk as the reference.
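A minimal sketch of what such a utility could look like, assuming Jackson for JSON parsing and Avro's SchemaBuilder; the class name and the single-record simplification are ours, and a real replacement would merge schemas across records and handle null/union types the way kite-sdk does:

import com.fasterxml.jackson.databind.JsonNode;
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

import java.util.Iterator;
import java.util.Map;

public final class JsonSchemaInference {
    /** Infers a simplified Avro schema from a single JSON value. */
    public static Schema infer(JsonNode node, String recordName) {
        if (node.isTextual()) return Schema.create(Schema.Type.STRING);
        if (node.isIntegralNumber()) return Schema.create(Schema.Type.LONG);
        if (node.isFloatingPointNumber()) return Schema.create(Schema.Type.DOUBLE);
        if (node.isBoolean()) return Schema.create(Schema.Type.BOOLEAN);
        if (node.isArray()) {
            // Assume homogeneous arrays; infer the element type from the first element.
            Schema element = node.size() > 0
                    ? infer(node.get(0), recordName + "_item")
                    : Schema.create(Schema.Type.STRING);
            return Schema.createArray(element);
        }
        if (node.isObject()) {
            // Build a record schema with one field per JSON object key.
            SchemaBuilder.FieldAssembler<Schema> fields = SchemaBuilder.record(recordName).fields();
            Iterator<Map.Entry<String, JsonNode>> it = node.fields();
            while (it.hasNext()) {
                Map.Entry<String, JsonNode> e = it.next();
                fields = fields.name(e.getKey()).type(infer(e.getValue(), e.getKey())).noDefault();
            }
            return fields.endRecord();
        }
        return Schema.create(Schema.Type.NULL);
    }
}

Usage would be along the lines of infer(new ObjectMapper().readTree(json), "root").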
Testing
E2E tests on the HDFS and S3 external adapters for:
- Reading Iceberg data
- Reading from the latest snapshot
- Reading when the table is empty
- Reading when there are multiple data files
Negative cases:
- Should throw an appropriate error on Iceberg format version 2
- Should throw an appropriate error on an invalid metadata path
- Should throw an appropriate error if the Iceberg table is a metastore table
- Should throw an appropriate error if the data file format is not Parquet
Performance:
Measure the performance difference with and without Iceberg when querying large files.
...