...
Current state: Under Discussion
Discussion thread: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-123-DDL-and-DML-compatibility-for-Hive-connector-td39633.html
JIRA: here (<- link to https://issues.apache.org/jira/browse/FLINK-XXXX)
...
Therefore, we propose to implement DDL and DML for the Hive connector in a HiveQL-compatible way. This compatibility applies when users choose the Hive dialect. DQL is out of the scope of this FLIP and is left for the future. With compatible DDL and DML, we believe users can migrate at least some of their scripts without changing them.
...
Proposed Changes
Introduce a New Parser
...
Copy Hive’s own grammar and parser/analyzer to handle DDL and DML. While this approach brings good compatibility, it probably requires more intrusive changes and makes it harder to support all Hive versions.
Limited Scope
Features whose underlying functionality is not ready are out of the scope of this FLIP, e.g. Hive’s CREATE TABLE AS and CONCATENATE won’t be supported.
Since the Hive connector only works with the blink planner, we’ll only make sure this feature works with the blink planner. It may or may not work with the old planner.
Calcite and HiveQL have different reserved keywords, e.g. DEFAULT is a reserved keyword in Calcite and a non-reserved keyword in HiveQL. Since Calcite currently doesn't allow us to change reserved keywords, users have to backtick-quote such keywords when using them as identifiers in Flink SQL.
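As a minimal sketch of the quoting rule above: the table and column names below are hypothetical and chosen only so that one column collides with the keyword DEFAULT. A HiveQL script could declare such a column unquoted, but it would need the backticks when run through Flink.

```sql
-- Hypothetical table, for illustration only.
-- DEFAULT is reserved in Calcite, so the column name must be
-- backtick-quoted even though HiveQL treats it as non-reserved.
CREATE TABLE app_settings (
  setting_key STRING,
  `default` STRING
);
```

Scripts that already backtick-quote their identifiers should not need changes for this reason.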
The following table summarizes the DDLs that will be supported in this FLIP. Unsupported features are also listed so that we can track them and decide whether/how to support them in the future.
| | Supported | Comment | Not Supported | Comment |
|---|---|---|---|---|
| Database | CREATE | | SHOW DATABASES LIKE | Show databases filtering by a regular expression. Missing Catalog API. |
| | DROP | | | |
| | ALTER | | | |
| | USE | | | |
| | SHOW | | | |
| | DESCRIBE | We don't have a TableEnvironment API for this. Perhaps it's easier to implement when FLIP-84 is in place. | | |
| Table | CREATE | Support specifying EXTERNAL, PARTITIONED BY, ROW FORMAT, STORED AS, LOCATION and table properties. Data types will also be in HiveQL syntax, e.g. STRUCT | Bucketed tables | |
| | DROP | | CREATE LIKE | Wait for FLIP-110 |
| | ALTER | Include rename, update table properties, update SerDe properties, update fileformat and update location. | CREATE AS | Missing underlying functionalities, e.g. create the table when the job succeeds. |
| | SHOW | | Temporary tables | Missing underlying functionalities, e.g. removing the files of the temporary table when session ends. |
| | DESCRIBE | | SKEWED BY [STORED AS DIRECTORIES] | Currently we don't use the skew info of a Hive table. |
| | | | STORED BY | We don't support Hive table with a storage handler yet. |
| | | | UNION type | |
| | | | TRANSACTIONAL tables | |
| | | | DROP PURGE | Data will be deleted w/o going to trash. Applies to either a table or partitions. Missing Catalog API. |
| | | | TRUNCATE | Remove all rows from a table or partitions. Missing Catalog APIs. |
| | | | TOUCH, PROTECTION, COMPACT, CONCATENATE, UPDATE COLUMNS | Applies to either a table or partitions. Too Hive-specific or missing underlying functionalities. |
| | | | SHOW TABLES 'regex' | Show tables filtering by a regular expression. Missing Catalog API. |
| Partition | ALTER | Include add, drop, update fileformat and update location. | Exchange, Discover, Retention, Recover, (Un)Archive | Too Hive-specific or missing underlying functionalities. |
| | SHOW | Support specifying partial spec | RENAME | Update a partition's spec. Missing Catalog API. |
| | DESCRIBE | We don't have a TableEnvironment API for this. Perhaps it's easier to implement when FLIP-84 is in place. | | |
| Column | ALTER | Change name, type, position, comment for a single column. Add new columns. Replace all columns. | | |
| Function | CREATE | | CREATE FUNCTION USING FILE\|JAR… | To support this, we need to be able to dynamically add resources to a session. |
| | DROP | | RELOAD | Hive-specific |
| | SHOW | | SHOW FUNCTIONS LIKE | Show functions filtering by a regular expression. Missing Catalog API. |
| View | CREATE | Wait for FLIP-71 | SHOW VIEWS LIKE | Show views filtering by a regular expression. Missing Catalog API. |
| | DROP | Wait for FLIP-71 | | |
| | ALTER | Wait for FLIP-71 | | |
| | SHOW | Wait for FLIP-71 | | |
| | DESCRIBE | Wait for FLIP-71 | | |
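To make the supported DDL rows more concrete, here is a sketch of statements written in standard HiveQL syntax that the Hive dialect is intended to accept. The table name, columns, path and property keys are hypothetical, and the exact set of accepted clauses depends on the final implementation.

```sql
-- Hypothetical external table using the clauses listed as supported:
-- EXTERNAL, PARTITIONED BY, ROW FORMAT, STORED AS, LOCATION, TBLPROPERTIES.
CREATE EXTERNAL TABLE page_views (
  user_id BIGINT,
  url STRING,
  props STRUCT<browser:STRING, os:STRING>  -- HiveQL-style data type
)
PARTITIONED BY (dt STRING, country STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/tmp/warehouse/page_views'
TBLPROPERTIES ('demo.owner' = 'example');

-- Table-level ALTER: update table properties.
ALTER TABLE page_views SET TBLPROPERTIES ('demo.owner' = 'someone-else');

-- Column-level ALTER: change the name of a single column.
ALTER TABLE page_views CHANGE COLUMN url page_url STRING;

-- Partition-level ALTER and SHOW with a partial partition spec.
ALTER TABLE page_views ADD IF NOT EXISTS PARTITION (dt = '2020-04-01', country = 'us');
SHOW PARTITIONS page_views PARTITION (dt = '2020-04-01');
```

Statements from the Not Supported columns, such as CREATE TABLE ... AS SELECT or a CLUSTERED BY clause for bucketed tables, would have to be removed or rewritten when migrating a script.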
The following table summarizes the DMLs that will be supported in this FLIP. Unsupported features are also listed so that we can track them and decide whether/how to support them in the future.
| | Supported | Comment | Unsupported | Comment |
|---|---|---|---|---|
| DMLs | INSERT INTO/OVERWRITE PARTITION | Support specifying dynamic partition columns in the specification | Multi-insert | |
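As an illustration of the supported DML, the sketch below uses standard HiveQL syntax against two hypothetical tables: `page_views`, partitioned by `dt` and `country`, and an unpartitioned `staging_page_views` with matching columns. It is an example of the syntax, not a statement of the final behavior.

```sql
-- Static partition spec: every partition column is given a value.
INSERT OVERWRITE TABLE page_views PARTITION (dt = '2020-04-01', country = 'us')
SELECT user_id, url, props
FROM staging_page_views;

-- Dynamic partition column in the spec: country has no value here,
-- so it is taken from the last column of the SELECT list.
INSERT INTO TABLE page_views PARTITION (dt = '2020-04-01', country)
SELECT user_id, url, props, country
FROM staging_page_views;
```

Hive's multi-insert form (a single FROM clause followed by several INSERT clauses) is listed as unsupported, so such statements would need to be split into separate INSERT statements.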