...

It will be more user friendly. In addition, the Hive dialect already has some support for CTAS.

Proposed Changes

Creating a table must go through a catalog. The in-memory catalog does not support CTAS, so an external catalog is required.


I suggest introducing a LIKE clause with the following syntax:

...

  1. Create the sink table in the catalog based on the schema of the query result.
  2. Start the job and write the result to a temporary directory.
  3. If the job executes successfully, load the data into the sink table.
  4. If the job execution fails, drop the sink table. (This capability requires runtime module support, such as a hook, and SQL must pass the relevant parameters to the runtime module.)
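The four steps above can be sketched in plain Java as follows. This is only an illustration of the control flow, not the proposed implementation: `SketchCatalog`, `CtasFlowSketch`, and the staging path are hypothetical names introduced here, and the job is modeled as a simple `Callable`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

// Hypothetical stand-in for an external catalog; this is NOT Flink's Catalog interface.
final class SketchCatalog {
    final List<String> tables = new ArrayList<>();
    final List<String> loaded = new ArrayList<>();

    void createTable(String name) { tables.add(name); }

    void dropTable(String name) {
        tables.remove(name);
        loaded.remove(name);
    }

    // Stands in for moving staged files into the table location.
    void loadData(String name, String stagingDir) { loaded.add(name); }
}

final class CtasFlowSketch {
    /** Runs the four CTAS steps and returns true if the job succeeded. */
    static boolean run(SketchCatalog catalog, String table, Callable<Boolean> job) {
        // Step 1: create the sink table from the query schema (schema omitted here).
        catalog.createTable(table);
        // Step 2: the job writes to a temporary directory (hypothetical path).
        String stagingDir = "/tmp/ctas-staging/" + table;
        boolean succeeded;
        try {
            succeeded = job.call();
        } catch (Exception e) {
            succeeded = false;
        }
        if (succeeded) {
            // Step 3: on success, load the staged data into the sink table.
            catalog.loadData(table, stagingDir);
        } else {
            // Step 4: on failure, drop the sink table created in step 1.
            catalog.dropTable(table);
        }
        return succeeded;
    }
}
```

In the sketch, a failing job leaves the catalog as it was before step 1, which is the behavior step 4 aims for.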

Dropping the table if the job fails requires some additional support:

Code Changes

parserImpls.ftl

Add syntax support.

...

ContextResolvedTable

Add CTAS flag

@Internal
public final class ContextResolvedTable {

...

  • TableSink needs to provide a CleanUp API that developers implement as needed. By default it does nothing. If an exception occurs, this API can be used to drop the table, delete the temporary directory, and so on.

Precautions

When the table needs to be dropped:

  1. The user manually cancels the job.
  2. The job reaches the final FAILED status, for example after exceeding the maximum number of task failovers.

Drop table and TableSink are strongly bound:

The framework does not perform drop table operations itself. Dropping the table is implemented in the TableSink, according to the needs of the specific TableSink.
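A minimal sketch of what such a sink-side cleanup hook could look like, assuming a default no-op method that individual sinks override. The names `JobEndState`, `CleanableSink`, and `DroppingSink` are hypothetical and are not existing Flink interfaces; the drop itself is reduced to a boolean flag.

```java
// Hypothetical terminal states the framework could report to the sink.
enum JobEndState { FINISHED, CANCELED, FAILED }

// Hypothetical cleanup contract; does nothing unless a sink overrides it.
interface CleanableSink {
    /** Called once when the job reaches a terminal state. */
    default void cleanUp(JobEndState state) {
        // Default: no cleanup.
    }
}

// Example sink that drops its table when the job is canceled or fails.
final class DroppingSink implements CleanableSink {
    // Stands in for the table's presence in an external catalog.
    boolean tableExists = true;

    @Override
    public void cleanUp(JobEndState state) {
        if (state == JobEndState.CANCELED || state == JobEndState.FAILED) {
            // A real sink would call its catalog here and could also
            // delete its temporary directory.
            tableExists = false;
        }
    }
}
```

Keeping the decision inside the sink means the framework only has to report the terminal state; whether anything is dropped remains sink-specific.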

...

    private ContextResolvedTable(
            ObjectIdentifier objectIdentifier,
            @Nullable Catalog catalog,
            ResolvedCatalogBaseTable<?> resolvedTable,
            boolean anonymous,
            boolean isCTAS) {
        this.objectIdentifier = Preconditions.checkNotNull(objectIdentifier);
        this.catalog = catalog;
        this.resolvedTable = Preconditions.checkNotNull(resolvedTable);
        this.anonymous = anonymous;
        this.isCTAS = isCTAS;
    }

    public boolean isCTAS() {
        return isCTAS;
    }

}

When initializing TableSink, we can distinguish whether it is a CTAS operation.
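As an illustration of how a sink might use the flag at initialization time, the sketch below picks a staging location for CTAS writes and the table location otherwise. `ResolvedTableStub` is a hypothetical stand-in that exposes only `isCTAS()`, and the `/.staging` suffix is an assumption made for this example, not part of the proposal.

```java
// Hypothetical stand-in exposing only the CTAS flag of ContextResolvedTable.
final class ResolvedTableStub {
    private final boolean isCTAS;

    ResolvedTableStub(boolean isCTAS) {
        this.isCTAS = isCTAS;
    }

    boolean isCTAS() {
        return isCTAS;
    }
}

final class SinkInitSketch {
    /**
     * Chooses where the sink writes: a staging directory for CTAS,
     * so that data can be loaded only after the job succeeds,
     * or the table path itself for a regular INSERT.
     */
    static String chooseWritePath(ResolvedTableStub table, String tablePath) {
        return table.isCTAS() ? tablePath + "/.staging" : tablePath;
    }
}
```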

Support in Table API

The executeSql method will be reused.

...

  1. Support SELECT clause in CREATE TABLE(CTAS)
  2. MySQL CTAS syntax
  3. Microsoft Azure Synapse CTAS
  4. LanguageManual DDL#Create/Drop/ReloadFunction
  5. Spark Create Table Syntax

...