...

Public Interfaces


Introduce the getTwoPhaseCommitCreateTable API for Catalog.

Code Block
languagejava
@PublicEvolving
public interface Catalog {

    /**
     * Create a {@link TwoPhaseCommitCatalogTable} that provides a transaction abstraction.
     * TwoPhaseCommitCatalogTable will be combined with {@link JobStatusHook} to achieve atomicity
     * support in the Flink framework. The default implementation returns empty, indicating that
     * atomic operations are not supported; the non-atomic implementation is then used.
     *
     * <p>The framework will make sure to call this method with fully validated {@link
     * ResolvedCatalogTable}.
     *
     * @param tablePath path of the table to be created
     * @param table the table definition
     * @param ignoreIfExists flag to specify behavior when a table or view already exists at the
     *     given path: if set to false, a TableAlreadyExistException is thrown; if set to true,
     *     nothing is done.
     * @param isStreamingMode a flag that tells whether the current table is in streaming mode.
     *     Different modes can have different implementations of atomicity support.
     * @return {@link TwoPhaseCommitCatalogTable} that can be serialized and provides atomic
     *     operations
     * @throws TableAlreadyExistException if table already exists and ignoreIfExists is false
     * @throws DatabaseNotExistException if the database in tablePath doesn't exist
     * @throws CatalogException in case of any runtime exception
     */
    default Optional<TwoPhaseCommitCatalogTable> getTwoPhaseCommitCreateTable(
            ObjectPath tablePath,
            CatalogBaseTable table,
            boolean ignoreIfExists,
            boolean isStreamingMode)
            throws TableAlreadyExistException, DatabaseNotExistException, CatalogException {
        return Optional.empty();
    }

}


Introduce the TwoPhaseCommitCatalogTable interface, which supports atomic operations.

Code Block
languagejava
/**
 * A {@link CatalogTable} for atomic semantics using a two-phase commit protocol, combined with
 * {@link JobStatusHook} for atomic CTAS. {@link TwoPhaseCommitCatalogTable} will be a member
 * variable of CtasJobStatusHook and can be serialized.
 *
 * <p>CtasJobStatusHook#onCreated will call the beginTransaction method of
 * TwoPhaseCommitCatalogTable; CtasJobStatusHook#onFinished will call the commit method of
 * TwoPhaseCommitCatalogTable; CtasJobStatusHook#onFailed and CtasJobStatusHook#onCanceled will
 * call the abort method of TwoPhaseCommitCatalogTable.
 */
@PublicEvolving
public interface TwoPhaseCommitCatalogTable extends CatalogTable, Serializable {

    /**
     * This method will be called when the job is started. It is similar to opening a
     * transaction in a relational database. In Flink's atomic CTAS scenario, it is used to do
     * initialization work, for example, initializing the client of the underlying service or
     * the tmp path of the underlying storage, or even calling the start transaction API of the
     * underlying service.
     */
    void beginTransaction();

    /**
     * This method will be called when the job succeeds. It is similar to committing a
     * transaction in a relational database. In Flink's atomic CTAS scenario, it is used to do
     * data visibility related work, for example, moving the underlying data to the target
     * directory, writing buffered data to the underlying storage service, or even calling the
     * commit transaction API of the underlying service.
     */
    void commit();

    /**
     * This method will be called when the job fails or is canceled. It is similar to rolling
     * back a transaction in a relational database. In Flink's atomic CTAS scenario, it is used
     * to do data cleanup, for example, deleting the data in the tmp directory, deleting the
     * temporary data in the underlying storage service, or even calling the rollback transaction
     * API of the underlying service.
     */
    void abort();
}

...

First, we need a table interface that can be combined with an abstract transaction capability, so we introduce TwoPhaseCommitCatalogTable, which can begin, commit, and abort a transaction.

The three APIs of TwoPhaseCommitCatalogTable:

beginTransaction : Similar to opening a transaction; we can do some prep work, such as initializing the client, initializing the data, initializing the directory, etc.

...

abort : Similar to aborting a transaction; we can do some data cleaning, data restoration, etc.

Note: TwoPhaseCommitCatalogTable must be serializable, because it is used on the JM (JobManager).
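The three phases can be illustrated with a small, self-contained sketch that does not depend on Flink. Everything in it is hypothetical: TwoPhaseCommitSketch stands in for the three transaction methods only, and FileStagingTable, its write method, and the ".staging" suffix are invented for illustration. It assumes a storage layer where staged data can be published with a file move.

```java
import java.io.IOException;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

/** Hypothetical stand-in for the three transaction methods of TwoPhaseCommitCatalogTable. */
interface TwoPhaseCommitSketch extends Serializable {
    void beginTransaction();
    void commit();
    void abort();
}

/** Illustrative only: stages data in a temporary file and publishes it on commit. */
class FileStagingTable implements TwoPhaseCommitSketch {
    private static final long serialVersionUID = 1L;

    // Stored as strings so that the object stays serializable.
    private final String targetFile;
    private final String stagingFile;

    FileStagingTable(Path target) {
        this.targetFile = target.toString();
        this.stagingFile = this.targetFile + ".staging";
    }

    @Override
    public void beginTransaction() {
        // Prep work: start from a clean staging location.
        try {
            Files.deleteIfExists(Paths.get(stagingFile));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical writer entry point; staged data is not visible at the target yet. */
    void write(String data) {
        try {
            Files.write(Paths.get(stagingFile), data.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void commit() {
        // Visibility work: move the staged data to the target location.
        try {
            Files.move(
                    Paths.get(stagingFile),
                    Paths.get(targetFile),
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void abort() {
        // Cleanup: discard the staged data; the target is never touched.
        try {
            Files.deleteIfExists(Paths.get(stagingFile));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class FileStagingDemo {
    /** Returns {visible before commit, visible after commit, visible after abort}. */
    static boolean[] run() {
        try {
            Path dir = Files.createTempDirectory("ctas-sketch");

            Path committed = dir.resolve("committed.data");
            FileStagingTable ok = new FileStagingTable(committed);
            ok.beginTransaction();
            ok.write("rows");
            boolean beforeCommit = Files.exists(committed);
            ok.commit();
            boolean afterCommit = Files.exists(committed);

            Path aborted = dir.resolve("aborted.data");
            FileStagingTable failed = new FileStagingTable(aborted);
            failed.beginTransaction();
            failed.write("partial rows");
            failed.abort();
            boolean afterAbort = Files.exists(aborted);

            return new boolean[] {beforeCommit, afterCommit, afterAbort};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // prints [false, true, false]
    }
}
```

In this sketch the target only becomes visible in commit, and abort leaves no trace at the target path, which is the behavior the three APIs are meant to allow a Catalog to provide.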

Then we need somewhere to create the TwoPhaseCommitCatalogTable. Different Catalogs need to perform different operations to implement atomic CTAS: for example, HiveCatalog needs to access the Hive Metastore, while JDBCCatalog needs to access the back-end database. We therefore introduce the getTwoPhaseCommitCreateTable API on the Catalog interface.

The definition of the getTwoPhaseCommitCreateTable API is similar to that of the createTable API, extended with the isStreamingMode parameter so that a Catalog can provide a different atomicity implementation in each mode.

...

Introduce CtasJobStatusHook (which implements the JobStatusHook interface); TwoPhaseCommitCatalogTable is its member variable.

The hook calls the TwoPhaseCommitCatalogTable API as follows:

Code Block
languagejava
/**
 * This Hook is used to implement the Flink CTAS atomicity semantics, calling the corresponding API
 * of {@link TwoPhaseCommitCatalogTable} at different stages of the job.
 */
public class CtasJobStatusHook implements JobStatusHook {

    private final TwoPhaseCommitCatalogTable twoPhaseCommitCatalogTable;

    public CtasJobStatusHook(TwoPhaseCommitCatalogTable twoPhaseCommitCatalogTable) {
        this.twoPhaseCommitCatalogTable = twoPhaseCommitCatalogTable;
    }

    @Override
    public void onCreated(JobID jobId) {
        // Job created: open the transaction.
        twoPhaseCommitCatalogTable.beginTransaction();
    }

    @Override
    public void onFinished(JobID jobId) {
        // Job succeeded: make the result visible.
        twoPhaseCommitCatalogTable.commit();
    }

    @Override
    public void onFailed(JobID jobId, Throwable throwable) {
        // Job failed: roll back.
        twoPhaseCommitCatalogTable.abort();
    }

    @Override
    public void onCanceled(JobID jobId) {
        // Job canceled: roll back.
        twoPhaseCommitCatalogTable.abort();
    }
}
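
To make the call order concrete, the following Flink-free sketch simulates a job that either finishes or fails and records which transaction methods the hook triggers. All types here are hypothetical stand-ins: HookSketch simplifies JobStatusHook (job ids are plain strings), RecordingTable replaces the table, and runJob is an invented driver, not how the Flink runtime actually dispatches hooks.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for JobStatusHook. */
interface HookSketch {
    void onCreated(String jobId);
    void onFinished(String jobId);
    void onFailed(String jobId, Throwable cause);
    void onCanceled(String jobId);
}

/** Hypothetical stand-in for the table; it only records which methods were called. */
class RecordingTable {
    final List<String> calls = new ArrayList<>();

    void beginTransaction() { calls.add("beginTransaction"); }
    void commit() { calls.add("commit"); }
    void abort() { calls.add("abort"); }
}

/** Mirrors the mapping used by CtasJobStatusHook. */
class CtasHookSketch implements HookSketch {
    private final RecordingTable table;

    CtasHookSketch(RecordingTable table) { this.table = table; }

    @Override public void onCreated(String jobId) { table.beginTransaction(); }
    @Override public void onFinished(String jobId) { table.commit(); }
    @Override public void onFailed(String jobId, Throwable cause) { table.abort(); }
    @Override public void onCanceled(String jobId) { table.abort(); }
}

class HookLifecycleDemo {
    /** Invented driver: notifies the hook on creation and once on termination. */
    static List<String> runJob(Runnable jobBody) {
        RecordingTable table = new RecordingTable();
        HookSketch hook = new CtasHookSketch(table);
        hook.onCreated("job-1");
        try {
            jobBody.run();
            hook.onFinished("job-1");
        } catch (RuntimeException e) {
            hook.onFailed("job-1", e);
        }
        return table.calls;
    }

    public static void main(String[] args) {
        // prints [beginTransaction, commit]
        System.out.println(runJob(() -> { }));
        // prints [beginTransaction, abort]
        System.out.println(runJob(() -> { throw new IllegalStateException("boom"); }));
    }
}
```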

Compatibility with existing non-atomic CTAS

The return value of Catalog#getTwoPhaseCommitCreateTable is an Optional, and we can determine whether atomicity semantics are supported based on whether the return value is empty:

...

not empty : atomicity semantics are supported; a CtasJobStatusHook is created and the JobStatusHook mechanism is used to implement atomicity semantics, as described in the code in the previous section.

Code Block
languagejava
Optional<TwoPhaseCommitCatalogTable> twoPhaseCommitCatalogTableOptional =
        ctasCatalog.getTwoPhaseCommitCreateTable(
                objectPath,
                catalogTable,
                createTableOperation.isIgnoreIfExists(),
                isStreamingMode);

if (twoPhaseCommitCatalogTableOptional.isPresent()) {
    // use TwoPhaseCommitCatalogTable for atomic CTAS statements
    TwoPhaseCommitCatalogTable twoPhaseCommitCatalogTable =
            twoPhaseCommitCatalogTableOptional.get();
    CtasJobStatusHook ctasJobStatusHook =
            new CtasJobStatusHook(twoPhaseCommitCatalogTable);
    mapOperations.add(
            ctasOperation.toSinkModifyOperation(
                    createTableOperation.getTableIdentifier(),
                    createTableOperation.getCatalogTable(),
                    twoPhaseCommitCatalogTable,
                    ctasCatalog,
                    catalogManager));
    jobStatusHookList.add(ctasJobStatusHook);
} else {
    // execute CREATE TABLE first for non-atomic CTAS statements
    executeInternal(ctasOperation.getCreateTableOperation());
    mapOperations.add(ctasOperation.toSinkModifyOperation(catalogManager));
}
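
The compatibility argument rests on the default method returning Optional.empty(): catalogs that do not override it keep the existing non-atomic path. The sketch below shows that pattern in isolation. CatalogSketch, AtomicTableSketch, LegacyCatalog, BatchOnlyAtomicCatalog, and CtasPlannerSketch are hypothetical names, and the method signature is reduced to two parameters for brevity; it is not the real Catalog interface.

```java
import java.util.Optional;

/** Hypothetical stand-in for TwoPhaseCommitCatalogTable. */
interface AtomicTableSketch { }

/** Hypothetical, heavily simplified stand-in for Catalog. */
interface CatalogSketch {
    // Default: atomic CTAS is not supported.
    default Optional<AtomicTableSketch> getTwoPhaseCommitCreateTable(
            String tablePath, boolean isStreamingMode) {
        return Optional.empty();
    }
}

/** An existing catalog that was written before the new method existed. */
class LegacyCatalog implements CatalogSketch { }

/** A catalog that opts in to atomic CTAS, but only in batch mode. */
class BatchOnlyAtomicCatalog implements CatalogSketch {
    @Override
    public Optional<AtomicTableSketch> getTwoPhaseCommitCreateTable(
            String tablePath, boolean isStreamingMode) {
        if (isStreamingMode) {
            return Optional.empty();
        }
        return Optional.of(new AtomicTableSketch() { });
    }
}

class CtasPlannerSketch {
    /** Chooses the CTAS strategy from the Optional, as in the branch above. */
    static String plan(CatalogSketch catalog, boolean isStreamingMode) {
        return catalog.getTwoPhaseCommitCreateTable("db.t", isStreamingMode)
                .map(table -> "atomic: register hook, create table on commit")
                .orElse("non-atomic: create table first, then insert");
    }

    public static void main(String[] args) {
        System.out.println(plan(new LegacyCatalog(), false));
        System.out.println(plan(new BatchOnlyAtomicCatalog(), false));
        System.out.println(plan(new BatchOnlyAtomicCatalog(), true));
    }
}
```

Only the second call takes the atomic branch; the legacy catalog and the streaming-mode call both fall back to the non-atomic path without any change to existing catalog code.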

...

so we introduce isStreamingMode when we define Catalog#getTwoPhaseCommitCreateTable, and the Catalog can decide whether to provide atomicity semantic support.

...

Implementing the atomic CTAS operation then requires only two steps:

  1. The Catalog implements the getTwoPhaseCommitCreateTable method;
  2. Introduce an implementation class of the TwoPhaseCommitCatalogTable interface.

HiveCatalog implements the getTwoPhaseCommitCreateTable API:

Code Block
languagejava
    @Override
    public Optional<TwoPhaseCommitCatalogTable> getTwoPhaseCommitCreateTable(
            ObjectPath tablePath, CatalogBaseTable table, boolean ignoreIfExists, boolean isStreamingMode)
            throws TableAlreadyExistException, DatabaseNotExistException, CatalogException {

        if (isStreamingMode) {
            // HiveCatalog does not support atomicity semantics in streaming mode
            return Optional.empty();
        }

        checkNotNull(tablePath, "tablePath cannot be null");
        checkArgument(table instanceof ResolvedCatalogBaseTable, "table should be resolved");

        ResolvedCatalogBaseTable<?> resolvedTable = (ResolvedCatalogBaseTable<?>) table;
        if (!databaseExists(tablePath.getDatabaseName())) {
            throw new DatabaseNotExistException(getName(), tablePath.getDatabaseName());
        }
        if (!ignoreIfExists && tableExists(tablePath)) {
            throw new TableAlreadyExistException(getName(), tablePath);
        }

        boolean managedTable = ManagedTableListener.isManagedTable(this, resolvedTable);

        Table hiveTable =
                HiveTableUtil.instantiateHiveTable(
                        tablePath, resolvedTable, hiveConf, managedTable);

        TwoPhaseCommitCatalogTable twoPhaseCommitCatalogTable =
                new HiveTwoPhaseCommitCatalogTable(
                        getHiveVersion(),
                        new JobConfWrapper(JobConfUtils.createJobConfWithCredentials(hiveConf)),
                        hiveTable,
                        ignoreIfExists);

        return Optional.of(twoPhaseCommitCatalogTable);
    }

...

HiveTwoPhaseCommitCatalogTable implements the core logic:

Code Block
languagejava
/**
 * An implementation of {@link TwoPhaseCommitCatalogTable} for Hive to
 * support atomic CTAS.
 */
public class HiveTwoPhaseCommitCatalogTable implements TwoPhaseCommitCatalogTable {

    private static final long serialVersionUID = 1L;

    @Nullable private final String hiveVersion;
    private final JobConfWrapper jobConfWrapper;

    private final Table table;
    private final boolean ignoreIfExists;

    private transient HiveMetastoreClientWrapper client;

    public HiveTwoPhaseCommitCatalogTable(
            String hiveVersion,
            JobConfWrapper jobConfWrapper,
            Table table,
            boolean ignoreIfExists) {
        this.hiveVersion = hiveVersion;
        this.jobConfWrapper = jobConfWrapper;
        this.table = table;
        this.ignoreIfExists = ignoreIfExists;
    }

    @Override
    public void beginTransaction() {
        // init hive metastore client
        client =
                HiveMetastoreClientFactory.create(
                        HiveConfUtils.create(jobConfWrapper.conf()), hiveVersion);
    }

    @Override
    public void commit() {
        try {
            client.createTable(table);
        } catch (AlreadyExistsException alreadyExistsException) {
            if (!ignoreIfExists) {
                throw new FlinkHiveException(alreadyExistsException);
            }
        } catch (Exception e) {
            throw new FlinkHiveException(e);
        } finally {
            client.close();
        }
    }

    @Override
    public void abort() {
        client.close();
    }
}
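
One plausible reading of the transient client field above is that the client cannot travel with the serialized object and therefore has to be created in beginTransaction, on the side where the hook runs. The sketch below demonstrates that effect with plain Java serialization. TransientClientSketch, its endpoint field, and the Object used as a "client" are hypothetical stand-ins, not Hive classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical table-like object with serializable config and a transient client. */
class TransientClientSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Serializable configuration travels with the object.
    private final String endpoint;

    // Stand-in for a non-serializable client; skipped during serialization.
    private transient Object client;

    TransientClientSketch(String endpoint) {
        this.endpoint = endpoint;
    }

    /** Creates the client lazily, on whichever side this method is called. */
    void beginTransaction() {
        client = new Object();
    }

    boolean hasClient() {
        return client != null;
    }

    String endpoint() {
        return endpoint;
    }
}

class SerializationDemo {
    /** Serializes and deserializes the object in memory. */
    static TransientClientSketch roundTrip(TransientClientSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (TransientClientSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TransientClientSketch original = new TransientClientSketch("thrift://metastore");
        original.beginTransaction();

        TransientClientSketch copy = roundTrip(original);
        System.out.println(copy.hasClient()); // prints false
        copy.beginTransaction();
        System.out.println(copy.hasClient()); // prints true
    }
}
```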

...