Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The final tasks of the job are all generated by Planner. We want to complete the create table/drop table through Hook on the JM side, so we need an API to register the Hook on the JM side.

Introduce the operation process of CTAS in Planner:

step1:

...

Flink's current Hook design cannot meet the needs of CTAS. For example, the JobListener is on the Client side; JobStatusListener is on the JM side, but it cannot be serialized. Thus we tend to propose a new interface JobStatusHook, which could be attached to the JobGraph and executed in the JobMaster. The interface will also be marked as Internal. 

The process of CTAS in runtime

  1. When the task starts, the JobGraph will be deserialized, and then the JobStatusHook can also be deserialized.
  2. Through the previous method of serializing and deserializing Catalog and CatalogBaseTable, when deserializing JobStatusHook, Catalog and CatalogBaseTable will also be deserialized.
    • Deserialize CatalogBaseTable using CatalogPropertiesUtil#deserializeCatalogTable method.
    • When deserializing a Catalog, first read the Catalog ClassName, then use the framework objenesis to get an empty instance of the Catalog,
    • and finally call the Catalog#deserialize method to get a valid Catalog instance.
  3. When the job is start and the job status changes, the JobStatusHook method will be called:

...