...
The final tasks of the job are all generated by Planner. We want to complete the create table/drop table through Hook on the JM side, so we need an API to register the Hook on the JM side.
Introduce the operation process of CTAS in Planner:
step1:
...
Flink's current Hook design cannot meet the needs of CTAS. For example, the JobListener is on the Client side; JobStatusListener is on the JM side, but it cannot be serialized. Thus we tend to propose a new interface JobStatusHook, which could be attached to the JobGraph and executed in the JobMaster. The interface will also be marked as Internal.
The process of CTAS in runtime
- When the task starts, the JobGraph will be deserialized, and then the JobStatusHook can also be deserialized.
- Through the previous method of serializing and deserializing Catalog and CatalogBaseTable, when deserializing JobStatusHook, Catalog and CatalogBaseTable will also be deserialized.
- Deserialize CatalogBaseTable using CatalogPropertiesUtil#deserializeCatalogTable method.
- When deserializing a Catalog, first read the Catalog ClassName, then use the framework objenesis to get an empty instance of the Catalog,
- and finally call the Catalog#deserialize method to get a valid Catalog instance.
- When the job is start and the job status changes, the JobStatusHook method will be called:
...