Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
Global Aggregate Manager was introduced in FLINK-10886 to support event time synchronization across sources and more generally, coordination of parallel tasks. AFAIK, this was only used in the Kinesis source for an early version of watermark alignment. Operator Coordinator, introduced in FLIP-27, provides a more powerful and elegant solution for that need and is part of the new source API standard. FLIP-217 further provides a complete solution for watermark alignment of source splits on top of the Operator Coordinator mechanism. Furthermore, Global Aggregate Manager manages state in JobMaster object, causing problems for adaptive parallelism changes (FLINK-31245).
Therefore, I propose to deprecate the use of Global Aggregate Manager, which can improve the maintainability of the Flink codebase without compromising its functionality.
Public Interfaces
Class / Method | Annotation |
org.apache.flink.streaming.api.operators.StreamingRuntimeContext#getGlobalAggregateManager | Internal |
org.apache.flink.runtime.execution.Environment#getGlobalAggregateManager | N.A. |
org.apache.flink.runtime.taskexecutor.GlobalAggregateManager | |
org.apache.flink.runtime.taskexecutor.rpc.RpcGlobalAggregateManager | |
org.apache.flink.runtime.jobmaster.JobMasterGateway#updateGlobalAggregate |
Proposed Changes
We propose to deprecate the classes/methods listed above. And remove the related doc under the flink-web repository.
Compatibility, Deprecation, and Migration Plan
The use of Global Aggregate Manager is planned to be deprecated in Flink 1.19 and then finally removed in Flink 2.0. For the users that rely on the Global Aggregate Manager API, they will have to migrate to use OperatorCoordinator.
Test Plan
N/A.
Rejected Alternatives
N/A.