Status

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Currently, the Iteration API of DataStream is incomplete. For instance, it lacks support for synchronous iteration and exactly-once semantics, and it offers no way to set iteration termination conditions. As a result, it is hard for developers to build iteration pipelines with DataStream in practical applications such as machine learning.

FLIP-176: Unified Iteration to Support Algorithms introduced a unified iteration library in the Flink ML repository. This library addresses all the issues present in the Iteration API of DataStream and can provide a solution for all iteration use cases. However, maintaining two separate iteration implementations, one in the Flink repository and one in the Flink ML repository, would introduce unnecessary complexity and make the Iteration API difficult to maintain.

We propose deprecating the Iteration API of DataStream and removing it completely in the next major version. In the future, if other modules in the Flink repository require the use of the Iteration API, we can consider extracting all Iteration implementations from the Flink ML repository into an independent module.


Public Interfaces

Mark the following classes and methods as @Deprecated. The table lists each one with its current stability annotation.

Class / Method | Annotation
org.apache.flink.streaming.api.datastream.DataStream#iterate() | PublicEvolving
org.apache.flink.streaming.api.datastream.DataStream#iterate(long maxWaitTimeMillis) | PublicEvolving
org.apache.flink.streaming.api.datastream.IterativeStream | PublicEvolving
org.apache.flink.streaming.api.datastream.IterativeStream.ConnectedIterativeStreams | Public
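
For illustration only, the sketch below shows how such a deprecation looks in plain Java and how it can be detected through reflection. SketchStream and its methods are hypothetical stand-ins that mirror the two iterate() signatures listed above. They are not the Flink classes, and the Javadoc wording is an assumption rather than the text that will be committed.

```java
import java.lang.reflect.Method;

public class DeprecationSketch {

    /** Hypothetical stand-in for DataStream; NOT the actual Flink class. */
    public static class SketchStream {

        /**
         * @deprecated Assumed wording: this method is deprecated and will be
         *     removed in a future major version.
         */
        @Deprecated
        public SketchStream iterate() {
            return this;
        }

        /**
         * @deprecated Assumed wording, same as {@link #iterate()}.
         */
        @Deprecated
        public SketchStream iterate(long maxWaitTimeMillis) {
            return this;
        }

        /** A method that is not deprecated, kept here for comparison. */
        public SketchStream map() {
            return this;
        }
    }

    /** Returns true if the named public method carries @Deprecated. */
    public static boolean isDeprecated(Class<?> type, String name, Class<?>... params) {
        try {
            Method method = type.getMethod(name, params);
            // @Deprecated has runtime retention, so reflection can see it.
            return method.isAnnotationPresent(Deprecated.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + name, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isDeprecated(SketchStream.class, "iterate"));             // true
        System.out.println(isDeprecated(SketchStream.class, "iterate", long.class)); // true
        System.out.println(isDeprecated(SketchStream.class, "map"));                 // false
    }
}
```

With an annotation like this in place, callers outside the declaring class get a compiler deprecation warning while their existing code keeps compiling, so the change is source-compatible until the API is actually removed.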

Proposed Changes

We propose deprecating the classes and methods listed above and subsequently removing the DataStream Iteration documentation from the Flink website.

Compatibility, Deprecation, and Migration Plan

The Iteration API in DataStream is planned to be deprecated in Flink 1.19 and removed in Flink 2.0. Users who rely on the Iteration API in DataStream will have to migrate to the iteration library in Flink ML.

Test Plan

N.A.

Rejected Alternatives

N.A.