...

Previously, Flink supported bounded iteration with the DataSet API and unbounded iteration with the DataStream API. However, since Flink aims to deprecate the DataSet API and the iteration support in the DataStream API is rather incomplete, we need to implement a new iteration library in the Flink-ml repository to support the algorithms.

In addition, the previous DataStream and DataSet iteration APIs have several limitations that make algorithm implementation difficult:

  1. The DataStream iteration lacks deterministic termination detection and checkpoint support.
  2. Neither iteration API supports multiple inputs, arbitrary outputs, or nested iteration.
  3. The DataSet iteration lacks support for asynchronous iteration.
  4. The current 

Overall Design

To reduce development and maintenance overhead, it is preferable to have a unified implementation for the different types of iteration. In fact, the different iteration types share the same requirements for the runtime implementation:

...