Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

In DataStream API, a DataStream can be transformed into a KeyedStream through keyBy, and then further transformed into a WindowedStream through window operations. WindowedStream enables window processing on records with the same key. With the improvement of FLIP-331, WindowedStream will support the full window processing by the window assigner EndOfStreamWindows for which the window is only triggered at the end of inputs.

However, full window processing is not supported directly by DataStream. This means the DataStream cannot collect all records of each subtask (these records have no keys) separately into a full window and process them at the end of inputs. DataSet API already supports processing and sorting all records within each subtask through mapPartition API and sortPartition API.  As DataSet API has been deprecated in Flink 1.18, it is necessary to enhance the DataStream to support handling full window processing on individual subtasks.

In this FLIP, we propose two main enhancements. Firstly, we propose enabling DataStream to directly transform into a PartitionWindowedStream. The PartitionWindowedStream represents collecting all records of each subtask separately into a full window. Secondly, we propose supporting four APIs on PartitionWindowedStream, including mapPartition, sortPartition, aggregate and reduce.

...