Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The definition of PartitionWindowedStream

We first propose to add a new method fullWindowPartition() in DataStream class, which represents openning a full window on DataStream

The PartitionWindowedStream does not support setting the parallelism and has the same parallelism with previous operator.

To express the processing semantics of full window processing on DataStream, we need to define the corresponding method in DataStream class. One approach is to directly add a window(WindowAssigner assigner) method. This method only supports passing the EndOfStreamWindows introduced in FLIP-331. In the job code, if full window processing is desired, the corresponding code would be dataStream.window(EndOfStreamWindows.get()).

However, this approach may lead users to mistakenly pass window assigners other than EndOfStreamWindows to the window(WindowAssigner assigner) method, which is not supported.

To avoid such issues, we propose defining a parameterless fullWindowPartition() method on DataStream to express the processing semantics of full window processing. This method is functionally equivalent to the aforementioned dataStream.window(EndOfStreamWindows.get()). By adopting this approach, users can explicitly apply full window processing without any misunderstandings.

The return type of the fullWindowPartition() method is PartitionWindowedStream, which provides multiple APIs for full window processing, including  mapPartitionsortPartitionaggregate and reduce.

PartitionWindowedStream does not support customizing parallelism. Its parallelism is consistent with the previous operator.

The PartitionWindowedStream class will be annotated by @PublicEnvolving as more APIs may be added in the future.

...