...

  • DataStream#project
  • Windowed/KeyedStream#sum,min,max,minBy,maxBy
  • DataStream#keyBy where the key is specified by field name or index (including ConnectedStreams#keyBy)

Moreover, some of the operations have semantics that make sense for stream processing but should behave differently for batch. For example, KeyedStream.reduce() is essentially a reduce on a GlobalWindow with a Trigger that fires on every element. In DB terms it produces an UPSERT stream as output: if you get ten input elements for a key, you also get ten output records. For batch processing it makes more sense to produce only one output record per key, carrying the result of the aggregation, when we reach the end of the stream/key. This is still correct for downstream consumers that expect an UPSERT stream, but it changes the actual physical output stream that they see. The methods whose behaviour we suggest changing when run in a bounded execution mode include:

...
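
To make the KeyedStream.reduce() discussion above concrete, the following is a minimal plain-Java sketch of the two output behaviours. It does not use the Flink API: the class ReduceSemanticsSketch and the methods rollingReduce and finalReduce are hypothetical names introduced only for illustration, and keyed records are modelled as Map.Entry<String, Integer>.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Illustration only: simulates the two reduce semantics without Flink.
public class ReduceSemanticsSketch {

    // Streaming-style reduce: emits the updated aggregate for every input
    // element, so N inputs for a key yield N output records (an UPSERT stream).
    public static List<Map.Entry<String, Integer>> rollingReduce(
            List<Map.Entry<String, Integer>> input, BinaryOperator<Integer> fn) {
        Map<String, Integer> state = new LinkedHashMap<>();
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : input) {
            Integer updated = state.merge(e.getKey(), e.getValue(), fn);
            out.add(Map.entry(e.getKey(), updated));
        }
        return out;
    }

    // Batch-style reduce: consumes the whole bounded input first and emits
    // exactly one record per key, holding the final aggregate.
    public static List<Map.Entry<String, Integer>> finalReduce(
            List<Map.Entry<String, Integer>> input, BinaryOperator<Integer> fn) {
        Map<String, Integer> state = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : input) {
            state.merge(e.getKey(), e.getValue(), fn);
        }
        return new ArrayList<>(state.entrySet());
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> input = List.of(
                Map.entry("a", 1), Map.entry("b", 10),
                Map.entry("a", 2), Map.entry("a", 3));

        System.out.println(rollingReduce(input, Integer::sum)); // [a=1, b=10, a=3, a=6]
        System.out.println(finalReduce(input, Integer::sum));   // [a=6, b=10]
    }
}
```

A consumer that applies the rolling output as upserts (last value per key wins) ends up with a=6 and b=10, the same state the batch-style output delivers directly. The logical result is identical in both cases, while the number of physical records differs.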