...

Proposal (potentially not in 1.12): Build custom support for the broadcast state pattern where the broadcast side is read first.

Relational methods in DataStream

As discussed in FLIP-131: Consolidate the user-facing Dataflow SDKs/APIs (and deprecate the DataSet API), we see the Table API/SQL as the relational API, where we expect users to work with schemas and fields. Going forward, we envision the DataStream API as a "slightly" lower-level API with more explicit control over the execution graph, operations, and state. We therefore think it is worth deprecating, and eventually removing, all relational-style methods in DataStream. These methods often use reflection to access fields and are thus less performant than providing explicit extractors. The affected methods include:

  • DataStream#project
  • Windowed/KeyedStream#sum, min, max, minBy, maxBy
  • DataStream#keyBy where the key is specified by field name or index
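The contrast between field-name access and explicit extractors can be sketched in plain Java. This is an illustration only, not Flink code: the class ExtractorSketch, the event type Click, and both helper methods are hypothetical names made up for this example. The first helper resolves the key and value fields by name through reflection on every element, roughly the style the methods above encourage. The second takes explicit extractor functions, which is closer in spirit to passing a KeySelector lambda to DataStream#keyBy.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.ToLongFunction;

// Plain-Java illustration only; none of these names come from Flink.
final class ExtractorSketch {

    // Hypothetical event type used for the example.
    static final class Click {
        public final String user;
        public final long count;

        Click(String user, long count) {
            this.user = user;
            this.count = count;
        }
    }

    // Relational style: fields are named by strings and looked up via reflection.
    // A misspelled field name only fails at runtime.
    static <T> Map<Object, Long> sumByFieldName(List<T> items, String keyField, String valueField) {
        Map<Object, Long> totals = new HashMap<>();
        for (T item : items) {
            try {
                Field key = item.getClass().getField(keyField);
                Field value = item.getClass().getField(valueField);
                totals.merge(key.get(item), value.getLong(item), Long::sum);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot read field on " + item.getClass(), e);
            }
        }
        return totals;
    }

    // Explicit style: the caller supplies typed extractor functions.
    // The compiler checks field names and types, and no reflection is involved.
    static <T, K> Map<K, Long> sumByExtractor(List<T> items, Function<T, K> key, ToLongFunction<T> value) {
        Map<K, Long> totals = new HashMap<>();
        for (T item : items) {
            totals.merge(key.apply(item), value.applyAsLong(item), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Click> clicks = Arrays.asList(new Click("a", 1), new Click("b", 2), new Click("a", 3));
        // Both calls compute the same per-user totals.
        System.out.println(sumByFieldName(clicks, "user", "count"));
        System.out.println(sumByExtractor(clicks, c -> c.user, c -> c.count));
    }
}
```

Both helpers return the same totals. The difference lies in where mistakes surface (at runtime versus at compile time) and in the per-element reflection cost that the explicit form avoids.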

Sinks

Current exactly-once sinks in DataStream rely heavily on Flink’s checkpointing mechanism and will not work with batch scheduling. Support for exactly-once sinks is outside the scope of this FLIP; a separate FLIP covering it will follow soon.

...