Versions Compared

Comment: Expanded alternatives.

...

  1. Watermark propagation in multi-stage flows. If the sink persists the watermark information it receives, subsequent stages can construct source watermarks with higher fidelity than current approaches allow. An example of this scenario can be found in the Flink codebase; see FlinkKafkaShuffle.java.
  2. Sophisticated sinks that flush internal buffers, mutate state, or otherwise take action at certain points in event time. 
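As a rough illustration of the second use case, the sketch below shows a sink that buffers elements and flushes them once a watermark passes their timestamps. It does not depend on Flink: the WatermarkAwareSink interface, its writeWatermark method name, and the BufferingSink class are hypothetical stand-ins chosen for this example, not the interfaces defined by this proposal.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class WatermarkFlushSketch {

    /** Hypothetical stand-in for a sink interface extended with a watermark callback. */
    interface WatermarkAwareSink<T> {
        void write(T element, long timestamp);

        void writeWatermark(long watermark);
    }

    /** Buffers elements and releases them in timestamp order once the watermark passes them. */
    static class BufferingSink implements WatermarkAwareSink<String> {

        private static final class Entry {
            final String element;
            final long timestamp;

            Entry(String element, long timestamp) {
                this.element = element;
                this.timestamp = timestamp;
            }
        }

        // Min-heap ordered by event timestamp.
        private final PriorityQueue<Entry> buffer =
                new PriorityQueue<>(Comparator.comparingLong((Entry e) -> e.timestamp));

        // Stands in for the external system the sink writes to.
        final List<String> flushed = new ArrayList<>();

        // Last watermark seen; a real sink could persist this for downstream stages.
        long lastWatermark = Long.MIN_VALUE;

        @Override
        public void write(String element, long timestamp) {
            buffer.add(new Entry(element, timestamp));
        }

        @Override
        public void writeWatermark(long watermark) {
            // Watermarks are expected to be non-decreasing; guard against regressions anyway.
            lastWatermark = Math.max(lastWatermark, watermark);
            while (!buffer.isEmpty() && buffer.peek().timestamp <= lastWatermark) {
                flushed.add(buffer.poll().element);
            }
        }
    }

    public static void main(String[] args) {
        BufferingSink sink = new BufferingSink();
        sink.write("b", 20L);
        sink.write("a", 10L);
        sink.write("c", 30L);
        sink.writeWatermark(25L);
        System.out.println(sink.flushed); // prints [a, b]; "c" stays buffered
    }
}
```

The same callback also gives the sink a natural place to record the latest watermark, which is what the first use case relies on.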

Note that support for event-time timers is out of scope for this proposal.

Public Interfaces

New methods are proposed on the various sink interfaces.

...

  1. Update the public sink interfaces as described above.
  2. Update the legacy streaming sink operator (StreamSink) and the new unified sink operator (AbstractSinkWriterOperator) to invoke the new interface methods.
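The operator change in step 2 can be pictured with a small, Flink-free sketch. Everything here is hypothetical: SinkFunctionSketch and OperatorSketch only mimic the shape of a sink interface and a sink operator, and the use of a default no-op method is an assumption about how existing sinks might be kept source-compatible, not a statement of the actual design.

```java
import java.util.ArrayList;
import java.util.List;

public class SinkOperatorSketch {

    /** Hypothetical stand-in for a sink interface. */
    interface SinkFunctionSketch<T> {
        void invoke(T value);

        // Assumed default no-op, so sinks written before the change still compile.
        default void writeWatermark(long watermark) {
        }
    }

    /** Hypothetical stand-in for a sink operator; it forwards watermarks instead of dropping them. */
    static class OperatorSketch<T> {
        private final SinkFunctionSketch<T> sink;

        OperatorSketch(SinkFunctionSketch<T> sink) {
            this.sink = sink;
        }

        void processElement(T value) {
            sink.invoke(value);
        }

        void processWatermark(long watermark) {
            sink.writeWatermark(watermark);
        }
    }

    /** Records every call so the forwarding order is visible. */
    static class RecordingSink implements SinkFunctionSketch<String> {
        final List<String> log = new ArrayList<>();

        @Override
        public void invoke(String value) {
            log.add("element:" + value);
        }

        @Override
        public void writeWatermark(long watermark) {
            log.add("watermark:" + watermark);
        }
    }

    public static void main(String[] args) {
        RecordingSink sink = new RecordingSink();
        OperatorSketch<String> operator = new OperatorSketch<>(sink);
        operator.processElement("a");
        operator.processWatermark(100L);
        System.out.println(sink.log); // prints [element:a, watermark:100]
    }
}
```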

Compatibility, Deprecation, and Migration Plan

...

The functionality will be tested alongside an experimental version of Apache Pulsar that supports watermarking in its producer and consumer APIs. The testing will focus on watermark propagation under various conditions (e.g., parallelism, transactions, task recovery).

Rejected Alternatives

...

  1. Combine a process function and a sink function. This achieves similar effects, but with increased complexity.
  2. Introduce event-time timers to sink functions. This approach seems too element-centric for the use cases. Also, event-time timers are generally keyed.