Status

...


Discussion thread

...

JIRA: N/A

...

ASF JIRA: FLINK-18450

Release: 1.15


Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

POC of the proposed changes: https://github.com/pnowojski/flink/commits/aligned-sources
POC for split-based alignment: https://github.com/AHeise/flink/pull/2

Motivation

When a job uses event time and watermarks, even a slight imbalance or data skew at the source level can push Flink into a degenerate state in which some source operator instances run far ahead of others in event time while ingesting records. For example, backpressure combined with data skew can leave one source instance many hours behind another. This is not a problem in itself. It does become one for downstream operators that rely on watermarks to emit data. Such an operator (for example a windowed join or aggregation) might need to buffer an excessive amount of data, because the minimum watermark across all of its inputs is held back by the lagging source instance. All records emitted by the non-backpressured sources then have to be buffered in that downstream operator's state, which can lead to uncontrolled growth of that state.

...