Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • After this FLIP, the behavior of Flink runtime with execution.runtime-mode = streaming AND execution.checkpointing.interval-during-backlog > 0, will be same as the stream mode prior to this FLIP.
  • The rational for switching the Flink runtime behavior based on whether execution.checkpointing.interval-during-backlog == 0 is that most (if not all) performance optimizations, which allow batch mode to achieve higher throughput than stream mode, involve operations (e.g. sorting inputs) that can only be used when checkpoint is not required.
  • It is possible for mixed mode to be slower than stream mode, particularly when there is only small amount of input records and the overhead of buffering/sorting inputs out-weight its benefit. This is similar to how the merge join might be slower than hash join. This FLIP focuses on optimizing the Flink throughput when there is a high number of input records. In the future, we might introduce more strategies to turn on mix mode in a smart way to avoid performance regression.
  • For an operator with 2+ inputs, where some inputs have isBacklog=true and some other inputs have isBacklog=false, Flink runtime will handle this operator as if all its inputs have isBacklog=false. The rational is that we don't have a reliable way (yet) to decide whether it is OK to delay the processing/emission of this operator's output records. For example, if this operator simply forwards the records from the input with backlog=false to its output, then we probably should not delay its output records.


1) Scheduling strategy

Batch mode:

...