...

  • BATCH
    • We divide the graph into independent regions and schedule each region individually
    • The boundaries of these regions are the keyBy() exchanges, which are set to blocking
  • STREAMING
    • All tasks are scheduled before elements start flowing in the pipeline
    • keyBy() exchanges are hash partitions whose elements are forwarded aggressively
  • AUTOMATIC
    • If ALL Sources are bounded, then pick BATCH
    • STREAMING otherwise
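
The AUTOMATIC rule above can be sketched as follows. This is only an illustration of the decision logic: the RuntimeModeResolver class, its nested enums and the resolve method are hypothetical names introduced here, not part of the proposed API.

```java
import java.util.List;

// Hypothetical sketch of the AUTOMATIC resolution rule; none of these
// types are Flink classes, they only mirror the names used in the text.
final class RuntimeModeResolver {

    enum RuntimeMode { BATCH, STREAMING, AUTOMATIC }

    // Assumed per-source property: a source is either bounded or not.
    enum Boundedness { BOUNDED, UNBOUNDED }

    // Returns the mode the job would actually run in.
    static RuntimeMode resolve(RuntimeMode requested, List<Boundedness> sources) {
        if (requested != RuntimeMode.AUTOMATIC) {
            // An explicit BATCH or STREAMING choice is kept as-is.
            return requested;
        }
        // BATCH only if ALL sources are bounded, STREAMING otherwise.
        // Note: with no sources at all, allMatch is vacuously true (BATCH).
        boolean allBounded =
                sources.stream().allMatch(b -> b == Boundedness.BOUNDED);
        return allBounded ? RuntimeMode.BATCH : RuntimeMode.STREAMING;
    }
}
```

A single unbounded source is enough to make the whole job run in STREAMING mode under this rule.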


How to expose that:

We suggest introducing two new configuration options:

  1. The general option for choosing the overall runtime mode
    1. We will expose it via a configuration option under the execution.runtime-mode key with an enum value BATCH/STREAMING/AUTOMATIC
    2. through StreamExecutionEnvironment#setRuntimeMode(RuntimeMode mode)
    3. Each mode will set default values for other settings such as ScheduleMode, ShuffleMode, buffer timeout, etc. It will be possible to override the defaults for particular options, but only as long as the value is compatible with the chosen mode, e.g. a shuffle-mode other than ALL_EDGES_PIPELINED is illegal for STREAMING mode. Illegal combinations will result in an exception.
  2. An option for choosing the shuffle-mode
    1. exposed only through a configuration option under the execution.shuffle-mode key. We will not expose
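
The compatibility check between the runtime mode and the shuffle-mode could look roughly like the sketch below. The ShuffleModeValidator class and its methods are hypothetical, the choice of IllegalArgumentException is an assumption (the proposal only says "an exception"), and so is the rule that BATCH accepts both shuffle modes; the text only states that STREAMING requires ALL_EDGES_PIPELINED.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of rejecting illegal mode/shuffle-mode combinations.
// These types are illustrative stand-ins, not Flink classes.
final class ShuffleModeValidator {

    enum RuntimeMode { BATCH, STREAMING }

    enum ShuffleMode { ALL_EDGES_PIPELINED, ALL_EDGES_BLOCKING }

    // Shuffle modes considered compatible with each runtime mode.
    static Set<ShuffleMode> allowedFor(RuntimeMode mode) {
        switch (mode) {
            case STREAMING:
                // Stated in the text: only pipelined edges are legal.
                return EnumSet.of(ShuffleMode.ALL_EDGES_PIPELINED);
            case BATCH:
                // Assumption: BATCH may override its default with either mode.
                return EnumSet.allOf(ShuffleMode.class);
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    // Returns the shuffle mode unchanged if legal, otherwise throws.
    static ShuffleMode validate(RuntimeMode mode, ShuffleMode shuffleMode) {
        if (!allowedFor(mode).contains(shuffleMode)) {
            throw new IllegalArgumentException(
                    "execution.shuffle-mode " + shuffleMode
                            + " is not compatible with runtime mode " + mode);
        }
        return shuffleMode;
    }
}
```

Keeping the allowed set per mode in one place means each mode's defaults and its legal overrides can be checked in a single spot when the configuration is applied.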

Examples:

Users will be able to choose BATCH mode:

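StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Choose the overall runtime mode through the proposed setter.
env.setRuntimeMode(RuntimeMode.BATCH);
// The shuffle-mode is exposed only as a configuration option.
Configuration conf = new Configuration();
conf.setString("execution.shuffle-mode", "ALL_EDGES_BLOCKING");
env.configure(conf);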


As a configuration option named execution.mode. This will allow users to use it through:

...