...

Currently, supporting such a use case requires the STREAMING execution mode, in which all operators must be deployed at the start of the job and operators cannot apply typical "batch-mode" optimizations such as sort join. This approach has several drawbacks: 1) operatorB has to be deployed while it cannot do any useful work; 2) operatorB might waste a lot of memory and disk IO buffering records from the unbounded input until it receives records from pipelinedRegionA; and 3) the throughput of aggregation operations (e.g. co-group) can be 10X worse in streaming mode than in batch mode.

...