...

For example, as shown below.

[Image]

(a) In an ExecutionVertex, after execution_1 has consumed inputSplit_0, it goes on to consume inputSplit_1.

...

      Now inputSplitIndexMap data is { (execution_1_new, 0), (execution_2, 1) }.



Manage middle ResultPartition 

TODO: draw a figure here that includes the failover case.

I think that speculative execution means that multiple executions of an ExecutionVertex run at the same time, while failover means that two executions run at two different times. Based on this, I think this feature (speculative execution) is theoretically achievable. I have therefore implemented speculative execution for batch jobs based on Blink, and it had a significant effect in our production cluster.

Two executions of an upstream ExecutionVertex will produce two ResultPartitions. When the upstream ExecutionVertex finishes, we update the inputChannel of each downstream execution to point to the upstream execution that finished first.

When a downstream execution encounters a DataConsumptionException, it restarts and reads from the upstream execution that has already finished.


DefaultExecutionGraph (Java)

public class ExecutionVertex
        implements AccessExecutionVertex, Archiveable<ArchivedExecutionVertex> {
    // The execution of this vertex that finished first; null until one has finished.
    private Execution fastestFinishedExecution = null;
}
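
To make the bookkeeping around fastestFinishedExecution more concrete, here is a minimal, self-contained sketch. It does not use Flink's real classes: SpeculativeVertexSketch, Attempt, InputChannelSketch, and markFinished are hypothetical names introduced only for illustration, and the real ExecutionVertex and inputChannel update path will differ. The sketch assumes that the first execution to report completion wins and that later finishers are ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for an ExecutionVertex with several concurrent executions. */
class SpeculativeVertexSketch {

    /** Hypothetical stand-in for one execution and the ResultPartition it produces. */
    static final class Attempt {
        final String id;
        final String resultPartitionId;

        Attempt(String id, String resultPartitionId) {
            this.id = id;
            this.resultPartitionId = resultPartitionId;
        }
    }

    /** Hypothetical stand-in for the inputChannel of a downstream execution. */
    static final class InputChannelSketch {
        // Partition this channel reads from; null until an upstream execution has finished.
        String sourcePartitionId;
    }

    private Attempt fastestFinishedExecution = null;
    private final List<InputChannelSketch> downstreamChannels = new ArrayList<>();

    synchronized void addDownstreamChannel(InputChannelSketch channel) {
        downstreamChannels.add(channel);
    }

    /**
     * Called when an execution finishes. Only the first caller is recorded as the
     * fastest finished execution; its partition is then wired into every downstream channel.
     *
     * @return true if this attempt was the first to finish
     */
    synchronized boolean markFinished(Attempt attempt) {
        if (fastestFinishedExecution != null) {
            return false; // a faster execution already finished; ignore this one
        }
        fastestFinishedExecution = attempt;
        for (InputChannelSketch channel : downstreamChannels) {
            channel.sourcePartitionId = attempt.resultPartitionId;
        }
        return true;
    }

    synchronized Attempt getFastestFinishedExecution() {
        return fastestFinishedExecution;
    }
}
```

With two attempts, whichever calls markFinished first determines the partition that the downstream channels read; the slower attempt's ResultPartition is left unused in this sketch, and how it is cleaned up is not covered here.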


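TODO: draw a figure here that includes the failover case, especially the case where the downstream task restarts together with its upstream task.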


If a task reads from a blocking result partition and its input is not available, we can 'revoke' the producer task: mark the task as failed and rerun the upstream task to regenerate the data.
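
The recovery path described above can be sketched as a small, self-contained simulation. Everything here is an assumption made for illustration: BlockingPartitionFailoverSketch, PartitionUnavailableException, and the method names are hypothetical, partitions are modelled as a set of available ids, and "rerunning the upstream task" is reduced to re-adding the partition id. It is not how Flink's scheduler is implemented.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical simulation of regenerating a lost blocking result partition. */
class BlockingPartitionFailoverSketch {

    /** Hypothetical signal that a consumer could not read its input partition. */
    static final class PartitionUnavailableException extends RuntimeException {
        PartitionUnavailableException(String partitionId) {
            super("partition not available: " + partitionId);
        }
    }

    private final Set<String> availablePartitions = new HashSet<>();
    final List<String> log = new ArrayList<>();

    /** Runs (or reruns) the upstream task, which makes its partition available. */
    void runProducer(String partitionId) {
        availablePartitions.add(partitionId);
        log.add("producer ran: " + partitionId);
    }

    /** Simulates losing a partition, for example because the producing machine went away. */
    void losePartition(String partitionId) {
        availablePartitions.remove(partitionId);
    }

    private void consume(String partitionId) {
        if (!availablePartitions.contains(partitionId)) {
            throw new PartitionUnavailableException(partitionId);
        }
        log.add("consumer read: " + partitionId);
    }

    /** Runs the consumer; if its input is missing, fails it, reruns the producer, and retries once. */
    void runConsumerWithFailover(String partitionId) {
        try {
            consume(partitionId);
        } catch (PartitionUnavailableException e) {
            log.add("consumer failed: " + partitionId);
            runProducer(partitionId); // regenerate the data upstream
            consume(partitionId);     // restart the consumer against the regenerated partition
        }
    }
}
```

A single retry is used to keep the sketch short; a real scheduler would also have to bound the number of restarts and handle the rerun itself failing.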

...