...

When a task fails, we can compute its index (executionIndex) in executionList from its executionAttemptID. The scheduler then processes the corresponding execution according to that executionIndex, as shown below.

To better support this failover logic, I will extend the class FailureHandlingResult with an additional member variable.

FailureHandlingResult class extension (Java):

public class FailureHandlingResult {
	// Index of the failed execution in executionList (existing members omitted).
	@Nullable private final Integer executionIndex;
}
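
The lookup described above can be sketched as follows. This is only a minimal illustration, not the actual scheduler code: the class `ExecutionIndexSketch`, the method `findExecutionIndex`, and the use of plain strings as attempt IDs are all assumptions made for the example. It returns null when no execution matches, which mirrors the nullable executionIndex field.

```java
import java.util.List;

class ExecutionIndexSketch {

    /**
     * Hypothetical helper: returns the position of the failed attempt in
     * executionList, or null if the attempt ID is not found.
     * Attempt IDs are modeled as strings purely for illustration.
     */
    static Integer findExecutionIndex(List<String> executionList, String failedAttemptId) {
        for (int i = 0; i < executionList.size(); i++) {
            if (executionList.get(i).equals(failedAttemptId)) {
                return i;
            }
        }
        return null; // no matching execution
    }

    public static void main(String[] args) {
        // Index 0 stands for the original execution, index 1 for a speculative one (assumed layout).
        List<String> executionList = List.of("attempt-0", "attempt-1");
        Integer executionIndex = findExecutionIndex(executionList, "attempt-1");
        System.out.println("executionIndex = " + executionIndex);
    }
}
```

A linear scan is sufficient here on the assumption that executionList holds only a handful of attempts per task.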

Blacklist of nodes

Most long-tail tasks are caused by cluster problems, so I must ensure that a speculative execution runs on a different node from the original execution.
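
As a rough illustration of this constraint, the sketch below filters candidate nodes against a blacklist before placing a speculative execution. The class `NodeBlacklistSketch` and its methods `addNode` and `pickNode` are hypothetical names chosen for this example, and nodes are identified by hostname strings as an assumption; the real design may track and expire blacklist entries differently.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

class NodeBlacklistSketch {

    // Hostnames on which a speculative execution must not be placed.
    private final Set<String> blacklistedNodes = new HashSet<>();

    /** Blacklists a node, e.g. the one running the slow original execution. */
    void addNode(String hostname) {
        blacklistedNodes.add(hostname);
    }

    /** Returns the first candidate that is not blacklisted, if any. */
    Optional<String> pickNode(List<String> candidates) {
        return candidates.stream()
                .filter(node -> !blacklistedNodes.contains(node))
                .findFirst();
    }

    public static void main(String[] args) {
        NodeBlacklistSketch blacklist = new NodeBlacklistSketch();
        blacklist.addNode("node-a"); // node of the original execution
        Optional<String> target = blacklist.pickNode(List.of("node-a", "node-b"));
        System.out.println("speculative execution goes to: " + target.orElse("<none>"));
    }
}
```

If every candidate is blacklisted, `pickNode` returns an empty Optional, leaving it to the caller to decide whether to wait or skip speculation.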

...
