...

I must implement a (Job, Host) blacklist for the speculative execution feature. To make FLINK-11000 easier to implement in the future, my interface also suits the other blacklists described above.
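To make the (Job, Host) idea concrete, here is a minimal sketch of what such a blacklist could look like. The class name `JobHostBlacklist` and all of its methods are hypothetical and are not part of Flink; job IDs and hosts are plain strings for illustration only, and the other blacklist types are not covered.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Flink API: tracks the hosts blacklisted for each job.
final class JobHostBlacklist {
    // jobId -> hosts that this job should avoid
    private final Map<String, Set<String>> hostsByJob = new HashMap<>();

    // Adds a (job, host) pair to the blacklist.
    void add(String jobId, String host) {
        hostsByJob.computeIfAbsent(jobId, k -> new HashSet<>()).add(host);
    }

    // True if the host is blacklisted for this particular job.
    boolean isBlacklisted(String jobId, String host) {
        return hostsByJob.getOrDefault(jobId, Collections.emptySet()).contains(host);
    }

    // Read-only view of the hosts blacklisted for a job (empty if none).
    Set<String> hostsFor(String jobId) {
        return Collections.unmodifiableSet(
                hostsByJob.getOrDefault(jobId, Collections.emptySet()));
    }

    // Drops all entries for a job, e.g. when the job finishes.
    void clear(String jobId) {
        hostsByJob.remove(jobId);
    }
}
```

Keying the map by job keeps one job's blacklisted hosts from affecting the placement of other jobs.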

...

Pass the blacklist information to cluster ResourceManager

...


Yarn

First, each node's attributes should include a machine IP attribute. We can then use YARN PlacementConstraints to keep containers off certain machines.

...

After reading FLINK-10205, pr-6684, and the code (master branch), I found that Flink currently cannot ensure that different attempts of the same ExecutionVertex receive the same inputs. When a task fails over, Flink simply returns its input splits to the assigner and lets the next idle task take them. For failover this is not a problem, because it should not matter which task processes which input split. If a failure occurs and some other task takes over the failed task's input splits, it is as if that task had processed these input splits from the very beginning.
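The behavior described above can be illustrated with a small sketch. `SimpleSplitAssigner` is a hypothetical, simplified stand-in for an input split assigner; it is not the Flink interface, and splits are represented as plain strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for an input split assigner; not the Flink interface.
final class SimpleSplitAssigner {
    // Splits that no task currently owns, in hand-out order.
    private final Deque<String> unassigned = new ArrayDeque<>();

    SimpleSplitAssigner(List<String> splits) {
        unassigned.addAll(splits);
    }

    // Hands the next free split to whichever task asks first; null when none are left.
    String getNextSplit() {
        return unassigned.poll();
    }

    // On failover, the failed task's splits are put back into the shared pool.
    void returnSplits(List<String> splits) {
        unassigned.addAll(splits);
    }
}
```

For example, suppose task A takes `s0` and `s1`, task B takes `s2`, and then A fails and returns its splits. If B asks next, it receives `s0`, so the new attempt of A sees only `s1`. The job as a whole still processes every split once, but two attempts of the same vertex do not see the same inputs.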

...

Manage middle ResultPartition 

As shown below for a batch job with blocking shuffle, once speculative execution is introduced, all reduce executions in an ExecutionVertex consume the ResultPartition produced by the fastest finished execution of the map ExecutionVertex.
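A minimal sketch of the "fastest finished execution wins" rule follows. The class `MapVertexResult` and its methods are hypothetical names used only for illustration; partition IDs are plain strings, and how the losing attempts' partitions are cleaned up is left out.

```java
import java.util.Optional;

// Hypothetical sketch, not Flink API: records which attempt of a map vertex finished first.
final class MapVertexResult {
    // Partition of the first finished attempt; null until some attempt finishes.
    private String winningPartitionId;

    // Called when an attempt finishes. Returns true only for the first finisher,
    // whose partition becomes the one that consumers read.
    synchronized boolean markFinished(String partitionId) {
        if (winningPartitionId != null) {
            return false; // a faster attempt already finished
        }
        winningPartitionId = partitionId;
        return true;
    }

    // The partition all reduce executions should consume, if one is available yet.
    synchronized Optional<String> partitionToConsume() {
        return Optional.ofNullable(winningPartitionId);
    }
}
```

Every reduce execution, original or speculative, would look up the same winning partition, so all of them read identical input regardless of which map attempt they started alongside.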

...