...

Based on FLINK-10205, the current logic is that when a task fails over, its input splits will not necessarily be consumed by the same ExecutionVertex.


We should at least make sure that different attempts of the same task get the same inputs.


So, I will change this: instead of returning the InputSplits to the assigner, each ExecutionVertex keeps the List<InputSplit> it has been assigned, and each execution only records the index of the next split it will consume from that list. After a failover, the index is simply reset to 0, so the new attempt replays exactly the same splits; this also works when multiple executions run at the same time. A figure should be added here to illustrate this process; a rough sketch of the bookkeeping follows below.
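
A minimal, hypothetical sketch of this bookkeeping (the class and method names are illustrative, not Flink's actual implementation; only InputSplit is Flink's existing interface): the ExecutionVertex owns the list of assigned splits, each execution attempt only tracks an index into that list, and a failed-over or concurrent attempt starts again from index 0, so it consumes exactly the same inputs.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.flink.core.io.InputSplit;

class VertexInputSplits {

    // Splits assigned to this ExecutionVertex, in assignment order.
    private final List<InputSplit> assignedSplits = new ArrayList<>();

    // How many splits each execution attempt has already consumed.
    private final Map<Integer, Integer> nextIndexPerAttempt = new HashMap<>();

    /** Called when the assigner hands a brand-new split to this vertex. */
    void addAssignedSplit(InputSplit split) {
        assignedSplits.add(split);
    }

    /** Next split for the given attempt; null means the list is exhausted. */
    InputSplit getNextSplit(int attemptNumber) {
        int next = nextIndexPerAttempt.getOrDefault(attemptNumber, 0);
        if (next >= assignedSplits.size()) {
            return null; // caller may then request a new split from the assigner
        }
        nextIndexPerAttempt.put(attemptNumber, next + 1);
        return assignedSplits.get(next);
    }

    /** On failover (or a new speculative attempt) start again from index 0. */
    void registerNewAttempt(int attemptNumber) {
        nextIndexPerAttempt.put(attemptNumber, 0);
    }
}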


Manage intermediate ResultPartitions

...

A figure should be added here that also covers the failover case. I think speculative execution means two executions of the same ExecutionVertex running at the same time, while failover means two executions running at different times. Based on this, I think this feature (speculative execution) is theoretically achievable. I have implemented speculative execution for batch jobs based on Blink, and it had a significant effect in our production cluster.


If a task reads from a blocking result partition and its input is not available, we can 'revoke' the producer task: mark it as failed and rerun the upstream task to regenerate the data.

In certain scenarios the producer's data is transferred in blocking mode or saved in a persistent store. If the partition is missing, we need to revoke/rerun the producer task to regenerate the data, as sketched below.
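
A hypothetical sketch of this revoke-and-rerun reaction (the Scheduler interface and all names below are placeholders, not the real Flink scheduler API): when a consumer reports that a blocking partition is missing, the producer is failed so that it is rescheduled and the data is regenerated, and the consumer is re-deployed afterwards.

class PartitionRevoker {

    /** Placeholder abstraction over the scheduler actions we need. */
    interface Scheduler {
        void failProducerAndReschedule(String resultPartitionId);
        void restartConsumer(String consumerTaskId);
    }

    private final Scheduler scheduler;

    PartitionRevoker(Scheduler scheduler) {
        this.scheduler = scheduler;
    }

    /** Called when a consumer reports that a blocking partition is missing. */
    void onPartitionNotFound(String consumerTaskId, String missingPartitionId) {
        // 1. 'Revoke' the producer: fail it so the data is produced again.
        scheduler.failProducerAndReschedule(missingPartitionId);
        // 2. Re-deploy the consumer once the regenerated partition is ready.
        scheduler.restartConsumer(consumerTaskId);
    }
}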

Manage sink files

For batch sink files, each attempt appends a globally unique suffix to the files it writes. When the whole job has finished, the master node renames or deletes the files depending on the outcome of each case (the winning attempt's file is renamed to its final name, leftover files from failed or redundant attempts are deleted), as sketched below.
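
A minimal sketch of this commit step, assuming Flink's FileSystem/Path abstractions; the class, the "globally unique suffix" naming scheme, and the commit/discard helpers are illustrative assumptions, not an existing Flink API.

import java.io.IOException;

import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.Path;

class BatchSinkFileCommitter {

    /** The temporary path a single attempt writes to: "<target>.<suffix>". */
    static Path attemptPath(Path finalPath, String globallyUniqueSuffix) {
        return new Path(finalPath.getParent(),
                finalPath.getName() + "." + globallyUniqueSuffix);
    }

    /** On the master, after the job finished: rename the winning attempt's file. */
    static void commit(Path finalPath, String winningSuffix) throws IOException {
        FileSystem fs = finalPath.getFileSystem();
        fs.rename(attemptPath(finalPath, winningSuffix), finalPath);
    }

    /** On the master: delete the file left behind by a failed or redundant attempt. */
    static void discard(Path finalPath, String staleSuffix) throws IOException {
        FileSystem fs = finalPath.getFileSystem();
        fs.delete(attemptPath(finalPath, staleSuffix), false);
    }
}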

...