...
- Initialize all ExecutionJobVertex whose parallelism has been decided. We can obtain the initialization information from the replayed events (ExecutionJobVertexInitializedEvent).
- According to the information in JobMasterPartitionTracker, the execution vertices whose produced partitions are all tracked will be marked as finished.
- For execution vertices that are not marked as finished, as mentioned above, if its corresponding job vertex has operator coordinators, we need to call subtaskReset for them.
- Find all sink/leaf execution vertices in ExecutionGraph. For each sink/leaf execution vertex in the non-finish state, recursively find all its upstream vertices that need to be restarted (which are in unfinished state), and then start scheduling based on this.
interface ShuffleMaster<T extends ShuffleDescriptor> extends AutoCloseable { //… other methods /** * Get all partitions and their metrics, the metrics mainly includes the meta information |
of partition(partition bytes, etc). * @param jobId ID of the target job * @return All partitions belongs to the target job and their metrics */ Collection<PartitionWithMetrics> getAllPartitionWithMetrics(JobID jobId); interface PartitionWithMetrics { ShuffleMetrics getPartitionMetrics(); ShuffleDescriptor getPartition(); } interface ShuffleMetrics { ResultPartitionBytes getPartitionBytes(); } } |
Compatibility, Deprecation, and Migration Plan
...