...

  1. ExecutionJobVertexInitializedEvent: This event records the initialization information of an ExecutionJobVertex. It contains the decided parallelism of the job vertex and its input information. The event is triggered and written out when a job vertex is initialized.

  2. ExecutionVertexFinishedEvent: This event records the information of a finished task. Our goal is that no finished task needs to re-run, so the simple idea is to trigger an event whenever a task finishes. The content of this event contains:

    1. The state of the finished task/ExecutionVertex, including IO metrics, accumulators, etc. These contents can be easily obtained from ExecutionGraph.

    2. If the job vertex which this task belongs to has operator coordinators, the states of the operator coordinators also need to be recorded.

In order to obtain the state of operator coordinators, we will enrich the OperatorCoordinator#checkpointCoordinator method to let it accept -1 (NO_CHECKPOINT) as the value of checkpointId, to support snapshotting the state of an operator coordinator in batch jobs. After the JM crashes, the operator coordinator can be restored from the previously recorded state. In addition to a simple restore (via the OperatorCoordinator#resetToCheckpoint method), it also needs to call OperatorCoordinator#subtaskReset for the non-finished tasks (which may have been running before the JM crashed), because these tasks will be reset and re-run after the JM crashes.
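
The snapshot-and-restore flow described above can be sketched as follows. This is only an illustration under stated assumptions: the Coordinator interface is a trimmed-down, hypothetical stand-in that mirrors the three OperatorCoordinator methods named in the text, and ToyCoordinator, snapshot and restore are invented names, not part of Flink.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class CoordinatorRecoverySketch {
    /** Sentinel from the text: a batch job has no real checkpoint id. */
    static final long NO_CHECKPOINT = -1L;

    /** Hypothetical stand-in for the OperatorCoordinator methods named in the text. */
    interface Coordinator {
        void checkpointCoordinator(long checkpointId, CompletableFuture<byte[]> result);
        void resetToCheckpoint(long checkpointId, byte[] checkpointData);
        void subtaskReset(int subtask, long checkpointId);
    }

    /** Toy coordinator (assumption): its whole state is a single string. */
    static final class ToyCoordinator implements Coordinator {
        String state = "";
        final List<Integer> resetSubtasks = new ArrayList<>();

        @Override
        public void checkpointCoordinator(long checkpointId, CompletableFuture<byte[]> result) {
            result.complete(state.getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void resetToCheckpoint(long checkpointId, byte[] checkpointData) {
            state = new String(checkpointData, StandardCharsets.UTF_8);
        }

        @Override
        public void subtaskReset(int subtask, long checkpointId) {
            resetSubtasks.add(subtask);
        }
    }

    /** Taken when a task finishes: snapshot the coordinator with NO_CHECKPOINT. */
    static byte[] snapshot(Coordinator coordinator) {
        CompletableFuture<byte[]> result = new CompletableFuture<>();
        coordinator.checkpointCoordinator(NO_CHECKPOINT, result);
        return result.join();
    }

    /** After a JM crash: restore the recorded state, then reset every non-finished subtask. */
    static void restore(Coordinator coordinator, byte[] recorded, boolean[] finished) {
        coordinator.resetToCheckpoint(NO_CHECKPOINT, recorded);
        for (int subtask = 0; subtask < finished.length; subtask++) {
            if (!finished[subtask]) {
                coordinator.subtaskReset(subtask, NO_CHECKPOINT);
            }
        }
    }

    public static void main(String[] args) {
        ToyCoordinator beforeCrash = new ToyCoordinator();
        beforeCrash.state = "splits-assigned:0,1";
        byte[] recorded = snapshot(beforeCrash);

        ToyCoordinator afterCrash = new ToyCoordinator();
        restore(afterCrash, recorded, new boolean[] {true, false, false});
        System.out.println(afterCrash.state + " reset=" + afterCrash.resetSubtasks);
    }
}
```

In this sketch the recorded bytes would travel inside the ExecutionVertexFinishedEvent; how the real event serializes coordinator state is not specified here.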

...

  1. Initialize all ExecutionJobVertex whose parallelism has been decided. We can obtain the initialization information from the replayed events (ExecutionJobVertexInitializedEvent).
  2. According to the information in JobMasterPartitionTracker, the execution vertices whose produced partitions are all tracked will be marked as finished. 
  3. For execution vertices that are not marked as finished, as mentioned above, if the corresponding job vertex has operator coordinators, we need to call OperatorCoordinator#subtaskReset for them.
  4. Find all sink/leaf execution vertices in ExecutionGraph. For each sink/leaf execution vertex that is not finished, recursively find all its upstream vertices that need to be restarted (those that are not finished), and then start scheduling based on this.
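
Step 4 amounts to a backward traversal from the unfinished sinks. The sketch below is a minimal illustration, not the scheduler's actual code: vertices are plain strings, the upstream map and the finished set are assumed inputs, and the traversal is assumed to stop at finished vertices because their partitions are still available.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RestartSetSketch {
    /**
     * Collects every unfinished vertex reachable by walking upstream from an
     * unfinished sink. Finished vertices are not expanded (assumption: their
     * produced partitions can be reused, so nothing above them must re-run).
     */
    static Set<String> verticesToRestart(
            Map<String, List<String>> upstream, Set<String> sinks, Set<String> finished) {
        Set<String> toRestart = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        for (String sink : sinks) {
            if (!finished.contains(sink)) {
                pending.push(sink);
            }
        }
        while (!pending.isEmpty()) {
            String vertex = pending.pop();
            if (!toRestart.add(vertex)) {
                continue; // already visited through another path
            }
            for (String input : upstream.getOrDefault(vertex, List.of())) {
                if (!finished.contains(input)) {
                    pending.push(input);
                }
            }
        }
        return toRestart;
    }

    public static void main(String[] args) {
        // Hypothetical graph: A -> B -> D and C -> D, with A and C already finished.
        Map<String, List<String>> upstream =
                Map.of("D", List.of("B", "C"), "B", List.of("A"), "C", List.of(), "A", List.of());
        System.out.println(verticesToRestart(upstream, Set.of("D"), Set.of("A", "C")));
    }
}
```

An iterative worklist is used instead of literal recursion so that deep graphs cannot overflow the stack; the resulting set is the same.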

interface ShuffleMaster<T extends ShuffleDescriptor> extends AutoCloseable {

    //… other methods

    /**
     * Gets all partitions of the target job together with their metrics. The metrics
     * mainly include the meta information of each partition (partition bytes, etc.).
     *
     * @param jobId ID of the target job
     * @return All partitions belonging to the target job and their metrics
     */
    Collection<PartitionWithMetrics> getAllPartitionWithMetrics(JobID jobId);

    interface PartitionWithMetrics {

        ShuffleMetrics getPartitionMetrics();

        ShuffleDescriptor getPartition();
    }

    interface ShuffleMetrics {

        ResultPartitionBytes getPartitionBytes();
    }
}
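
One plausible use of the reported metrics is to rebuild per-partition size information after a restart. The sketch below is self-contained and uses a hypothetical ReportedPartition class in place of PartitionWithMetrics; the string partition id and the per-subpartition byte array are assumptions about what the descriptor and ResultPartitionBytes would carry.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PartitionMetricsSketch {
    /** Hypothetical stand-in for PartitionWithMetrics: an id plus per-subpartition byte counts. */
    static final class ReportedPartition {
        final String partitionId;
        final long[] subpartitionBytes;

        ReportedPartition(String partitionId, long[] subpartitionBytes) {
            this.partitionId = partitionId;
            this.subpartitionBytes = subpartitionBytes;
        }
    }

    /** Sums the subpartition bytes of every reported partition, keyed by partition id. */
    static Map<String, Long> totalBytesByPartition(List<ReportedPartition> reported) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (ReportedPartition partition : reported) {
            long sum = 0L;
            for (long bytes : partition.subpartitionBytes) {
                sum += bytes;
            }
            totals.put(partition.partitionId, sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<ReportedPartition> reported = List.of(
                new ReportedPartition("p0", new long[] {10L, 20L, 30L}),
                new ReportedPartition("p1", new long[] {5L}));
        System.out.println(totalBytesByPartition(reported));
    }
}
```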

Compatibility, Deprecation, and Migration Plan

...