You are viewing an old version of this page. View the current version.

Compare with Current View Page History

Version 1 Next »

Status

Current stateUnder Discussion

Discussion thread: here (<- link to https://mail-archives.apache.org/mod_mbox/flink-dev/)

JIRA:  Unable to render Jira issues macro, execution error.

Released: <Flink Version>

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

The pluggable shuffle framework was Introduced to Flink by FLIP-31 and based on it we Implement our own remote shuffle service for Flink. During the implementation process, we found that the current interface incurs some limitations which can influence the usability of the 

Public Interfaces

ShuffleMaster: the methods start, close, registerJob, unregisterJob are newly added, the signature of registerPartitionWithProducer is changed (adding a JobID). Other methods are kept unchanged.


JobShuffleContext: it is newly added interface which contains the job level context which can provide some basic capabilities and proxy to other components of the Flink cluster like JobMaster.

public interface JobShuffleContext {

/** Returns the corresponding {@link JobID}. */
JobID getJobID();

/**
* Stops tracking the target result partitions, which means these partitions will be removed and
* will be reproduced if used afterwards.
*/
CompletableFuture<?> stopTrackingPartitions(Collection<ResultPartitionID> partitionIDS);
}

ShuffleMasterContext:


ShuffleServiceFactory:

/**
* Job level shuffle context which can offer some job information like job ID and through it, the
* shuffle plugin can stop tracking the lost result partitions.
*/
public interface JobShuffleContext {

/** Returns the corresponding {@link JobID}. */
JobID getJobID();

/**
* Stops tracking the target result partitions, which means these partitions will be removed and
* will be reproduced if used afterwards.
*/
CompletableFuture<?> stopTrackingPartitions(Collection<ResultPartitionID> partitionIDS);
}

Proposed Changes

There are four main parts of the proposed changes:


Compatibility, Deprecation, and Migration Plan

This change to the ShuffleMaster#registerPartitionWithProducer and 

Test Plan

The proposed changes will be tested by a real Flink shuffle plugin based on it with test jobs like TPC-DS benchmark suit.

Rejected Alternatives

No rejected alternatives.

Future Works

This FLIP only contains some minimal changes needed toward the final . More opti

  • No labels