Status
...
Page properties | |
---|---|
|
...
...
...
JIRA: here (<- link to https://issues.apache.org/jira/browse/FLINK-XXXX)
...
|
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
...
Currently, one Flink cluster can only use one shuffle service plugin configured by 'shuffle-service-factory.class'. This is not a problem for per-job cluster, but for session cluster, this is not flexible enough and cannot support use cases like selecting different shuffle service for different workloads (e.g. batch vs. streaming). As discussed in https://lists.apache.org/thread/k4owttq9q3cq4knoobrzc31bghf7vc0o, the community has considered to implement this feature when introducing the pluggable shuffle service abstraction, there are also some relevant followup issues:
Jira | ||||||
---|---|---|---|---|---|---|
|
Public Interfaces
To Implement this feature, we need to support job level shuffle service configuration. Currently, only cluster level shuffle service configuration is supported by configuring 'shuffle-service-factory.class'. If a user first starts a Flink cluster and then he/she tries to change the shuffle service by reconfiguring 'shuffle-service-factory.class', he/she will fail because per-job level configuration will not passed to runtime so does not take effect currently.
...
4. Share the same network memory pool (NetworkBufferPool) among different shuffle services. It is compatible with the Flink memory model which treats the network memory as one dimension of resource. In fact, we need a public NetworkMemoryProvider interface to expose the network resource to shuffle service plugin. We can leave this together with other user interface relevant work (see
) as future work. Jira server ASF JIRA serverId 5aa69414-a9e9-3523-82ec-879b028fb15b key FLINK-12873
Here is a PoC implementation: https://github.com/wsry/flink/commit/bf6b56b1cd6591e7b3fad6fddd3f701a06c64868.
Compatibility, Deprecation, and Migration Plan
1. If not configured as default, the shuffle service must be implemented as Flink plugin and which can be loaded by Flink PluginManager.
2. If a custom shuffle service plugin already creates the NetworkBufferPool (not a public interface) instance itself, it must switch to use the shared NetworkBufferPool.
...