Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Current stateUnder Discussion

Discussion threadhere (<- link to https://mail-archiveslists.apache.org/thread.html/r11750db945277d944f408eaebbbdc9d595d587fcfb67b015c716404e%40%3Cdev.flink.apache.org%3Emod_mbox/flink-dev/)

JIRA

Jira
serverASF JIRA
serverId5aa69414-a9e9-3523-82ec-879b028fb15b
keyFLINK-19582
 
Jira
serverASF JIRA
serverId5aa69414-a9e9-3523-82ec-879b028fb15b
keyFLINK-19614

...

Several new config options will be added to control the behavior of the sort-merge based blocking shuffle and by disable sort-merge based blocking shuffle by default, the default behavior of blocking shuffle stays unchanged.

Config OptionDescription
taskmanager.network.sort-merge-blocking-shuffle.max-files-per-partitionThe maximum number of files can be produced by each sort-merge blocking partition, files over this threshold will be merged.
taskmanager.network.sort-merge-blocking-shuffle.buffers-per-partitionNumber of network buffers required for each sort-merge blocking result partition. Larger value can reduce the number of shuffle files and bring better performance.
taskmanager.network.sort-merge-blocking-shuffle.min-parallelismFor small parallelism, hash-based blocking shuffle will be used and for large parallelism, sort-merge based blocking shuffle will be used.

A fixed number of network buffers per result partition makes the memory consumption decoupled with parallelism which is more friendly for large scale batch jobs.

...

The default behavior of Flink stays unchanged. Nothing need to do when migrating to new Flink version.

Appendix

...