Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Discussion thread: https://lists.apache.org/thread/pt2b1f17x2l5rlvggwxs6m265lo4ly7p

JIRA: here (<- link to https://issues.apache.org/jira/browse/FLINK-XXXX)

Jira
serverASF JIRA
serverId5aa69414-a9e9-3523-82ec-879b028fb15b
keyFLINK-25636

Released: 1.15

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

...

This FLIP only changes the default value of 4 config options related to blocking shuffle, including taskmanager.network.blocking-shuffle.compression.enabled, taskmanager.network.sort-shuffle.min-parallelism, taskmanager.memory.framework.off-heap.batch-shuffle.size, taskmanager.network.sort-shuffle.min-buffers. No new public interface will be added and pipeline streaming mode is not influenced.

Proposed Changes

KeyCurrent ValueTarget ValueReason
taskmanager.network.blocking-shuffle.compression.enabledfalsetrueUsually, data compression can reduce both disk and network IO which is good for performance. At the same time, it can save storage space.
taskmanager.network.sort-shuffle.min-parallelismInteger.MAX_VALUE1For high parallelism batch jobs, sort-shuffle is better for both stability and performance. (We tested setting this value to 1, 128, 256, 512 and 1024 with TPC-DS and the result showed that 1 is the best one.)
taskmanager.memory.framework.off-heap.batch-shuffle.size32m64mAside from the improvement in FLINK-24954, increasing this value can also help to solve the read buffer request issue. (Previously, when choosing the default value, both ‘32M' and '64M' are OK for tests and we chose the smaller one in a cautious way.)
taskmanager.network.sort-shuffle.min-buffers64512The current default value is quite modest and the performance can be influenced especially after we enable sort shuffle for large parallelism batch jobs by default. 

Compatibility, Deprecation, and Migration Plan

...