...
Current state: "Under Discussion"
Discussion thread: https://lists.apache.org/thread/4p3xcf0gg4py61hsnydvwpns07d1nog7
JIRA:
...
As mentioned in FLINK-14396, as long as there is at least one available buffer in LocalBufferPool, the RecordWriter is available for network output in most cases. That change therefore only covers the scenario where processing a single record needs just one buffer. Under severe back pressure, if processing a single record requires multiple output buffers, the Task may still block in requestMemory, which prevents the Checkpoint from completing quickly. For example:
- Big record which might span multiple buffers
- FlatMap-like operators, which might emit multiple records each time they process a record
- Broadcast watermark which might request multiple buffers at a time
In this FLIP, we propose adding an overdraft buffer to reduce the probability of the Task being blocked in requestMemory when multiple output buffers are required to process a single record.
Overdraft buffer mechanism: When LocalBufferPool#requestMemory is called and the LocalBufferPool does not have enough buffers, the LocalBufferPool allows the Task to overdraw some MemorySegments, and the LocalBufferPool becomes unavailable. It cannot become available again until all the overdraft buffers have been consumed by downstream tasks and recycled by the LocalBufferPool.
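The mechanism described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not Flink's LocalBufferPool: the class name OverdraftPoolSketch, the maxOverdraft limit, the use of byte[] in place of MemorySegment, and returning null instead of blocking are all hypothetical simplifications chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a buffer pool with overdraft. Hypothetical; not Flink's LocalBufferPool. */
final class OverdraftPoolSketch {
    private final Deque<byte[]> free = new ArrayDeque<>(); // regular pooled segments
    private final int maxOverdraft;  // assumed upper bound on overdrawn segments
    private final int segmentSize;
    private int overdrawn = 0;       // overdraft segments currently in flight

    OverdraftPoolSketch(int poolSize, int maxOverdraft, int segmentSize) {
        this.maxOverdraft = maxOverdraft;
        this.segmentSize = segmentSize;
        for (int i = 0; i < poolSize; i++) {
            free.add(new byte[segmentSize]);
        }
    }

    /**
     * Hands out a pooled segment if one is free, otherwise overdraws.
     * Returns null once the overdraft budget is also used up; a real
     * task would block at this point instead.
     */
    byte[] requestMemory() {
        byte[] segment = free.poll();
        if (segment != null) {
            return segment;
        }
        if (overdrawn < maxOverdraft) {
            overdrawn++;
            return new byte[segmentSize];
        }
        return null;
    }

    /** Recycled segments first repay the overdraft, then refill the pool. */
    void recycle(byte[] segment) {
        if (overdrawn > 0) {
            overdrawn--; // the surplus segment is released, not pooled
        } else {
            free.add(segment);
        }
    }

    /** Available only when nothing is overdrawn and a pooled segment is free. */
    boolean isAvailable() {
        return overdrawn == 0 && !free.isEmpty();
    }

    public static void main(String[] args) {
        // One pooled segment, up to two overdraft segments, 16 bytes each.
        OverdraftPoolSketch pool = new OverdraftPoolSketch(1, 2, 16);
        // A record spanning three buffers: one pooled, two overdrawn.
        byte[] a = pool.requestMemory();
        byte[] b = pool.requestMemory();
        byte[] c = pool.requestMemory();
        System.out.println("available while overdrawn: " + pool.isAvailable());
        pool.recycle(c);
        pool.recycle(b);
        pool.recycle(a);
        System.out.println("available after recycling: " + pool.isAvailable());
    }
}
```

In this toy model the task can finish writing a three-buffer record without blocking even though only one pooled segment exists, while the pool reports itself as unavailable until every overdrawn segment has been recycled, which is the behavior the paragraph above describes.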
Public Interfaces
Proposed Changes
...