Status

Current state: Under Discussion

...

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

The KafkaConsumer API centers on poll(), which is intended to be called in a loop. On every iteration of the loop, poll() returns a batch of records from the partitions this consumer can retrieve from at that time. The number of records returned is capped by max.poll.records, as described in KIP-41: KafkaConsumer Max Records. The current implementation returns available records starting from the partition that the previous poll call last retrieved records from. This leads to unfair patterns of record consumption across multiple partitions.

This proposal discusses a mechanism to mitigate that issue. 

Public Interfaces

No public interface changes are proposed.

Proposed Changes

The issue stems from the greedy consumption of a partition when serving a poll call, as described in the "Ensuring Fair Consumption" section of KIP-41: the partition that served the previous poll call is used again in the next poll call, so the greedy behavior continues against that same partition.
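The greedy behavior described above can be illustrated with a small, self-contained simulation. This is only a toy model under stated assumptions, not the actual KafkaConsumer or Fetcher code: the class GreedyPollDemo, its poll() and buffer() helpers, and the partition names p0 and p1 are all hypothetical, and the model assumes each partition already has records buffered locally and that every poll drains partitions in a fixed order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of greedy draining. Hypothetical; not the real consumer internals. */
final class GreedyPollDemo {

    /** Builds a buffer holding perPartition dummy records for each named partition. */
    static Map<String, Deque<Integer>> buffer(int perPartition, String... partitions) {
        Map<String, Deque<Integer>> buffered = new LinkedHashMap<>();
        for (String partition : partitions) {
            Deque<Integer> records = new ArrayDeque<>();
            for (int offset = 0; offset < perPartition; offset++) {
                records.add(offset);
            }
            buffered.put(partition, records);
        }
        return buffered;
    }

    /**
     * Serves one simulated poll: walks the partitions in insertion order and
     * drains each one greedily until maxPollRecords records have been taken.
     * Returns the partition name of every record handed back, in order.
     */
    static List<String> poll(Map<String, Deque<Integer>> buffered, int maxPollRecords) {
        List<String> served = new ArrayList<>();
        for (Map.Entry<String, Deque<Integer>> entry : buffered.entrySet()) {
            Deque<Integer> records = entry.getValue();
            while (!records.isEmpty() && served.size() < maxPollRecords) {
                records.poll();               // consume one buffered record
                served.add(entry.getKey());   // remember which partition it came from
            }
            if (served.size() == maxPollRecords) {
                break;                        // batch is full; later partitions wait
            }
        }
        return served;
    }

    public static void main(String[] args) {
        // Two partitions with 6 buffered records each, and a cap of 3 records per poll.
        Map<String, Deque<Integer>> buffered = buffer(6, "p0", "p1");
        for (int i = 1; i <= 4; i++) {
            System.out.println("poll " + i + " -> " + poll(buffered, 3));
        }
    }
}
```

In this model the first two polls return only p0 records, and p1 is served only after p0 has been fully drained. A partition that keeps receiving new records would, under the same assumptions, keep being served ahead of the others.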

...