Status

Current state: Accepted

Discussion thread: [DISCUSS] KIP-281: ConsumerPerformance: Increase Polling Loop Timeout and Make It Reachable by the End User

Vote thread: [VOTE] KIP-281: ConsumerPerformance: Increase Polling Loop Timeout and Make It Reachable by the End User

Vote results: 

JIRA

PULL REQUEST: https://github.com/apache/kafka/pull/4818

Motivation

ConsumerPerformance fails to consume all messages on topics with a large number of partitions because its default polling loop timeout (1000 ms) is relatively short and cannot be reached or modified by the end user.

Demo: Create a topic with 10 000 partitions, send 50 000 000 records of 100 bytes each using kafka-producer-perf-test, and consume them using kafka-consumer-perf-test (ConsumerPerformance). You will likely notice that the number of records returned by kafka-consumer-perf-test is many times smaller than the expected 50 000 000.

As a result, in some rough cases, processing (iterating through) the records polled in batches may take long enough to exceed the default hardcoded polling loop timeout (1 second), forcing the utility to stop its execution much earlier than it should.
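The effect can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the actual ConsumerPerformance code: the class PollingLoopSketch, its methods, and the simulated clock are hypothetical, and the loop simply stops once no records have arrived for longer than the timeout.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical model of a polling loop with an idle timeout (not the real tool's code). */
final class PollingLoopSketch {

    /** Simulated clock: returns 0, stepMs, 2*stepMs, ... on successive calls. */
    static LongSupplier steppingClock(long stepMs) {
        return new LongSupplier() {
            private long t = 0;
            @Override
            public long getAsLong() {
                long value = t;
                t += stepMs;
                return value;
            }
        };
    }

    /**
     * Reads simulated batches until `expected` records were read, the batches
     * run out, or more than `timeoutMs` passed since the last non-empty batch.
     * Each element of `batches` is the record count returned by one poll.
     */
    static long consume(Iterator<Integer> batches, LongSupplier clock,
                        long expected, long timeoutMs) {
        long read = 0;
        long lastConsumed = clock.getAsLong();
        long now = lastConsumed;
        while (read < expected && now - lastConsumed <= timeoutMs && batches.hasNext()) {
            int batch = batches.next();
            now = clock.getAsLong();
            if (batch > 0) {
                lastConsumed = now; // only non-empty polls reset the idle timer
            }
            read += batch;
        }
        return read;
    }

    public static void main(String[] args) {
        // Made-up input: two empty polls in the middle, each poll costing 600 ms.
        List<Integer> batches = List.of(100, 0, 0, 100, 100);
        System.out.println(consume(batches.iterator(), steppingClock(600), 300, 1000));
        System.out.println(consume(batches.iterator(), steppingClock(600), 300, 10000));
    }
}
```

With this made-up input, the run with a 1000 ms timeout returns 100 of the 300 expected records, while the run with a 10000 ms timeout returns all 300.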

This premature exit leads to confusion and to poor, unexpected test results, which is probably not what we want. Hence the following change.

Public Interfaces

Provide the command line tool kafka-consumer-perf-test.sh with an optional --timeout parameter:

parameter name | defaults to | type | time unit | usage
--timeout | 10000 | Long | ms | optional
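The semantics of the table (optional, Long, milliseconds, default 10000) can be sketched as follows. This is a hand-rolled illustration only: the class TimeoutOptionSketch and its method are hypothetical, and the actual option parsing is the one in the PR.

```java
/** Hypothetical illustration of an optional --timeout option with a default value. */
final class TimeoutOptionSketch {

    /** Default from the table above, in milliseconds. */
    static final long DEFAULT_TIMEOUT_MS = 10_000L;

    /**
     * Returns the value following "--timeout" if present, else the default.
     * A trailing "--timeout" with no value is ignored in this sketch.
     */
    static long parseTimeoutMs(String[] args) {
        for (int i = 0; i < args.length - 1; i++) {
            if ("--timeout".equals(args[i])) {
                return Long.parseLong(args[i + 1]);
            }
        }
        return DEFAULT_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        System.out.println(parseTimeoutMs(args));
    }
}
```

Under this sketch, omitting the option keeps the 10000 ms default, so callers that never pass --timeout need no changes.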

Proposed Changes

The proposed code changes are available in this PR: https://github.com/apache/kafka/pull/4818

Compatibility, Deprecation, and Migration Plan

Rejected Alternatives

None.