You are viewing an old version of this page. View the current version.

Compare with Current View Page History

Version 1 Next »

Summary and Motivation

Thread interleaving hinders lots of race and locking issues that we are observing today, as a result of the coupling behaviors between the polling thread and the heartbeat thread.  Here we propose a redesign of the consumer threading model, with the goal of simplifying the logic of how different tasks and events are handled.

In the new design, we will decouple the responsibilities of the polling thread and the background thread.  In particular, we propose making the polling thread responsible for tasks such as API calls, data serialization and deserialization, and making the background thread responsible for network IO, consumer group management and heartbeat.

Definitions

Polling Thread

The main user thread.  It is called polling thread because it is where the thread user invokes the poll() operation.

Background Thread

The thread that is running in the background upon initialization of the KafkaConsumer.

Heartbeat Thread

The background thread in the current implementation (see KIP-62).

KafkaServerEvent

Events consumed by the background thread, which are then being sent to Kafka Server.  Likewise, KafkaServerEventQueue is the channel that transports these events to the background thread.

KafkaConsumerEvent

Events consumed executed on the consumer.  KafkaConsumerEventQueue indicates the channel that transports consumer events. Example: notifyRebalance

Proposed Design

A few changes to highlight here:

  1. Polling thread and background thread own a distinctive set of objects, which means, they will be responsible for different set of tasks
  2. Polling thread and Background thread communicate via channelsin form of events.
  3. Group membership and coordinator state will be maintained by the background thread
  4. Group membership is managed by a state machine
  5. The public API remains unchanged
  6. We will respect the current contract, in particular
    1. rebalance callback will still be executed by the polling thread

  • No labels