Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Describe the problems you are trying to solve.

Public Interfaces

Briefly list any new interfaces that will be introduced as part of this proposal or any existing interfaces that will be removed or changed. The purpose of this section is to concisely call out the public contract that will come along with this feature.

A public interface is any change to the following:

  • Binary log format

  • The network protocol and api behavior

  • Any class in the public packages under clientsConfiguration, especially client configuration

    • org/apache/kafka/common/serialization

    • org/apache/kafka/common

    • org/apache/kafka/common/errors

    • org/apache/kafka/clients/producer

    • org/apache/kafka/clients/consumer (eventually, once stable)

  • Monitoring

  • Command line tools and arguments

  • Anything else that will likely break existing users in some way when they upgrade

Proposed Changes

This KIP documents the updated threading model of the Consumer implementation of the client.

The complexity of the consumer has increased and with it the code to support and fix bugs. Patches and hotfixes in the past years have heavily impacted the readability of the code.  The complex code path and intertwined logic make the code difficult to modify and comprehend. Additionally, logic is at times executed on application threads and at other times on the dedicated, internal heartbeat thread. The asynchronous nature of the current implementation has lead to many bugs (which are labeled with the new-consumer-threading-should-fix label. The motivation is to simplify the structure of the code by clearly defining—and removing, where possible—the asynchronous code.

The simplification will also allow us to implement the necessary primitives for KIP-848.

Public Interfaces

This KIP has the explicit goal of making no changes to the public interfaces. The protocol, configuration, APIs, etc. will remain as they currently are. The internal behavior of the consumer is substantially changing and we want to ensure it is reviewed and vetted by the community.

Proposed Changes

Terminology

To help understand the design, we need to introduce some terminology. Terms designated with 1 apply to the current KafkaConsumer implementation and terms with 2 apply to the new implementation; a term may apply to both.

  • Application event processor2: Processes the events on the background thread, interacting with the request managers.
  • Application thread1, 2: The thread that is executing the user's code that interacts with the Consumer API. Per the current implementation in KafkaConsumer, only one thread may call APIs at a time.
  • Background thread2: The thread created for each Consumer that is dedicated to executing events on the event handler.  
  • Event2: A data structure specific to each Consumer API call that encapsulates the relevant data. For example, a seek event would include topic information and an offset. 
  • Event handler2: Logic that pulls events from the event queue for processing on the background thread.
  • Event queue2: Queue of events that is shared between the application thread and the background thread, serving as a conduit between the two.
  • Heartbeat thread1: Thread in the existing KafkaConsumer implementation that communicates liveness to the brokers, processes fetch results, etc. Introduced in KIP-62.
  • Network client delegate2: TBD
  • Request manager2: Internal interface that is used by the background thread to handle the management of requested, inflight, and responded network I/O.

Threading Model

<TBD>

Background thread

<TBD>

Providing Data to the Background Thread

<TBD>

Getting Data from the Background Thread

<TBD>

Network I/O

<TBD>Describe the new thing you want to do in appropriate detail. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences. Use judgement based on the scope of the change.

Compatibility, Deprecation, and Migration Plan

...