You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 2 Next »

This page is meant as a template for writing a KIP. To create a KIP choose Tools->Copy on this page and modify with your content and replace the heading with the next KIP number and a description of your issue. Replace anything in italics with your own description.

Status

Current state: Under Discussion

Discussion thread: here

JIRA: Unable to render Jira issues macro, execution error.

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Among the out-of-the-box consumer group partition assignment strategies, Sticky Assignor is the only one that is stateful. This is because in this assignment strategy one main goal is to preserve a consumer's assigned partitions as much as possible through rebalances; and for that each consumer reports its current assignment state to the leader during a rebalance. The leader then starts with the reported current assignments from all consumers in the group, and generates their new assignments with the goal of fairness and stickiness, in that particular order. According to the issue reported in KAFKA-7026 and the discussion that followed in the corresponding PR, there are rare cases where the current implementation could lead to assigning the same partition to multiple consumers, breaking the consumer group guarantee. Examples of how it breaks the guarantee exist in the PR discussion, but to summarize, if a consumer briefly leaves the group and then rejoin the group (thinking it has held onto its previous assignment while in reality its partitions were redistributed to other consumers as a result of it leaving the group), it will report its previous assignment to the leader and the leader would not have a way of detecting that the reported assignment is no longer valid. This reported stale assignment can easily conflict with reported (valid) assignments of other consumers and there will be a chance that the same partition will stay assigned to the consumers that reported it as part of their assignment, leading to a duplication. Note that this scenario would be a rare, and one way of reproducing it is running a consumer in debug mode and pausing it for long enough that causes a rebalance.

Public Interfaces

This is the existing user data protocol of Sticky Assignor.

User Data (Version: 0) => [previous_assignment]
  previous_assignment => topic [partitions]
    topic => STRING
    partitions => partition
      partition => INT32

Following each rebalance, the group leader sends each consumer a list of <topic, partitions> as its current partition assignment. The consumer sends that list back to the leader when the next rebalance starts.

Proposed Changes

In order to have an indication of how fresh a reported assignment by each consumer is the following protocol is suggested.

User Data (Version: 1) => [previous_assignment] generation
  previous_assignment => topic [partitions]
    topic => STRING
    partitions => partition
      partition => INT32
  generation => INT32

The only addition is a numeric generation marker that increases during each improvement. This lets the leader detect the highest generation and ignore a consumer's reported assignment (if necessary) when that assignment belongs to older generations.

Generation Marker

There are two methods for implementing this generation marker:

  1. A Sticky Assignor specific generation marker with the scope limited to Sticky Assignor (on the client side) only. This would mean that the generation marker of a sticky assignment may occasionally reset to 0 if no consumer in the group reports a previous assignment. But that does not jeopardize the logic presented above in resolving the scenarios that could lead to duplicate assignments. The upside of this method is the solution has limited scope and is specific to Sticky Assignor and other assignors or the AbstractPartitionAssignor class will not be impacted.
  2. The consumer group's generation number can also be used for this purpose. The upside of this approach is that the assignment generation marker will be in-sync with consumer group's generation number. Also, the group's generation number, once available in the assignors, could later be used by other stateful partition assignors. The downside is added implementation complexity in exposing the consumer group's generation number to the Sticky Assignor which would potentially impact all existing assignor classes.  

One of these two methods will be used in the implementation.

Compatibility, Deprecation, and Migration Plan

The new protocol can be implemented in a way that supports backward and forward compatibility.

  • Backward compatibility: If the leader uses the new protocol it first tries to deserialize reported assignments of consumers using the new protocol. If that does not work it falls back to using the old protocol assuming a default (e.g. -1) generation number for the assignment.
  • Forward compatibility: If the leader uses the old protocol it deserializes only the array portion of the reported assignment byte array ignoring the potential integer that could be following. Generation markers reported by new clients will all be ignored in this case.

Rejected Alternatives

N/A.

  • No labels