Status

Current state: Under Discussion

Discussion thread: link

JIRA: KAFKA-2273

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

 

Motivation

In certain circumstances the round robin assignor, which generally produces better assignments than the range assignor, fails to produce an optimal and balanced assignment of topic partitions to consumers. KIP-49 touches on some of these circumstances. In addition, when a reassignment occurs, none of the existing strategies considers the topic partition assignments that were in place before the reassignment; each behaves as if it were performing a fresh assignment. Preserving the existing assignments could reduce some of the overhead of a reassignment. For example, Kafka consumers retain pre-fetched messages for partitions assigned to them before a reassignment. Therefore, preserving the partition assignment could reduce the number of messages delivered to consumers after a reassignment.

 

Public Interfaces

This assignment strategy, which is implemented for the new consumer, would add a StickyAssignor class. The class could be selected by setting the consumer property partition.assignment.strategy to org.apache.kafka.clients.consumer.StickyAssignor. The default value of this consumer property would not change.
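
As a minimal sketch of how a client might opt in, assuming the class is added under the name given above, the snippet below builds consumer properties with the standard java.util.Properties API. The helper class StickyAssignorConfig and its parameters are hypothetical names used only for illustration; bootstrap.servers and group.id are ordinary consumer settings and not part of this proposal.

```java
import java.util.Properties;

// Hypothetical helper for illustration; only the property key and the
// assignor class name are taken from this proposal.
final class StickyAssignorConfig {

    static Properties consumerProperties(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Opt in to the proposed strategy; the default value is left untouched
        // for consumers that do not set this property.
        props.setProperty("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.StickyAssignor");
        return props;
    }
}
```

The resulting Properties object would be passed to the consumer constructor in the usual way.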

 

Proposed Changes

Add a Sticky Assignor option to the potential assignment strategies of the new consumer. The Sticky Assignor serves two purposes.

First, it guarantees an assignment that is as balanced as possible, meaning either:

  • the numbers of topic partitions assigned to consumers differ by at most one; or
  • if a consumer A has 2+ fewer topic partitions assigned to it compared to another consumer B, none of the topic partitions assigned to B can be assigned to A.
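
The two conditions above can be expressed as a single check, sketched here under some assumptions: partitions are named as in the examples on this page (for instance t1p0 belongs to topic t1), and "can be assigned to A" is read as "A is subscribed to the partition's topic". The class and method names are hypothetical and not part of the proposed interface.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not part of the proposed public interface.
final class BalanceCheck {

    // assignment: consumer -> assigned partitions (names like "t1p0").
    // subscriptions: consumer -> topics the consumer is subscribed to.
    static boolean isBalanced(Map<String, List<String>> assignment,
                              Map<String, Set<String>> subscriptions) {
        for (Map.Entry<String, List<String>> a : assignment.entrySet()) {
            for (Map.Entry<String, List<String>> b : assignment.entrySet()) {
                // Only pairs where A has two or more fewer partitions than B matter.
                if (b.getValue().size() - a.getValue().size() < 2) {
                    continue;
                }
                Set<String> topicsOfA = subscriptions.getOrDefault(a.getKey(), Set.of());
                for (String partition : b.getValue()) {
                    // B holds a partition that A could take: not balanced.
                    if (topicsOfA.contains(topicOf(partition))) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    // Assumes the "<topic>p<index>" naming used in the examples.
    static String topicOf(String partition) {
        return partition.substring(0, partition.lastIndexOf('p'));
    }
}
```

With the subscriptions of Example 2, this check rejects the round robin result (C1 holds three fewer partitions than C2 and is subscribed to t1, one of whose partitions C2 holds) and accepts the sticky result.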

When starting a fresh assignment, the Sticky Assignor would distribute the partitions over consumers as evenly as possible. Even though this may sound similar to how the round robin assignor works, the second example below shows that it can result in a more balanced assignment.

Second, during a reassignment the Sticky Assignor would perform the reassignment in such a way that in the new assignment,

  1. topic partitions are still distributed as evenly as possible, and
  2. topic partitions stay with their previously assigned consumers as much as possible.

Of course, the first goal above takes precedence over the second one. This means it is possible that a few topic partitions cannot remain assigned to the same consumer and have to switch to another consumer in order to guarantee the most balanced assignment possible.

With the Sticky Assignor, the reassignment is performed by

  1. preserving all the existing partition assignments
  2. removing all the partition assignments that have become invalid due to the change that triggers the reassignment
  3. assigning the unassigned partitions in a way that balances out the overall assignments of partitions to consumers
  4. further balancing out the resulting assignment by finding the partitions that can be reassigned to another consumer towards an overall more balanced assignment.
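
The four steps above can be sketched as a simple greedy procedure. This is an illustration under stated assumptions and not the algorithm of the actual implementation: partitions follow the naming used in the examples (t1p0 belongs to topic t1), ties go to the consumer whose name sorts first, and step 4 only moves a partition directly from a more loaded consumer to an eligible one holding at least two fewer partitions, which may miss improvements that need a chain of moves. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration class; not part of the proposed public interface.
final class StickySketch {

    // previous: consumer -> partitions before the change (names like "t1p0").
    // subscriptions: remaining consumers -> topics they are subscribed to.
    // partitions: all partitions that currently exist.
    static Map<String, List<String>> reassign(Map<String, List<String>> previous,
                                              Map<String, Set<String>> subscriptions,
                                              List<String> partitions) {
        Map<String, List<String>> next = new TreeMap<>();
        Set<String> assigned = new HashSet<>();

        // Steps 1 and 2: keep every previous assignment that is still valid.
        for (Map.Entry<String, Set<String>> sub : subscriptions.entrySet()) {
            List<String> kept = new ArrayList<>();
            for (String p : previous.getOrDefault(sub.getKey(), List.of())) {
                boolean valid = partitions.contains(p) && sub.getValue().contains(topicOf(p));
                if (valid && assigned.add(p)) {
                    kept.add(p);
                }
            }
            next.put(sub.getKey(), kept);
        }

        // Step 3: give each unassigned partition to the least loaded eligible consumer.
        for (String p : partitions) {
            if (assigned.contains(p)) {
                continue;
            }
            String target = leastLoaded(next, subscriptions, p);
            if (target != null) {
                next.get(target).add(p);
                assigned.add(p);
            }
        }

        // Step 4: move partitions while some eligible consumer has 2+ fewer.
        // Each move lowers the sum of squared loads, so the loop terminates.
        boolean moved = true;
        while (moved) {
            moved = false;
            for (String from : next.keySet()) {
                for (String p : new ArrayList<>(next.get(from))) {
                    String to = leastLoaded(next, subscriptions, p);
                    if (to != null && next.get(from).size() - next.get(to).size() >= 2) {
                        next.get(from).remove(p);
                        next.get(to).add(p);
                        moved = true;
                    }
                }
            }
        }
        return next;
    }

    // Least loaded consumer subscribed to the partition's topic, or null if none.
    static String leastLoaded(Map<String, List<String>> next,
                              Map<String, Set<String>> subscriptions, String partition) {
        String best = null;
        for (Map.Entry<String, List<String>> e : next.entrySet()) {
            if (!subscriptions.get(e.getKey()).contains(topicOf(partition))) {
                continue;
            }
            if (best == null || e.getValue().size() < next.get(best).size()) {
                best = e.getKey();
            }
        }
        return best;
    }

    // Assumes the "<topic>p<index>" naming used in the examples.
    static String topicOf(String partition) {
        return partition.substring(0, partition.lastIndexOf('p'));
    }
}
```

Running this sketch on the inputs of Example 1 after C1 is removed keeps all five surviving assignments and spreads the three orphaned partitions so that C0 and C2 each end up with four.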

 

Example 1

Suppose there are three consumers C0, C1, C2, four topics t0, t1, t2, t3, and each topic has 2 partitions, resulting in partitions t0p0, t0p1, t1p0, t1p1, t2p0, t2p1, t3p0, t3p1. Each consumer is subscribed to all four topics.

 

The assignment with both sticky and round robin assignors results in

Consumer | Assigned Topic Partitions
C0 | t0p0, t1p1, t3p0
C1 | t0p1, t2p0, t3p1
C2 | t1p0, t2p1

 

Now, let's assume that consumer C1 is removed and a reassignment occurs. The round robin assignor would produce

Consumer | Assigned Topic Partitions
C0 | t0p0, t1p0, t2p0, t3p0
C2 | t0p1, t1p1, t2p1, t3p1

 

The sticky assignor, on the other hand, would result in

Consumer | Assigned Topic Partitions
C0 | t0p0, t1p1, t3p0, t2p0
C2 | t1p0, t2p1, t0p1, t3p1

preserving 5 of the previous assignments (unlike the round robin assignor which preserves only 3).
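
The preserved counts quoted here can be reproduced with a small helper, sketched below. It counts the (consumer, partition) pairs that appear both before and after the reassignment; the class and method names are hypothetical and used only for illustration.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the proposed public interface.
final class PreservedCount {

    // Counts partitions that stay with the same consumer across a reassignment.
    static int preserved(Map<String, List<String>> before, Map<String, List<String>> after) {
        int count = 0;
        for (Map.Entry<String, List<String>> entry : after.entrySet()) {
            List<String> old = before.getOrDefault(entry.getKey(), List.of());
            for (String partition : entry.getValue()) {
                if (old.contains(partition)) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

Applied to the tables of this example, the helper returns 3 for the round robin result and 5 for the sticky result.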

 

Example 2

There are three consumers C0, C1, C2, and three topics t0, t1, t2, with 1, 2, and 3 partitions, respectively. Therefore, the partitions are t0p0, t1p0, t1p1, t2p0, t2p1, t2p2. C0 is subscribed to t0; C1 is subscribed to t0, t1; and C2 is subscribed to t0, t1, t2.

 

The round robin assignor would result in the following assignment:

Consumer | Assigned Topic Partitions
C0 | t0p0
C1 | t1p0
C2 | t1p1, t2p0, t2p1, t2p2

which is not as balanced as the assignment produced by the sticky assignor:

Consumer | Assigned Topic Partitions
C0 | t0p0
C1 | t1p0, t1p1
C2 | t2p0, t2p1, t2p2

 

Now, if consumer C0 is removed, these two assignors would produce the following assignments.

Round Robin (preserves 3 partition assignments):

Consumer | Assigned Topic Partitions
C1 | t0p0, t1p1
C2 | t1p0, t2p0, t2p1, t2p2

 

Sticky (preserves 5 partition assignments):

Consumer | Assigned Topic Partitions
C1 | t1p0, t1p1, t0p0
C2 | t2p0, t2p1, t2p2

 

Not only does the sticky assignor preserve more assignments, it also results in a more balanced assignment.

 

Compatibility, Deprecation, and Migration Plan

This proposal would add a new option to the existing assignment strategies of the new consumer. It would not impact any existing functionality.


Rejected Alternatives

N/A.
