Current state: Under Discussion

Discussion thread: TBD

JIRA: https://issues.apache.org/jira/browse/KAFKA-7728

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

This KIP continues the effort of KIP-345 to mitigate unnecessary rebalances and thereby improve consumer performance. Today a consumer group triggers a rebalance under the following circumstances:

  1. A new member joins the group with UNKNOWN_MEMBER_ID
  2. A known member joins with changed metadata
  3. A leader rejoins the group
  4. A current member hits the session timeout or leaves the group

We already aim to address circumstances 1 and 4 through KIP-345, which uses group.instance.id to recognize members as static across restarts and avoids sending a LeaveGroupRequest when the member is under static membership, so that only the session timeout removes expired members. For circumstance 2, a member whose metadata has changed, for example through an assignment protocol change, clearly requires another rebalance to apply that change. Circumstance 3, however, leaves room for improvement: most of the time a leader rejoin is not a valid reason to trigger a rebalance, because the leader instance is merely restarting under static membership. It is beneficial to distinguish whether the leader is rejoining to request a rebalance or only because of a service restart. Specifying the join reason in the request could avoid rebalances entirely during normal consumer bounces.
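The difference between today's behavior and the proposed one can be illustrated with a minimal sketch. It is only an illustration of the reasoning above, not the coordinator implementation: the enum values RESTART and REBALANCE, the class JoinDecision, and both method names are hypothetical and do not come from the Kafka codebase or from this KIP's schema.

```java
// Hypothetical sketch: all names below are assumptions made for illustration.
enum JoinReason {
    RESTART,   // the member bounced and only wants to resume with its old assignment
    REBALANCE  // the member explicitly asks the group to rebalance
}

final class JoinDecision {
    private JoinDecision() {}

    // Today's rule for a join request, simplified: an unknown member
    // (circumstance 1), changed metadata (circumstance 2), or any rejoin by
    // the leader (circumstance 3) moves the group to PrepareRebalance.
    static boolean needsRebalanceToday(boolean knownMember,
                                       boolean metadataChanged,
                                       boolean isLeader) {
        return !knownMember || metadataChanged || isLeader;
    }

    // Possible rule once the request carries a join reason: circumstances 1
    // and 2 are unchanged, but a known member with unchanged metadata, leader
    // or not, triggers a rebalance only when it asks for one.
    static boolean needsRebalanceWithReason(boolean knownMember,
                                            boolean metadataChanged,
                                            JoinReason reason) {
        if (!knownMember || metadataChanged) {
            return true;
        }
        return reason == JoinReason.REBALANCE;
    }
}
```

Under these assumptions, a static leader that restarts with unchanged metadata triggers a rebalance under the first rule but not under the second, which is the case this KIP aims to improve.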

Furthermore, as we promote incremental rebalancing such as KIP-415, we hope later to support stateful consumers, such as a KStream group, in which a new member initially takes only standby tasks and is given time to replay state after it first joins. These new followers need to indicate a change of status once they have finished replaying the state. If no JoinReason is specified, brokers cannot distinguish the joiner's purpose: is it requesting an incremental rebalance, or is it just rejoining after a restart?

In conclusion, using JoinReason to handle the question of whether a rebalance is necessary could simplify the implementation logic considerably, and keep the details hidden from brokers when they decide whether to move the group to PrepareRebalance.

Public Interfaces

We will add a new enum field to the JoinGroupRequest interface, and bump the version to v6:

JoinGroupRequest => GroupId SessionTimeout RebalanceTimeout MemberId GroupInstanceId ProtocolType GroupProtocols
  GroupId             => String
  SessionTimeout      => int32
  RebalanceTimeout    => int32
  MemberId            => String
  GroupInstanceId     => String // new
  ProtocolType        => String
  GroupProtocols      => [Protocol MemberMetadata]
  Protocol            => String
  MemberMetadata      => bytes



Proposed Changes

Describe the new thing you want to do in appropriate detail. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences. Use judgement based on the scope of the change.

Compatibility, Deprecation, and Migration Plan

  • What impact (if any) will there be on existing users?
  • If we are changing behavior how will we phase out the older behavior?
  • If we need special migration tools, describe them here.
  • When will we remove the existing behavior?

Rejected Alternatives

If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.
