Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

In KStream the scale up is difficult since we need to shuffle the state to new hosts. In ideal case, we could actually introduce a new status called "learner" where the newly up hosts could try to catch up with the assigned task progress first before triggering the rebalance, from which we don't see a sudden dip on the progress. However, it is built on top of the success of KIP-345 and could be addressed in a separate KIP specifically for KStream.

Effectively, we are using expansion timeout to replace rebalance timeout, which is configured by max.poll.intervals from client side, and using registration timeout to replace session timeout.

Fault-tolerance of static membership

To make sure we could recover from broker failure/leader transition, an in-memory member name map is not enough. We will init another topic called `static_member_map` in the cluster and each time we have a stable state we should write the complete mapping information as a single message into it.  This We could otherwise reuse the `_consumer_offsets` topic to store this information, which however could be unfavorable if we are making too many changes to existing offset message schema. Either way when another broker takes over the leadership, we could transfer the mapping together.

...

For the very first version, we hope to make transferring logic simple enough. As we have mentioned, the goal of static membership is to reduce multiple rebalances with intermittent failures. There should be an easy way to fallback to dynamic membership when user prefers to let broker handle task assignments, for example liveness becomes a more important factor when the state tuning has big progress. The fallback is simple: when the membership becomes stable, the first join group request will decide the membership protocol for next round. For example, when we are running stable with static membership, deploy a new change to the client app without member.name being set could invalidate the current hashmap on broker, and all the v4 request containing member.name will be treated as a normal dynamic member and perform the sticky assignment in dynamic context as usual. Note that we intentionally separate session.timeout and registration.timeout so user doesn't have to change any setting except member name, although they are serving the same purpose in different protocols.

...