...

A corner case is that A and B could drop out of the group at nearly the same time. With static membership, we still need to sync the group to determine how many existing members are still alive; otherwise an unnecessary rebalance will be triggered later.

Adding new static memberships (scale up) should be straightforward. This operation should happen quickly so that capacity can catch up, so we define another config called expansion timeout.

...

Unfortunately, this example triggers two rebalances, because C joins too late to be included in the first expansion timeout round. When C finally joins, it starts a new expansion timeout countdown. After 5 minutes, another rebalance kicks in and assigns new tasks to C.
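The batching behaviour described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the coordinator's actual logic: the function name `plan_expansion_rebalances`, the join times for A and B, and the rule that a join arriving exactly at the deadline still counts as on time are all hypothetical. Only the 5-minute timeout and the late arrival of C come from the example.

```python
from typing import Dict, List, Tuple


def plan_expansion_rebalances(
    join_times: Dict[str, float], expansion_timeout: float
) -> List[Tuple[float, List[str]]]:
    """Group member joins into rebalance rounds.

    Assumed rule: the first join of a round starts the expansion timeout
    countdown; every member that joins before the countdown expires is
    admitted in the same rebalance. A later join starts a new round.
    Returns a list of (rebalance_time, members_admitted) pairs.
    """
    rebalances: List[Tuple[float, List[str]]] = []
    deadline = None          # time at which the current round rebalances
    batch: List[str] = []    # members waiting for the current round

    for member, joined_at in sorted(join_times.items(), key=lambda kv: kv[1]):
        if deadline is not None and joined_at > deadline:
            # The current round already expired: close it out.
            rebalances.append((deadline, batch))
            deadline, batch = None, []
        if deadline is None:
            # This join starts a new countdown.
            deadline = joined_at + expansion_timeout
        batch.append(member)

    if batch:
        rebalances.append((deadline, batch))
    return rebalances


# Hypothetical join times in minutes; C misses the first 5-minute window.
joins = {"A": 0.0, "B": 2.0, "C": 7.0}
rounds = plan_expansion_rebalances(joins, expansion_timeout=5.0)
```

With these assumed timings, `rounds` contains two entries: one rebalance at minute 5 admitting A and B, and a second at minute 12 admitting C alone. Had C joined by minute 5, a single rebalance would have covered all three members.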

Removing members (scale down) is tricky. The "target scale-down" size is very hard for the broker to obtain: for example, if we have 16 members and want to cut that number in half, the broker-side coordinator does not know at which point during the 16 → 8 shrink it should trigger a rebalance. An admin API to force a rebalance would be helpful here.

Fault-tolerance of static membership

...