...

Recently the Kafka community has been promoting cooperative rebalancing to mitigate the pain points of the stop-the-world rebalancing protocol, and an initiative for Kafka Connect has already started as KIP-415. There are already exciting discussions around it, but for Kafka Streams, the delayed rebalance is not the complete solution. This KIP tries to customize the cooperative rebalancing approach specifically for the KStream application context, based on the great design for KConnect.

...

We shall assign tasks in the order of: active, learner and standby. The assignment will be broken down into the following steps:

Assign active stateful tasks:

  1. Assign to owners of learner tasks that indicate "ready"
  2. Assign to previous owners
  3. Assign to owners of unready learner tasks
  4. Assign to hosts with available resources

Assign active stateless tasks:

Same as steps 2 and 4 above.
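
The priority order above can be sketched as a single lookup function. This is only an illustrative sketch, not the actual Kafka Streams assignor: the function name, all parameter names, and the rule that a candidate host must still be a live group member are assumptions made for the example, and "resource available" is modelled simply as a count of free slots per host.

```python
from typing import Dict, Optional, Set


def pick_active_owner(
    task: str,
    stateful: bool,
    ready_learners: Dict[str, str],
    previous_owners: Dict[str, str],
    unready_learners: Dict[str, str],
    free_slots: Dict[str, int],
    live_hosts: Set[str],
) -> Optional[str]:
    """Return the host that should run `task` as active, or None if no host fits.

    Hypothetical model: the three `*_learners` / `previous_owners` dicts map
    task id -> host id, and `free_slots` maps host id -> remaining capacity.
    """
    if stateful:
        # Steps 1-3: ready learner, then previous owner, then unready learner.
        candidates = [ready_learners, previous_owners, unready_learners]
    else:
        # Stateless tasks have no learners, so only step 2 applies here.
        candidates = [previous_owners]

    for mapping in candidates:
        host = mapping.get(task)
        # Assumption: a host that has left the group is skipped.
        if host is not None and host in live_hosts:
            return host

    # Step 4: fall back to the live host with the most spare capacity.
    # Sorting first makes ties resolve deterministically by host id.
    available = sorted(h for h in live_hosts if free_slots.get(h, 0) > 0)
    if not available:
        return None
    return max(available, key=lambda h: free_slots[h])
```

For example, a stateful task with a ready learner on host "B" and a previous owner "A" goes to "B", while the same inputs with `stateful=False` return "A" because the learner steps are skipped.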

Assign learner tasks:

  1. Keep current learner tasks the same. We will not handle a halfway bounce, at least in the first version.
  2. If the load is not balanced between hosts, assign learner tasks for tasks on heavily loaded hosts to lightly loaded hosts.
  3. As long as the group members and the number of tasks are not changing, there should be a defined balanced state instead of rebalancing forever.
  4. Instances with standby tasks have higher priority to be chosen as the learner task assignee. The standby task will convert to a learner task immediately.
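
Rules 2 to 4 above can be illustrated with a small planning loop. This is a sketch under stated assumptions rather than the KIP's algorithm: load is measured as the number of active tasks per host, "balanced" is taken to mean that loads differ by at most one, no learner tasks are already in flight (so rule 1 is trivially satisfied), and the standby preference is applied by choosing, for the lightest host, a task it already holds as standby. The function and parameter names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def plan_learner_moves(
    active: Dict[str, List[str]],
    standby: Dict[str, Set[str]],
) -> List[Tuple[str, str, str]]:
    """Return planned learner assignments as (task, from_host, to_host).

    `active` maps host -> its active tasks; `standby` maps host -> the
    tasks it currently holds as standby.
    """
    if not active:
        return []
    load = {h: len(ts) for h, ts in active.items()}
    remaining = {h: list(ts) for h, ts in active.items()}
    moves: List[Tuple[str, str, str]] = []
    while True:
        hosts = sorted(load)  # deterministic tie-breaking by host id
        heavy = max(hosts, key=lambda h: load[h])
        light = min(hosts, key=lambda h: load[h])
        # Rule 3: once loads differ by at most one, the plan is stable and
        # calling this again with the same membership yields no new moves.
        if load[heavy] - load[light] <= 1:
            break
        tasks = remaining[heavy]
        # Rule 4: prefer a task the light host already has as standby,
        # since that standby can become the learner immediately.
        preferred = [t for t in tasks if t in standby.get(light, set())]
        task = preferred[0] if preferred else tasks[0]
        tasks.remove(task)
        # Rule 2: place a learner for the heavy host's task on the light host.
        moves.append((task, heavy, light))
        load[heavy] -= 1
        load[light] += 1
    return moves
```

With host "A" running four tasks, host "B" running none, and "B" holding a standby for "t3", the plan places the learner for "t3" first and then one more task, after which both hosts carry two tasks and the loop stops.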

...