...

For KStream, we take a middle path between the "revoke all" and "revoke none" solutions: we only revoke tasks that are being learned since the last round. When we assign learner tasks to a new member, we also mark the corresponding active tasks as "being learned" on their current owners. Whenever a rebalance begins, the task owners revoke their being-learned tasks and rejoin the group without affecting other ongoing tasks. This way, learned tasks can transfer ownership immediately without a second round of rebalance. Compared with KIP-415, we optimize for fewer rebalances, at the cost of larger metadata and partial unavailability of the learner tasks.
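The selective revocation described above can be sketched as follows. This is a minimal illustration, not the actual Kafka Streams implementation: the `Member` class, its fields, and the helper functions are all hypothetical names introduced here, and tasks are modeled as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """Hypothetical model of one stream instance in the group."""
    name: str
    active: set = field(default_factory=set)         # tasks this member currently owns
    being_learned: set = field(default_factory=set)  # owned tasks that a learner is catching up on
    learning: set = field(default_factory=set)       # learner tasks hosted on this member


def assign_learner(owner: Member, learner: Member, task: str) -> None:
    """Give `learner` a learner task and mark the task as being learned on its owner."""
    if task not in owner.active:
        raise ValueError(f"{owner.name} does not own {task}")
    learner.learning.add(task)
    owner.being_learned.add(task)


def on_rebalance_start(member: Member) -> set:
    """Revoke only the being-learned tasks; every other active task keeps running."""
    revoked = member.active & member.being_learned
    member.active -= revoked
    member.being_learned -= revoked
    return revoked


def take_ownership(learner: Member, revoked: set) -> set:
    """Promote learner tasks whose active counterpart was just revoked.

    Assumption: the learner has already caught up on these tasks.
    """
    promoted = learner.learning & revoked
    learner.learning -= promoted
    learner.active |= promoted
    return promoted
```

With an owner holding three tasks and a new member learning one of them, only that one task is revoked at the next rebalance and ownership moves in the same round, while the other two tasks are never interrupted.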

Algorithm Walkthrough

Let's use some notation to define the new learner algorithm for a holistic view.


As we can see, there should be exactly one learner task after each round of rebalance, and exactly one corresponding active task at the same time.


Next we are going to look at several typical scaling scenarios to better understand the algorithm.

...

Sometimes the end user wants to strike a balance between completing the ongoing task transfer and freeing up streaming resources. We therefore take an approach similar to KIP-415 and introduce a client config that makes the scale-down time-bounded. If migrating the tasks takes longer than this config allows, the leaving member shuts itself down immediately instead of waiting for the final confirmation. We can then simply promote the learner tasks to active, because they are the best candidates to own the new tasks.
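The time-bounded scale-down decision can be sketched as a small function. This is only an illustration under stated assumptions: the `Action` enum, the function name, and the `timeout_ms` parameter are hypothetical stand-ins for the client config mentioned above, whose real name is not specified here.

```python
from enum import Enum


class Action(Enum):
    KEEP_WAITING = "keep waiting for the final confirmation"
    SHUTDOWN_CONFIRMED = "shut down after the transfer was confirmed"
    SHUTDOWN_TIMED_OUT = "shut down immediately; learner tasks are promoted to active"


def leaving_member_action(elapsed_ms: int, timeout_ms: int, confirmed: bool) -> Action:
    """Decide what a leaving member does while its tasks are being migrated.

    `timeout_ms` stands in for the proposed client config (hypothetical name).
    """
    if confirmed:
        # The normal path: the task transfer finished in time.
        return Action.SHUTDOWN_CONFIRMED
    if elapsed_ms > timeout_ms:
        # The migration exceeded the bound: stop waiting and leave now.
        return Action.SHUTDOWN_TIMED_OUT
    return Action.KEEP_WAITING
```

When the result is `SHUTDOWN_TIMED_OUT`, the remaining members would treat the existing learner tasks as the new active tasks, even though they may not have fully caught up.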

Algorithm Walkthrough

The above examples focus on demonstrating the expected behavior of KStream incremental rebalancing. We also want to define the new learner algorithm for a holistic view.



As we can see, there should be exactly one learner task after each round of rebalance, and exactly one corresponding active task at the same time.



Algorithm Trade-offs

We dedicate a section to the trade-offs of the new algorithm, because it is important to understand the motivation for the change and to make the proposal more robust.

...