...

  1. Keep the existing learner tasks unless running. We don't want to see a task bounce halfway.
  2. If the load is not balanced between hosts, move learner tasks from heavily loaded hosts to the hosts with the fewest active tasks.
  3. As long as the group members and the number of tasks do not change, the assignment should reach a well-defined balanced state instead of rebalancing forever.
  4. Instances with standby tasks have higher priority when choosing where to assign a learner task, because the standby task can be converted to a learner task immediately.
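
Rules 2 and 4 above can be sketched as a simple ordering over candidate hosts. This is only an illustration, not the proposed assignor: `HostLoad`, its fields, and `pickLearnerHost` are hypothetical names, and the tie-break by host name is an assumption added so the result is deterministic.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class LearnerPlacementSketch {
    // Hypothetical view of one host's load; not a Kafka Streams type.
    static final class HostLoad {
        final String host;
        final int activeTasks;
        final boolean hasStandbyForTask;

        HostLoad(String host, int activeTasks, boolean hasStandbyForTask) {
            this.host = host;
            this.activeTasks = activeTasks;
            this.hasStandbyForTask = hasStandbyForTask;
        }
    }

    // Prefer hosts that already hold a standby for the task (rule 4),
    // then hosts with the fewest active tasks (rule 2),
    // then the host name (assumed tie-break for determinism).
    static Optional<String> pickLearnerHost(List<HostLoad> hosts) {
        return hosts.stream()
                .min(Comparator
                        .comparing((HostLoad h) -> !h.hasStandbyForTask)
                        .thenComparingInt(h -> h.activeTasks)
                        .thenComparing(h -> h.host))
                .map(h -> h.host);
    }

    public static void main(String[] args) {
        List<HostLoad> hosts = List.of(
                new HostLoad("a", 1, false),
                new HostLoad("b", 5, true),
                new HostLoad("c", 0, false));
        System.out.println(pickLearnerHost(hosts).orElse("none"));
    }
}
```

In this sketch a host with a standby wins even when it carries more active tasks, which is one possible reading of "higher priority"; a real assignor might weigh the two rules differently.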

...

We could even provide a stream.balancing.factor config for users to tune. The default, as we have seen, is 2. The smaller this number is, the stricter the assignment will be.  If the factor is set to r, the number of tasks a host could own is (w/total tasks)
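
To make the effect of the factor concrete, here is a minimal sketch of a balance check. The exact bound is not spelled out above, so the rule used here is an assumption: an assignment counts as balanced when the busiest host owns at most `factor` times as many tasks as the least busy host, or when the two differ by at most one task (added so that a stable balanced state always exists). `BalanceCheckSketch` and `isBalanced` are hypothetical names.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;

final class BalanceCheckSketch {
    // ASSUMPTION: "balanced" means max <= factor * min, or max - min <= 1.
    // This is one reading of stream.balancing.factor, not the defined rule.
    static boolean isBalanced(Collection<Integer> tasksPerHost, double factor) {
        if (tasksPerHost.isEmpty()) {
            return true; // nothing to balance
        }
        int max = Collections.max(tasksPerHost);
        int min = Collections.min(tasksPerHost);
        return max - min <= 1 || max <= factor * min;
    }

    public static void main(String[] args) {
        System.out.println(isBalanced(List.of(4, 2), 2.0));
        System.out.println(isBalanced(List.of(5, 2), 2.0));
    }
}
```

Under this reading, lowering the factor shrinks the set of assignments that count as balanced, which matches the statement that a smaller value makes the assignment stricter.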

As we can see, there should be exactly one learner task after each round of rebalance, and exactly one corresponding active task at the same time.

...

We will be adding the following new configs:

learn.partial.rebalance

Default : true

If this config is set to true, a new member will proactively trigger a rebalance each time it finishes restoring the state of one task, until it has finished replaying all of its tasks. Otherwise, the new worker will wait until all tasks are ready and ask for a single round of rebalance.
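
The difference between the two modes can be illustrated by counting how many rebalances a new member would request. This is a toy model under the description above; `PartialRebalanceSketch` and `rebalanceRequests` are hypothetical names, and it ignores rebalances triggered for any other reason.

```java
final class PartialRebalanceSketch {
    // With learn.partial.rebalance=true, one rebalance is requested per restored task;
    // otherwise a single rebalance is requested once every task is ready.
    static int rebalanceRequests(int restoredTasks, boolean learnPartialRebalance) {
        if (restoredTasks <= 0) {
            return 0; // nothing restored, nothing to hand over
        }
        return learnPartialRebalance ? restoredTasks : 1;
    }

    public static void main(String[] args) {
        System.out.println(rebalanceRequests(4, true));
        System.out.println(rebalanceRequests(4, false));
    }
}
```

The trade-off is more rebalance rounds in exchange for tasks becoming active on the new member sooner.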


scale.down.timeout.ms

Default: infinity

Timeout in milliseconds after which a stream worker that has been told to scale down is forcibly terminated.


stream.balancing.factor

Default: 2

The tolerated task imbalance factor between hosts; exceeding it triggers a rebalance.
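
As a rough sketch, the three proposed configs with their defaults could be supplied through a plain `java.util.Properties` object. The keys are the ones listed above; they are proposals, not existing Kafka Streams configs. Representing the "infinity" default as `Long.MAX_VALUE` is an assumption, and `ProposedConfigSketch` is a hypothetical name.

```java
import java.util.Properties;

final class ProposedConfigSketch {
    // Builds a Properties object holding the proposed defaults.
    static Properties proposedDefaults() {
        Properties props = new Properties();
        props.setProperty("learn.partial.rebalance", "true");
        // ASSUMPTION: "infinity" is modeled as Long.MAX_VALUE milliseconds.
        props.setProperty("scale.down.timeout.ms", Long.toString(Long.MAX_VALUE));
        props.setProperty("stream.balancing.factor", "2");
        return props;
    }

    public static void main(String[] args) {
        proposedDefaults().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```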


to help users define their own customized strategy.

...