Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Tag NameTask TypeExplanationprerequisites
isStatefulbothIndicate whether given task has a state to restore.N/A
isLearnerstandbyIndicate whether standby task is a learner task.isStateful = True
beingLearnedactiveIndicate whether active task is being learned by some other stream worker.isStateful = True
isReadystandbyIndicate whether standby task is ready to serve as active task.isLearner = True and isStateful = True
isLeavingactiveIndicate whether active task will be leaving the group soon.isStateful = True


Algorithm

...

Details

The above examples are focusing more on demonstrating expected behaviors with KStream incremental rebalancing "end picture". However, we also want to define the new learner algorithm for a holistic view.

We have set of workers, where each worker contains: 

We shall assign tasks in the order of: active, learner and standby. The assignment will be broken down into following steps:

...

  1. Will mostly remain the same as current, which we will just pick resource abundant hosts to allocate standby tasks. 
  2. For tasks that get transferred after learner tasks finish, we could assign standby tasks right to the degraded hosts which hold previous round of active tasks.

Version 1.0

We will care more about smooth transition over resource balance for stage one. This is because we do have some historical discussion on marking weight for different types of tasks. If we go ahead to aim for task balance too early, we are potentially in the position of over-optimization.

So we only gonna assign learner task only to "new comers", which means every stream worker will denote itself as "new member" when they don't have local information. The assignment, however, will be on the the task .

Version 2.0

We will take task balance into consideration. Specifically for each instance, the total number of tasks it owns should be within the range of 0.5 ~ 2 times of the expected number of tasks, which buffers some capacity in order to avoid imbalanc

...

We want to callout this portion because the algorithm we are gonna design is fairly complicated so far. To make sure the delivery is smooth with fundamental changes of KStream internals, I build a separate Google Doc here that could be sharable to outline the step of changes. Feel free to give your feedback on this plan while reviewing the algorithm, because some of the changes are highly coupled with internal changes. Without these details, the algorithm is not making sense.

...