Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Leader crash could cause a missing of historical assignment information. For the learner assignment, however, each worker maintains its own assignment status, so when the learner task's id has no active task running, the transfer will happen immediately. Leader switch in this case is not a big concern. The essence is that we don't rely on leader information to do

the assignment.

Code Block
languagetext
titleScale-down stream applications
Cluster has 3 stream workers S1(leader), S2, S3, and they own tasks 1 ~ 5
Group stable state: S1[T1, T2], S2[T3, T4], S3[T5]

#First Rebalance 
New member S4 joins the group
S1 performs task assignments:
	S1(assigned: [T1, T2], revoked: [], learning: [])
	S2(assigned: [T3, T4], revoked: [], learning: [])
	S3(assigned: [T5], revoked: [], learning: [])
	S4(assigned: [], revoked: [], learning: [T1])


#Second Rebalance
S1 crashes/gets killed before S4 is ready, S2 takes over the leader.
Member S2~S4 join with following status: 
	S2(assigned: [T3, T4], revoked: [], learning: [])
	S3(assigned: [T5], revoked: [], learning: []) 
	S4(assigned: [], revoked: [], learning: [T1])
Note that T2 is unassigned, and S4 is learning T1 which has no current active task. We 
could rebalance T1, T2 immediately.	
S2 performs task assignments:
	S2(assigned: [T3, T4], revoked: [], learning: [])
	S3(assigned: [T5, T2], revoked: [], learning: [])
	S4(assigned: [T1], revoked: [], learning: [])
Now the group reaches balance.

...

The above examples are focusing more on demonstrating expected behaviors with KStream incremental rebalancing.  We also want to define the new learner algorithm for a holistic view.

We have set of workers, where each worker contains:

Workers = []

Set<ActiveTask>

Set<StandbyTask> 




As we could see, there should be only exactly one learner task after each round of rebalance, and there should be exactly one corresponding active task at the same time. 

...