...
Code Block | ||||
---|---|---|---|---|
| ||||
Cluster has 3 stream instances S1(leader), S2, S3, and they each own some tasks 1 ~ 5
Group stable state: S1[T1, T2], S2[T3, T4], S3[T5]
#First Rebalance
New member S4 joins the group
S1 performs task assignments:
S1(assigned: [T1, T2], revoked: [], learning: [])
S2(assigned: [T3, T4], revoked: [], learning: [])
S3(assigned: [T5], revoked: [], learning: [])
S4(assigned: [], revoked: [], learning: [T1])
#Second Rebalance
New member S5 joins the group.
Members S1~S5 join with the following metadata (S4 is not ready yet):
S1(assigned: [T2], revoked: [T1], learning: []) // T1 revoked because it's "being learned"
S2(assigned: [T3, T4], revoked: [], learning: [])
S3(assigned: [T5], revoked: [], learning: [])
S4(assigned: [], revoked: [], learning: [T1])
S5(assigned: [], revoked: [], learning: [T3])
S1 performs task assignments:
S1(assigned: [T1, T2], revoked: [], learning: [])
S2(assigned: [T3, T4], revoked: [], learning: [])
S3(assigned: [T5], revoked: [], learning: [])
S4(assigned: [], revoked: [], learning: [T1])
S5(assigned: [], revoked: [], learning: [T3])
#Third Rebalance
Member S4 finishes its replay, becomes ready, and re-attempts to join the group. S5 is not ready yet.
Members S1~S5 join with the following status (S5 is not ready yet):
S1(assigned: [T2], revoked: [T1], learning: [])
S2(assigned: [T4], revoked: [T3], learning: []) // T3 revoked because it's "being learned"
S3(assigned: [T5], revoked: [], learning: [])
S4(assigned: [], revoked: [], learning: [T1])
S5(assigned: [], revoked: [], learning: [T3])
S1 performs task assignments:
S1(assigned: [T2], revoked: [T1], learning: [])
S2(assigned: [T3, T4], revoked: [], learning: [])
S3(assigned: [T5], revoked: [], learning: [])
S4(assigned: [T1], revoked: [], learning: [])
S5(assigned: [], revoked: [], learning: [T3])
#Fourth Rebalance
Member S5 is ready and re-attempts to join the group.
Members S1~S5 join with the following status (S5 is now ready):
S1(assigned: [T2], revoked: [], learning: [])
S2(assigned: [T4], revoked: [T3], learning: []) // T3 revoked because it's "being learned"
S3(assigned: [T5], revoked: [], learning: [])
S4(assigned: [T1], revoked: [], learning: [])
S5(assigned: [], revoked: [], learning: [T3])
S1 performs task assignments:
S1(assigned: [T2], revoked: [], learning: [])
S2(assigned: [T4], revoked: [T3], learning: [])
S3(assigned: [T5], revoked: [], learning: [])
S4(assigned: [T1], revoked: [], learning: [])
S5(assigned: [T3], revoked: [], learning: [])
Now the group reaches balance with 5 members and 5 tasks. |
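The last two rebalances in the trace above can be replayed with a small sketch. This is an illustration only, not the proposed assignor: `leader_assign`, `by_member`, and the map-based inputs are hypothetical names, and the sketch assumes the leader applies a single rule, namely that a task moves to its learner only once that learner reports ready.

```python
from collections import defaultdict


def leader_assign(owners, learners, ready_members):
    """Hypothetical leader step for one rebalance.

    owners:        task -> member that currently owns the active task
    learners:      task -> member that is restoring state for that task
    ready_members: members whose learner tasks have caught up
    Returns (new_owners, pending_learners).
    """
    new_owners = dict(owners)
    pending = {}
    for task, learner in learners.items():
        if learner in ready_members:
            new_owners[task] = learner  # ownership moves to the ready learner
        else:
            pending[task] = learner     # still learning; the old owner keeps the task
    return new_owners, pending


def by_member(owners):
    """Group a task -> member map into member -> sorted task list."""
    grouped = defaultdict(list)
    for task, member in owners.items():
        grouped[member].append(task)
    return {member: sorted(tasks) for member, tasks in grouped.items()}


# State after the second rebalance in the trace above.
owners = {"T1": "S1", "T2": "S1", "T3": "S2", "T4": "S2", "T5": "S3"}
learners = {"T1": "S4", "T3": "S5"}

# Third rebalance: only S4 is ready, so only T1 changes owner.
owners, learners = leader_assign(owners, learners, {"S4"})
third = by_member(owners)

# Fourth rebalance: S5 is ready as well, so T3 changes owner.
owners, learners = leader_assign(owners, learners, {"S5"})
fourth = by_member(owners)

print(third)
print(fourth)
```

After the fourth step no learner tasks remain and every member owns exactly one task, which matches the final state of the trace.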
...
- Which task is a learner task? This could be a tag on the standby task, such as "isLearner".
- Which task is being learned? This could be a tag on the active task, such as "isLearned".
- Which learner task has become ready? This could be a tag on the standby task, such as "isReady".
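One way to picture the three tags is as boolean fields on a per-task metadata record. The sketch below is an assumption for illustration: `TaskMetadata`, `TaskRole`, and the validation rules are hypothetical and are not part of any existing Kafka Streams API; only the tag names come from the list above.

```python
from dataclasses import dataclass, replace
from enum import Enum


class TaskRole(Enum):
    ACTIVE = "active"
    STANDBY = "standby"


@dataclass(frozen=True)
class TaskMetadata:
    """Hypothetical per-task record carrying the three proposed tags."""
    task_id: str
    role: TaskRole
    is_learner: bool = False  # standby task restoring state in order to take over
    is_learned: bool = False  # active task that some learner is catching up on
    is_ready: bool = False    # learner task that has finished restoring

    def __post_init__(self):
        # The learner and ready tags are only meaningful on standby tasks.
        if (self.is_learner or self.is_ready) and self.role is not TaskRole.STANDBY:
            raise ValueError("isLearner/isReady apply to standby tasks only")
        # A task cannot be a ready learner without being a learner.
        if self.is_ready and not self.is_learner:
            raise ValueError("isReady requires isLearner")
        # The "being learned" tag is only meaningful on active tasks.
        if self.is_learned and self.role is not TaskRole.ACTIVE:
            raise ValueError("isLearned applies to active tasks only")


# T1 as seen by its current owner and by the member learning it.
owned = TaskMetadata("T1", TaskRole.ACTIVE, is_learned=True)
learning = TaskMetadata("T1", TaskRole.STANDBY, is_learner=True)

# Once the learner has caught up, it reports itself ready.
caught_up = replace(learning, is_ready=True)
```

Keeping the record immutable means a state change such as "learner became ready" produces a new value, which is convenient when the metadata is serialized into a join request.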
Optimizations
Stateful vs Stateless Tasks
For stateless tasks, the ownership transfer should happen immediately, without a learning stage, because there is no state to restore. For these tasks we should fall back to the KIP-415 algorithm, where a stateless task is only revoked during the second rebalance.
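The distinction can be expressed as a single branch in the leader's planning step. This is a minimal sketch under the assumption that the leader already knows whether a task has state; `plan_transfer`, `Transfer`, and the `is_stateful` flag are hypothetical names, and how statefulness is detected is not specified here.

```python
from enum import Enum


class Transfer(Enum):
    IMMEDIATE = "hand over without a learning stage"
    LEARN_FIRST = "start a learner task and wait until it is ready"


def plan_transfer(is_stateful: bool) -> Transfer:
    """Pick the hand-over path for a task that should move to a new owner."""
    # A stateless task has nothing to restore, so a learner stage adds no value.
    return Transfer.LEARN_FIRST if is_stateful else Transfer.IMMEDIATE


print(plan_transfer(is_stateful=True).name)
print(plan_transfer(is_stateful=False).name)
```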
Public Interfaces
We will be adding the following new configs:
...