Versions Compared

...

  • State transfer starts when members(M’) connect to leader(M), before reconfig(M’) is invoked. However, some suffix of operations might not have been transferred to M’ before the reconfiguration began. Since no new operations can be committed in M’ before this state transfer completes, we’d like to minimize this tail. One way to do that is to require, as a prerequisite (line 1), that at most w operations scheduled before the reconfiguration are unknown to the connected quorum of M’.
  • When line 8 completes, we know that all operations scheduled before the reconfiguration are committed in M’.
  • Even if the current leader remains the leader of M’, we cannot allow operations to be executed in M’ before phase 1 ends. Otherwise, if the leader fails, we have a split brain (some operations execute in M’, and then, when a new leader recovers, new ops will be executed in M).
  • To prevent a violation of Global Primary Order, we can't have leader(M) committing operations in M', but it can propose the operations and leader(M') can commit them.
  • Operations arriving at leader(M) during phase 1 are sent to both M and M'. Because leader(M) can fail during phase 1, we must ensure that either a quorum of M got the operation or (a quorum of M' got it AND M' was activated). Completion of phase 2 indicates the latter. Having said that, acking the client after phase 2 is problematic: currently, ZooKeeper only allows a server to be connected to one leader, and those servers in M that also belong to M' will no longer be listening to messages from leader(M) after receiving the phase 2 message. ZOOKEEPER-22 currently prevents clients from finding out the status of their operations submitted in M by connecting to M'. We should discuss how to resolve this point. Instead of lines (2b) and (3b), the leader of M can redirect further operations to leader(M') (whether leader(M') is equal to leader(M) or not). Leader(M') will buffer them until M' is activated.
  • If phase 1 is done using a normal ZAB proposal, it might not be necessary to explicitly make sure that no incomplete reconfiguration requests remain in the system after a recovery.
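
Two of the conditions in the notes above can be sketched as small predicates: the prerequisite that a connected quorum of M' is missing at most w operations, and the rule for when an operation that arrived during phase 1 is safe to acknowledge. This is only an illustrative sketch, not ZooKeeper code. It assumes majority quorums, and it models operations as consecutive integer indices; the names `is_quorum`, `tail_is_short_enough`, and `can_ack_client` are hypothetical, and reading "unknown to the connected quorum" as "some quorum of connected members each lags by at most w" is an interpretation.

```python
from __future__ import annotations


def is_quorum(acked: set[str], members: set[str]) -> bool:
    """Assumed majority quorum: strictly more than half of `members` acked."""
    return len(acked & members) * 2 > len(members)


def tail_is_short_enough(
    last_scheduled: int,
    known_up_to: dict[str, int],
    new_members: set[str],
    w: int,
) -> bool:
    """Prerequisite check (hypothetical): some quorum of connected M' members
    is missing at most `w` of the operations scheduled before the reconfig.

    `known_up_to` maps each connected server to the index of the last
    operation it has received; servers absent from the map are not connected.
    """
    # Progress of connected members of M', most up to date first.
    progress = sorted(
        (known_up_to[s] for s in new_members if s in known_up_to),
        reverse=True,
    )
    quorum_size = len(new_members) // 2 + 1
    if len(progress) < quorum_size:
        return False  # not enough members of M' are connected yet
    # The slowest member of the best possible quorum bounds the tail.
    slowest = progress[quorum_size - 1]
    return last_scheduled - slowest <= w


def can_ack_client(
    acks: set[str],
    old_members: set[str],
    new_members: set[str],
    new_config_activated: bool,
) -> bool:
    """An operation sent to both M and M' during phase 1 is safe once a quorum
    of M has it, or a quorum of M' has it AND M' has been activated."""
    return is_quorum(acks, old_members) or (
        is_quorum(acks, new_members) and new_config_activated
    )
```

For example, with M = {a, b, c} and M' = {c, d, e}, acks from {c, d} alone are not enough before M' is activated, because they form a quorum of M' but not of M.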

...