...

  • State transfer starts when members(M') connect to leader(M), before reconfig(M') is invoked. However, some suffix of operations might not have been transferred to M' before the reconfiguration began. Since no new operations can be committed in M' before this state transfer completes, we'd like to minimize this tail. One way to do that is to require, as a prerequisite (line 1), that at most w operations scheduled before the reconfiguration are unknown to the connected quorum of M'.
  • To prevent a violation of Global Primary Order, we can't have leader(M) committing operations in M', but it can propose the operations and leader(M') can then commit these operations.
  • Operations arriving at leader(M) during phase-1 are sent to both M and M'. Because leader(M) can fail during phase-1, we must ensure that either a quorum of M got the operation or (a quorum of M' got it AND M' was activated). Completion of phase-2 indicates the latter. That said, acking the client after phase-2 is problematic: currently, ZooKeeper only allows a server to be connected to one leader, and those servers in M that also belong to M' will no longer be listening to messages from leader(M) after receiving the phase-2 message. ZOOKEEPER-22 currently prevents clients from finding out the status of their operations submitted in M by connecting to M'. We should discuss how to resolve this point.
  • If phase-1 is done using a normal ZAB proposal, explicitly making sure that there are no incomplete reconfiguration requests that remain in the system after a recovery might not be necessary.
  • Phase-3 is not needed for correctness. However, if it is executed, it must run after phase-2 completes, since we want to make sure that M' is active before M goes away (otherwise the system can get stuck).
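
The two conditions discussed above (the bound w on the untransferred tail, and when it is safe to ack a client during phase-1) can be sketched as follows. This is only an illustration under stated assumptions: quorums are taken to be simple majorities, operations are identified by a plain increasing counter rather than a real zxid, and all function and parameter names are hypothetical, not part of ZooKeeper.

```python
from typing import Dict, Set


def has_quorum(members: Set[str], ackers: Set[str]) -> bool:
    """True if a strict majority of `members` appears in `ackers`.

    Assumption: majority quorums; other quorum systems are not modeled.
    """
    return len(members & ackers) > len(members) // 2


def quorum_known_index(new_members: Set[str], last_received: Dict[str, int]) -> int:
    """Highest operation index already received by a majority of M'.

    `last_received` maps each connected member of M' to the index of the
    last operation it has received; members that are not connected count as 0.
    """
    quorum = len(new_members) // 2 + 1
    received = sorted((last_received.get(m, 0) for m in new_members), reverse=True)
    # The quorum-th largest value is known to at least `quorum` members.
    return received[quorum - 1]


def may_start_reconfig(new_members: Set[str], last_received: Dict[str, int],
                       last_scheduled: int, w: int) -> bool:
    """Prerequisite: at most w scheduled operations are unknown to a quorum of M'."""
    tail = last_scheduled - quorum_known_index(new_members, last_received)
    return tail <= w


def safe_to_ack(old_members: Set[str], new_members: Set[str],
                ackers: Set[str], new_config_active: bool) -> bool:
    """Ack rule for operations arriving during phase-1:
    a quorum of M got the operation, or a quorum of M' got it AND M' is active.
    """
    return has_quorum(old_members, ackers) or (
        has_quorum(new_members, ackers) and new_config_active
    )
```

For example, with M' = {a, b, c}, where a has received operation 100, b has received operation 95, and c is not yet connected, a quorum of M' knows everything up to operation 95. If the last scheduled operation is 100, the tail is 5, so the prerequisite holds for w = 5 but not for w = 4.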

4.2. Recovery from leader failure

...