Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • Instead of line 1, try to connect to M’ and transfer state, so that a quorum of M’ is connected and up-to-date. If unsuccessful, skip the reconfiguration.
  • If a quorum of M’ indicate that they’ve already activated M’ then skip to phase 3
  • Otherwise, if next(M) = M’ was returned by a quorum of M, skip to phase 2

The server s that submitted the reconfig operation to the leader should check what's the current configuration after a new leader is established. If it is still the old one it can re-submit the reconfig. If s fails, the client can re-submit the reconfig to another server. Recall that version(M) is included by the client in the request so a server receiving it should first check that the configuration is still M (using local state) before submitting it to the leader, and the leader should also make this check (since it has the latest state).

4.3. Reconfiguration API

A choice that has to be made is what kind of operations to support – incremental changes like “add server A” or “remove server B” (e.g., as in survey on reconfiguration of R/W objects , #DynaStore) or full membership specification, as in “reconfigure so that the membership becomes is A, B, C” (e.g., survey on reconfiguration with virtual synchrony , #Rambo).

...

The idea of an “off-line” strategy for reconfiguration (survey on reconfiguration with with virtual synchrony , survey on reconfiguring state-machine replication ) is to stop operations in the old configuration, transfer the state to the new configuration and then enable operations – in the new configuration. In contrast, an online reconfiguration approach (#RAMBO, #DynaStore) never stops the service while reconfiguring.
One of the complexities arising in the online approach is that a normal operation can be executing concurrently with a reconfiguration, however the state still must be transferred correctly to the next configuration. The easy case is when the operation occurs in the old configuration and the reconfiguration transfers the state. It is possible, however, that the reconfiguration misses the operation when it transfers the state and completes. In this case, existing online reconfiguration solutions (#RAMBO, #DynaStore) continue the operation and execute it in the new configuration.
Unfortunately this may violate the global primary order in Zookeeper - operations issued in the new configuration (potentially by a different primary) may have already completed, in which case global primary order does not allow operations issued by an old primary to be applied.
We therefore choose the offline reconfiguration strategy, however we try to minimize the period of unavailability by pre-transferring the state to the new configuration before the reconfig begins.

4.7. Other issues

4.7.1. Bootstrapping the cluster

We should be able to bootstrap ZOOKEEPER using reconfig API instead of configuration files as it is currently done.

4.7.2. Formal Liveness Specification

Should be similar to the one in survey on reconfiguration with with virtual synchrony and #DynaStore.

4.7.3. Informing clients and servers about new configuration

For clients: Pull-based (e.g., DNS) ? Push-based (e.g., transferring client list as part of state to M' and having leader(M') inform them) ?
For servers in M', should the server periodically broadcast the new configuration (or its id), to try to makes sure that all servers in M' know about M' ?

Its probably good to include the version of the current config in all client operation requests. This way if the server knows about a later config (which is active) it can let the client know.

4.6. Bibliography

Surveys:
1.

Anchor
VSSurvey
VSSurvey
Ken Birman, Dahlia Malkhi, and Robbert Van Renesse, Virtually Synchronous Methodology for Dynamic Service Replication, no. MSR-TR-2010-151, November 2010 paper
2.
Anchor
SMRSurvey
SMRSurvey
Leslie Lamport, Dahlia Malkhi and Lidong Zhou, Reconfiguring a State Machine. In SIGACT News 41(1), SIGACT News 41(1): 63-73 (2010) paper
3.
Anchor
ASSurvey
ASSurvey
Marcos K. Aguilera, Idit Keidar, Dahlia Malkhi, Jean-Philippe Martin, Alexander Shraer:
Reconfiguring Replicated Atomic Storage: A Tutorial. In the Bulletin of the European Association for Theoretical Computer Science 102, pages 84-108, Distributed Computing Column, October 2010. paper

...