Versions Compared

Comment: Migrated to Confluence 5.3

...

  • upon receipt of <activate-config, M’, leader(M’)> in configuration M’
    • start serving operations in M’; the first operation should clean up any incomplete reconfig request from M’
    • forward the message to members(M')
    • send <activate-config-ack> to leader(M) and to leader(M')
    • A non-leader server can start processing client operations in M’
  • upon receipt of <activate-config, M’, leader(M’)> from leader(M) and <activate-config-ack> from quorum of members(M'), leader(M') does the following:
    • commit all operations sent by leader(M) to members(M') during phase-1.
    • start processing client operations in M’
  • upon receipt of <retire, version(M)>
    • garbage-collect M
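The message handling above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the actual ZooKeeper implementation: the names `Config`, `Server`, `outbox`, and `phase1_ops` are hypothetical, a quorum is assumed to be a simple majority of members(M'), leader(M') is assumed to count only explicit acks, and a non-leader is assumed to ignore a repeated `<activate-config>` so that forwarding terminates.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Config:
    version: int
    members: frozenset
    leader: str

    def quorum_size(self) -> int:
        # Assumption: a quorum is a simple majority of members(M').
        return len(self.members) // 2 + 1


@dataclass
class Server:
    name: str
    old: Optional[Config]   # M, set to None once garbage-collected
    new: Config             # M'
    serving_new: bool = False
    activate_from_old_leader: bool = False
    acks: set = field(default_factory=set)
    phase1_ops: list = field(default_factory=list)  # sent by leader(M) in phase-1
    committed: list = field(default_factory=list)
    outbox: list = field(default_factory=list)      # (destination, message) pairs

    def on_activate_config(self, sender: str) -> None:
        if self.name == self.new.leader:
            # leader(M') waits for leader(M)'s message plus a quorum of acks.
            if self.old is not None and sender == self.old.leader:
                self.activate_from_old_leader = True
            self._leader_try_activate()
            return
        if self.serving_new:
            return  # assumption: duplicates are ignored so forwarding stops
        for member in sorted(self.new.members - {self.name}):
            self.outbox.append((member, "activate-config"))
        leaders = [self.new.leader]
        if self.old is not None:
            leaders.insert(0, self.old.leader)
        for leader in dict.fromkeys(leaders):  # de-duplicate, keep order
            self.outbox.append((leader, "activate-config-ack"))
        self.serving_new = True  # non-leader may now process client operations

    def on_activate_config_ack(self, sender: str) -> None:
        if self.name == self.new.leader and sender in self.new.members:
            self.acks.add(sender)
            self._leader_try_activate()

    def _leader_try_activate(self) -> None:
        ready = (self.activate_from_old_leader
                 and len(self.acks) >= self.new.quorum_size())
        if ready and not self.serving_new:
            # Commit everything leader(M) sent during phase-1, then serve in M'.
            self.committed.extend(self.phase1_ops)
            self.phase1_ops.clear()
            self.serving_new = True

    def on_retire(self, version: int) -> None:
        if self.old is not None and self.old.version == version:
            self.old = None  # garbage-collect M
```

In this model, a three-member M' needs two acks before leader(M') commits the phase-1 operations and starts serving. Messages are only recorded in `outbox`; a real server would send them over the network.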

...

The idea of an “off-line” strategy for reconfiguration (survey on reconfiguration with virtual synchrony, survey on reconfiguring state-machine replication) is to stop operations in the old configuration, transfer the state to the new configuration, and then enable operations in the new configuration. In contrast, an online reconfiguration approach (#RAMBO, #DynaStore) never stops the service while reconfiguring.
One of the complexities arising in the online approach is that a normal operation can execute concurrently with a reconfiguration, yet the state must still be transferred correctly to the next configuration. The easy case is when the operation occurs in the old configuration and the reconfiguration transfers the state. It is possible, however, that the reconfiguration misses the operation when it transfers the state and completes. In this case, existing online reconfiguration solutions (#RAMBO, #DynaStore) continue the operation and execute it in the new configuration.
Unfortunately, in Zookeeper this may violate the global primary order property: operations issued in the new configuration (potentially by a different primary) may have already completed, in which case global primary order does not allow operations issued by an old primary to be applied. Put differently, a situation where a new primary commits operations and only then the old primary's commit arrives is not allowed in ZAB, so we must be careful not to create it.
We therefore choose the offline reconfiguration strategy; however, we try to minimize the period of unavailability by pre-transferring the state to the new configuration before the reconfig begins. The algorithm avoids the violation by having leader(M') commit operations proposed by leader(M) during the reconfiguration, and it does so before committing any other operations that it itself proposed in M'.
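The ordering constraint on leader(M') can be illustrated with a minimal sketch. The class `NewLeaderCommitQueue` and its method names are hypothetical. The sketch also assumes that leader(M') learns at some point that leader(M) will send no further operations (modeled here by `seal_inherited`), and only after that point commits its own proposals.

```python
from collections import deque


class NewLeaderCommitQueue:
    """Commit order kept by leader(M'): inherited operations first."""

    def __init__(self) -> None:
        self.inherited = deque()  # proposed by leader(M) during the reconfiguration
        self.own = deque()        # proposed by leader(M') itself in M'
        self.sealed = False       # True once leader(M) can send nothing more
        self.commit_log = []

    def propose_inherited(self, op: str) -> None:
        if self.sealed:
            # Accepting this would let an old primary's operation land after
            # the new primary's commits, which the text says ZAB forbids.
            raise ValueError("old primary's operations are sealed")
        self.inherited.append(op)

    def propose_own(self, op: str) -> None:
        self.own.append(op)

    def seal_inherited(self) -> None:
        self.sealed = True

    def commit_ready(self) -> None:
        # Operations from leader(M) are always committed first.
        while self.inherited:
            self.commit_log.append(self.inherited.popleft())
        # Own operations wait until no more inherited operations can arrive.
        if self.sealed:
            while self.own:
                self.commit_log.append(self.own.popleft())
```

Even if leader(M') proposes its own operation before an inherited one arrives, the commit log in this model lists the inherited operations first.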

4.7. Other issues

4.7.1. Bootstrapping the cluster

...

4.7.3. Informing clients and servers about new configuration

...

We should probably have a DNS-like solution as a fall-back for clients that were not around during the reconfiguration (suppose that all of the servers that the client knew are already down). For normal operation we should also have a push-based solution to notify clients about the configuration change.
Some preliminary thoughts:

  1. It is preferable to move clients from the old configuration to the new one gradually so that we don't need to set up hundreds of new connections simultaneously.
  2. One possibility could be transferring the client list as part of the state to M' and having leader(M') inform them.
  3. It's probably good to include the version of the current config in all client operation requests. This way if the server knows about a later config (which is active) it can let the client know.
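
The third idea, tagging each client request with a config version, might look roughly like the sketch below. All names (`ClusterConfig`, `Request`, `Response`, `handle_request`) are hypothetical. The sketch assumes the server still executes the operation and simply piggybacks the newer configuration on the response; the text does not say whether the request should instead be rejected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClusterConfig:
    version: int
    servers: tuple
    active: bool = True


@dataclass
class Request:
    op: str
    config_version: int  # version of the config the client currently knows


@dataclass
class Response:
    result: str
    newer_config: Optional[ClusterConfig] = None


def handle_request(req: Request, known: ClusterConfig) -> Response:
    # Placeholder for actually executing the operation.
    result = f"done:{req.op}"
    # Only advertise a config that is both later and already active.
    if known.active and known.version > req.config_version:
        return Response(result, newer_config=known)
    return Response(result)
```

A client that receives a non-empty `newer_config` could then reconnect to the listed servers at its own pace, which fits the gradual migration suggested in the first item.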

For servers in M', should the server periodically broadcast the new configuration (or its id) to try to make sure that all servers in M' know about M'?

4.6. Bibliography

Surveys:
1. Ken Birman, Dahlia Malkhi, and Robbert Van Renesse, Virtually Synchronous Methodology for Dynamic Service Replication, no. MSR-TR-2010-151, November 2010.
2. Leslie Lamport, Dahlia Malkhi, and Lidong Zhou, Reconfiguring a State Machine. In SIGACT News 41(1): 63-73 (2010).
3. Marcos K. Aguilera, Idit Keidar, Dahlia Malkhi, Jean-Philippe Martin, and Alexander Shraer, Reconfiguring Replicated Atomic Storage: A Tutorial. In the Bulletin of the European Association for Theoretical Computer Science 102, pages 84-108, Distributed Computing Column, October 2010.

...