...

Discussion needed: At the time of writing, the validation protocol is assumed to be remote.

Implementation details

Based on the initialization status of the nodes in a cluster, there are several possible scenarios for a new node entering the topology:

  1. An empty node enters a cluster where all nodes are empty.
  2. An empty node enters a cluster where all nodes are initialized.
  3. An initialized node enters a cluster where all nodes are empty.
  4. An initialized node with a Meta Storage of version X enters a cluster where all nodes have the Meta Storage of version Y.
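The four scenarios above reduce to two flags: whether the joining node is initialized and whether the cluster is. A minimal sketch of that classification follows; the class, enum, and method names are hypothetical and are not taken from the actual code base.

```java
/** Hypothetical classification of the four join scenarios listed above. */
final class JoinScenarios {
    enum Scenario {
        EMPTY_NODE_EMPTY_CLUSTER,             // scenario 1
        EMPTY_NODE_INITIALIZED_CLUSTER,       // scenario 2
        INITIALIZED_NODE_EMPTY_CLUSTER,       // scenario 3
        INITIALIZED_NODE_INITIALIZED_CLUSTER  // scenario 4 (Meta Storage versions X and Y)
    }

    /** Maps the initialization status of the joining node and of the cluster to a scenario. */
    static Scenario classify(boolean nodeInitialized, boolean clusterInitialized) {
        if (nodeInitialized) {
            return clusterInitialized
                    ? Scenario.INITIALIZED_NODE_INITIALIZED_CLUSTER
                    : Scenario.INITIALIZED_NODE_EMPTY_CLUSTER;
        }
        return clusterInitialized
                ? Scenario.EMPTY_NODE_INITIALIZED_CLUSTER
                : Scenario.EMPTY_NODE_EMPTY_CLUSTER;
    }
}
```

In scenario 4 the outcome additionally depends on how versions X and Y relate, which this sketch deliberately leaves out.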

Common logic

From a bird's eye view, the whole join protocol looks like the following:

  1. A joining node contacts a random node in a cluster and performs the "pre-init" validation by sending some locally available information.
  2. If the first step is successful, the node gets validated against the distributed properties (depending on the cluster state).
  3. If the second step is successful, the node obtains necessary information to set up the Meta Storage and other components.
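The three steps above form a pipeline in which each step runs only if the previous one succeeded. The sketch below illustrates that control flow only. The `JoinFlow` class, the `Stage` enum, and the use of `BooleanSupplier` for each step are assumptions made for illustration, not the actual design.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch of the three-step join flow; each step is a pluggable check. */
final class JoinFlow {
    enum Stage { PRE_INIT_VALIDATION, DISTRIBUTED_VALIDATION, COMPONENT_SETUP, JOINED }

    /**
     * Runs the steps in order and stops at the first failure.
     * Returns the stage that failed, or JOINED if all three steps succeeded.
     */
    static Stage join(BooleanSupplier preInitValidation,
                      BooleanSupplier distributedValidation,
                      BooleanSupplier componentSetup) {
        if (!preInitValidation.getAsBoolean()) {
            return Stage.PRE_INIT_VALIDATION;
        }
        if (!distributedValidation.getAsBoolean()) {
            return Stage.DISTRIBUTED_VALIDATION;
        }
        if (!componentSetup.getAsBoolean()) {
            return Stage.COMPONENT_SETUP;
        }
        return Stage.JOINED;
    }
}
```

For example, `JoinFlow.join(() -> true, () -> false, () -> true)` stops at `DISTRIBUTED_VALIDATION` without invoking the setup step.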

Depending on the state of the cluster and the joining node, there can be multiple scenarios that influence the behavior of the protocol. They are described below.  

Initial common logic

Regardless of the node state, the initial procedure goes as follows: a joining node tries to enter the topology. It is possible to piggyback on the transport of the membership protocol to exchange validation messages before the node is allowed to send membership messages (similar to a handshake protocol). During this step the joining node sends some information (cluster tag, Meta Storage version, node version) to a random node and gets validated (more details below). After this step is complete, the joining node becomes visible through the Topology Service, which establishes the invariant that the visible topology always consists of nodes that have passed the first validation step.

Possible issues: there can be a race condition when multiple conflicting nodes join at the same time, in which case only the first node to join will be valid. This can be considered expected behavior, because such situations can only occur during the initial setup of a cluster, which is a manual process and requires manual intervention anyway.
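The invariant described above (a node becomes visible only after passing the first validation step) can be sketched as a gate in front of the visible topology. Everything in this sketch is hypothetical: the `TopologyGate` and `JoinInfo` names, the field types, and the `nodeName` field are assumptions. The validation rule itself is passed in as a predicate, because the actual rules are described elsewhere in this document.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical gate that admits only validated nodes into the visible topology. */
final class TopologyGate {
    /** Information sent by the joining node during the first step (field types are assumed). */
    static final class JoinInfo {
        final String nodeName;          // assumed identifier, not named in the text
        final String clusterTag;
        final long metaStorageVersion;
        final String nodeVersion;

        JoinInfo(String nodeName, String clusterTag, long metaStorageVersion, String nodeVersion) {
            this.nodeName = nodeName;
            this.clusterTag = clusterTag;
            this.metaStorageVersion = metaStorageVersion;
            this.nodeVersion = nodeVersion;
        }
    }

    private final Predicate<JoinInfo> validator;
    private final Set<String> visibleNodes = new LinkedHashSet<>();

    TopologyGate(Predicate<JoinInfo> validator) {
        this.validator = validator;
    }

    /** Validates the joining node; only on success does it become visible. */
    synchronized boolean tryJoin(JoinInfo info) {
        if (!validator.test(info)) {
            return false;
        }
        visibleNodes.add(info.nodeName);
        return true;
    }

    /** Returns a snapshot of the nodes that have passed the first validation step. */
    synchronized Set<String> visibleNodes() {
        return new LinkedHashSet<>(visibleNodes);
    }
}
```

With a placeholder rule such as `info -> "tag-A".equals(info.clusterTag)`, a node carrying a different cluster tag is rejected and never appears in `visibleNodes()`.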

...