...

Along with adding the next node to its failedNodes map, the server picks from the ring the server right after the failed one and tries to establish a connection with it. On success, that server becomes the new next node. Otherwise the process repeats from the beginning: the node is added to the failedNodes map, the server following it is picked, and so on.
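The retry loop described above can be sketched roughly as follows. This is a simplified illustration, not the actual ServerImpl code: the class name `RingNextSketch`, the method `findNext`, the string node IDs and the `canConnect` predicate are all hypothetical, and failedNodes is modeled as a plain set rather than the map used by the real implementation.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class RingNextSketch {
    /**
     * Walks the ring starting right after the local node. Every candidate that
     * cannot be connected to is added to failedNodes and the following one is tried.
     *
     * @return the first reachable node (the new "next"), or null if none is reachable.
     */
    static String findNext(List<String> ring, String local,
                           Set<String> failedNodes, Predicate<String> canConnect) {
        int start = ring.indexOf(local);
        if (start < 0)
            throw new IllegalArgumentException("Local node is not in the ring: " + local);

        // Visit every other node in ring order exactly once.
        for (int i = 1; i < ring.size(); i++) {
            String candidate = ring.get((start + i) % ring.size());

            if (canConnect.test(candidate))
                return candidate;           // connection established: new next node

            failedNodes.add(candidate);     // remember the failure, try the following node
        }
        return null;                        // no other node in the ring is reachable
    }
}
```

For a ring A, B, C, D where only D is reachable from A, the sketch returns D and leaves B and C in failedNodes.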

Info

...

Info
TcpDiscoveryAbstractMessage and its responsibilities

Any discovery message can carry information about failed nodes in the topology (see the TcpDiscoveryAbstractMessage#failedNodes collection and the logic around it). Every node that receives a discovery message starts processing it by updating its local failedNodes map with the information from that message (see ServerImpl$RingMessageWorker#processMessage, where the processMessageFailedNodes method is called almost at the beginning).

When the ring is restored and the next alive server node is found, the current node adds information about all nodes from its failedNodes map to any discovery message it is about to send, and then sends it (see how this field is handled in the ServerImpl$RingMessageWorker#sendMessageAcrossRing method).
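The way failed-node information travels with ordinary discovery messages can be illustrated with a minimal sketch. The `Message` and `Node` classes below are hypothetical stand-ins that model only the failed-nodes bookkeeping; they do not reproduce the real TcpDiscoveryAbstractMessage or ServerImpl$RingMessageWorker.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class FailedNodesPiggybackSketch {
    /** Hypothetical stand-in for a discovery message; only the failed-nodes part is modeled. */
    static class Message {
        final String type;
        final Set<String> failedNodes = new LinkedHashSet<>();

        Message(String type) {
            this.type = type;
        }
    }

    /** Hypothetical stand-in for a server node's local state. */
    static class Node {
        final Set<String> localFailedNodes = new LinkedHashSet<>();

        /** Sender side: copy every locally known failed node into the outgoing message. */
        Message prepareToSend(Message msg) {
            msg.failedNodes.addAll(localFailedNodes);
            return msg;
        }

        /** Receiver side: merge the failed nodes carried by the message before anything else. */
        void process(Message msg) {
            localFailedNodes.addAll(msg.failedNodes);
            // Handling specific to the message type would follow here.
        }
    }
}
```

In this sketch the message type is irrelevant: whatever the sender passes along the ring, the receiver learns about the failed nodes as a side effect.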

The server that detected the failed node is also responsible for creating a TcpDiscoveryNodeFailedMessage, a special discovery message that starts the second step of removing the failed node. One message is created per failed node, so if three nodes have failed, three TcpDiscoveryNodeFailedMessage messages are created.
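The one-message-per-failed-node rule can be shown with a short sketch. The `NodeFailedMessage` class and its fields are hypothetical simplifications for illustration and do not mirror the constructor or fields of the real TcpDiscoveryNodeFailedMessage.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class NodeFailedMessagesSketch {
    /** Hypothetical, simplified stand-in for a node-failed discovery message. */
    static class NodeFailedMessage {
        final String detectedBy;  // node that noticed the failure
        final String failedNode;  // node to be removed from the topology

        NodeFailedMessage(String detectedBy, String failedNode) {
            this.detectedBy = detectedBy;
            this.failedNode = failedNode;
        }
    }

    /** Creates exactly one message for each failed node, in iteration order. */
    static List<NodeFailedMessage> createMessages(String detectedBy, Collection<String> failedNodes) {
        List<NodeFailedMessage> msgs = new ArrayList<>();

        for (String failed : failedNodes)
            msgs.add(new NodeFailedMessage(detectedBy, failed));

        return msgs;
    }
}
```

Calling `createMessages("A", List.of("B", "C", "D"))` yields three messages, one each for B, C and D.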

Step two

The second step starts on the coordinator.

...