In the above diagram we see L1 decide to become coordinator and create an initial membership view.  Because it has already received a JoinRequest from L2, it includes L2 in this initial view.

 

PlantUML
title
Details of concurrent startup of two locators when the locators are preferred as membership coordinators.  This
diagram focuses on the first locator, L1
end title
entity C
entity L1
entity L2

note right of L2
L1 and L2 have been killed.  C
detects loss and becomes coordinator.
L1 and L2 are somehow restarted
simultaneously.  This diagram tracks
L1's restart activity
end note

L1 -> L1 : recoverFromFile
note right
on startup locators recover their
last membership view from .dat file
and from other locators
end note

L1 -> L2 : recoverFromOthers
L2 -> L1 : old view
note right of L1
L1 will try to join with the old
coordinator and then fall into
findCoordinatorFromView
end note

L1 -> C : FindCoordinator()
C -> L1 : response(coord=C)

L1 -> C : JoinRequest
C -> L1 : New View(coord=C)
L1 -> L1 : continue startup
L1 -> C : New View(coord=L1)
note right
Upon receiving the new view with coord=C
L1 will determine that it should become
coordinator and create a new view
end note


 

Geode prefers to have a locator act as the membership coordinator when network partition detection is enabled or when peer-to-peer authentication is enabled.  Other members will take on the role if there are no locators in the membership view, but they will be deposed once a new locator joins the distributed system.
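
The preference described above can be illustrated with a minimal sketch. This is not Geode's actual code: the `Member` class, the `chooseCoordinator` method, and the rule that the oldest locator in the view wins are all assumptions made for illustration. The view is modeled as a list ordered oldest-first.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified model of coordinator preference; not Geode's real classes.
class CoordinatorPreference {

    /** Hypothetical member descriptor: a name plus a flag saying whether it is a locator. */
    static final class Member {
        final String name;
        final boolean isLocator;

        Member(String name, boolean isLocator) {
            this.name = name;
            this.isLocator = isLocator;
        }
    }

    /**
     * Picks a coordinator from a view ordered oldest-first.
     * If locators are preferred and at least one is present, the oldest locator
     * is chosen (an assumption of this sketch). Otherwise the oldest member is chosen.
     */
    static Optional<Member> chooseCoordinator(List<Member> view, boolean preferLocators) {
        if (preferLocators) {
            Optional<Member> locator = view.stream().filter(m -> m.isLocator).findFirst();
            if (locator.isPresent()) {
                return locator;
            }
        }
        return view.stream().findFirst();
    }
}
```

With a view containing only the non-locator C, the sketch selects C. Once L1 appears in the view, the same call selects L1, which corresponds to C being deposed.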

But what happens if two locators are started at the same time?  Which one becomes the new coordinator?  The diagram above shows this interaction from the perspective of one of the locators.  The diagram below shows this same interaction from the perspective of the other locator.

In the initial implementation of GMSJoinLeave the current coordinator recognized that a locator was attempting to join and responded with a "become coordinator" message.  This led to a lot of complications when a second locator was also trying to join, so we decided to remove the whole "become coordinator" notion and have the current coordinator simply accept and process the JoinRequest.  This allows the locator to join and then detect for itself that it should be the coordinator.
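
The join-then-detect flow described above might be sketched as follows. This is a simplified illustration, not the GMSJoinLeave implementation: the `View` class, the method names, the view-id increment, and the convention that locator names start with "L" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "join first, then detect"; not Geode's GMSJoinLeave code.
class JoinThenDetect {

    /** Hypothetical immutable membership view. */
    static final class View {
        final int viewId;
        final String coordinator;
        final List<String> members;

        View(int viewId, String coordinator, List<String> members) {
            this.viewId = viewId;
            this.coordinator = coordinator;
            this.members = List.copyOf(members);
        }
    }

    /** Assumption for this sketch only: locators are the members whose names start with "L". */
    static boolean isLocator(String name) {
        return name.startsWith("L");
    }

    /** The current coordinator accepts the join and publishes a view that still names itself. */
    static View processJoinRequest(View current, String joiner) {
        List<String> members = new ArrayList<>(current.members);
        members.add(joiner);
        return new View(current.viewId + 1, current.coordinator, members);
    }

    /**
     * The newly joined member inspects the view it received. If it is a locator and the
     * coordinator is not, it creates a new view naming itself as coordinator.
     */
    static View onViewReceived(View received, String self) {
        boolean takeOver = received.members.contains(self)
                && isLocator(self)
                && !isLocator(received.coordinator);
        if (!takeOver) {
            return received;
        }
        return new View(received.viewId + 1, self, received.members);
    }
}
```

In this sketch the coordinator never has to recognize that the joiner is a locator. The takeover decision is made entirely by the joiner after it has received a view, which mirrors the `New View(coord=C)` followed by `New View(coord=L1)` steps in the diagram.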