...


1) Management server M1 is up and running, management server M2 joins the cluster.

M2:
==============================================================================
1.1) Gets information about all management servers and agents in the system.
1.2) Calculates the average agent load.
1.3) For each management server, uses AgentLoadBalancerPlanners to find out how many hosts the MS can give away. Currently there is just one planner, ClusterBasedAgentLoadBalancerPlanner, which gives away hosts on a per-cluster basis.
1.4) After the map "ManagementServer->ListOfHostsToGiveAway" is formed, for each MS in the map, M2 does the following:

* checks whether the host is already being processed by someone. If yes, the host is skipped from rebalancing.
* creates an entry in the op_host_transfer table (hostId, currentOwnerId, futureOwnerId, "TransferRequested" state).
* sends RebalanceCommand with event "RequestAgentRebalance" to the currentOwner MS.
* if the RequestAgentRebalance fails, the corresponding entry is removed from the op_host_transfer table and processing moves on to the next entry from the "ManagementServer->ListOfHostsToGiveAway" map.
* if the RequestAgentRebalance succeeds, processing moves on to the next entry from the "ManagementServer->ListOfHostsToGiveAway" map.

1.5) After the map is processed, StartRebalance is considered to be completed.
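
Steps 1.2 to 1.4 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual planner code: the names Host, average_load, hosts_to_give_away, start_rebalance and send_request are hypothetical, the op_host_transfer table is modeled as a plain dict keyed by host id, and the rule "give away whole clusters, smallest first, while staying within the surplus over the average" is an assumed policy that the text above does not specify.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    host_id: int
    cluster_id: int
    owner_ms: int  # id of the management server that currently owns the agent


def average_load(hosts, ms_ids):
    """Step 1.2: average number of agents per management server."""
    return len(hosts) / len(ms_ids)


def hosts_to_give_away(hosts, ms_id, avg_load):
    """Step 1.3 (assumed policy): give away whole clusters within the surplus."""
    owned = [h for h in hosts if h.owner_ms == ms_id]
    surplus = len(owned) - math.ceil(avg_load)
    if surplus <= 0:
        return []
    by_cluster = defaultdict(list)
    for h in owned:
        by_cluster[h.cluster_id].append(h)
    give = []
    # Assumption: try the smallest clusters first and never split a cluster.
    for cluster_hosts in sorted(by_cluster.values(), key=len):
        if len(give) + len(cluster_hosts) <= surplus:
            give.extend(cluster_hosts)
    return give


def start_rebalance(hosts, ms_ids, future_owner, transfer_table, send_request):
    """Step 1.4: request a transfer for every host that can be given away.

    transfer_table stands in for op_host_transfer (host_id -> record).
    send_request(current_owner, host_id) stands in for sending the
    RebalanceCommand with event "RequestAgentRebalance"; it returns
    True on success and False on failure.
    """
    avg = average_load(hosts, ms_ids)
    for ms_id in ms_ids:
        if ms_id == future_owner:
            continue  # the joining server does not give hosts to itself
        for host in hosts_to_give_away(hosts, ms_id, avg):
            if host.host_id in transfer_table:
                continue  # host is already being processed by someone
            transfer_table[host.host_id] = (ms_id, future_owner, "TransferRequested")
            if not send_request(ms_id, host.host_id):
                # Request failed: drop the record and move on to the next host.
                del transfer_table[host.host_id]
```

With two management servers and six hosts all owned by the first one, the average load is 3, so the first server may give away up to three hosts, and only clusters that fit entirely within that surplus are selected.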

 

M1:
==============================================================================
* receives RebalanceRequest with event=RequestAgentRebalance from M2.
* does nothing with Agent Attache at this point.
* creates an entry for the host in _agentToTransferIds HashSet.

 

2) Management server M1 scans the agentsToTransfer queue.

...

2.1) Checks whether the task has timed out or the future owner of the host is no longer active. If either condition is true, the entry is removed from agentToTransferIds and from the op_host_transfer table. Nothing needs to be done with the agent, as the actual agent attache has not been changed yet. If neither condition is true, proceeds to the next step.
2.2) Checks the listener and request queues of the corresponding attache. If both queues are empty, proceeds to the next step. If either queue is not empty, skips the agent.
2.3) Updates the op_host_transfer entry with the new state "TransferStarted" and starts the actual agent transfer.
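
The scan in steps 2.1 to 2.3 can be sketched as below. This is a minimal illustration, not the real implementation: Attache, scan_agents_to_transfer, the "created" timestamp field and the timeout_s parameter are all hypothetical names, and op_host_transfer is again modeled as a dict of records keyed by host id.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Attache:
    # Hypothetical stand-in for an agent attache with its two queues.
    listeners: list = field(default_factory=list)
    requests: list = field(default_factory=list)


def scan_agents_to_transfer(agents_to_transfer, transfer_table, attaches,
                            active_ms, timeout_s, now=None):
    """Return the host ids for which the actual transfer should start."""
    now = time.time() if now is None else now
    started = []
    for host_id in sorted(agents_to_transfer):
        entry = transfer_table[host_id]
        # 2.1: drop the task if it timed out or the future owner is gone.
        timed_out = now - entry["created"] > timeout_s
        if timed_out or entry["future_owner"] not in active_ms:
            agents_to_transfer.discard(host_id)
            del transfer_table[host_id]
            continue
        # 2.2: skip the agent while it still has pending listeners or requests.
        attache = attaches[host_id]
        if attache.listeners or attache.requests:
            continue
        # 2.3: mark the record and hand the host over to the transfer step.
        # The text does not say when the set entry is cleared once the
        # transfer starts, so this sketch leaves it in place.
        entry["state"] = "TransferStarted"
        started.append(host_id)
    return started
```

A skipped agent stays in the set, so it is simply reconsidered on the next scan once its queues have drained.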


3) Actual agent transfer


M1:
==============================================================================
3.1) Puts the agent attache into Transfer mode. Transfer mode means that all incoming requests are collected in the Requests queue without being processed.
3.2) Updates the host with the status "Rebalancing"; updates the op_host_transfer record with the status "TransferStarted".
3.3) Sends a command to M2.
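
The Transfer mode described in 3.1 amounts to a switch that diverts incoming requests into a queue instead of processing them. A minimal sketch, assuming a hypothetical TransferModeAttache class with a caller-supplied handler (neither name comes from the actual code base):

```python
from collections import deque


class TransferModeAttache:
    """Hypothetical attache that can park requests while a transfer runs."""

    def __init__(self, handler):
        self._handler = handler        # processes a request in normal mode
        self._transfer_mode = False
        self.pending = deque()         # requests collected in Transfer mode

    def set_transfer_mode(self, enabled):
        self._transfer_mode = enabled

    def handle(self, request):
        if self._transfer_mode:
            # Transfer mode: collect the request without processing it.
            self.pending.append(request)
            return None
        return self._handler(request)
```

What happens to the queued requests afterwards depends on the outcome handled in step 3.5, so the sketch deliberately stops at collecting them.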

M2:
==============================================================================
3.4) Connects the agent and sends the result of this command to M1.

M1:
==============================================================================
3.5) Gets the result from 3.4; if it is:

...