Introduction

This feature allows indirect agents to connect to multiple management servers, so an external load balancer may no longer be necessary. The CloudStack administrator sets the list of management servers, and the algorithm used to sort it, through global configurations. The management server applies the algorithm to the management server list and propagates the result to the agents.

Purpose

Allow a CloudStack administrator to set a list of management servers for indirect agents to connect to. Agents connect to the first element of the list as the master (preferred) node and, on disconnection, iterate through the remaining elements in order (secondary nodes).
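
The connection order described above can be sketched as follows. This is a minimal illustration, not the agent's actual code: `connect_in_order` and the `try_connect` callback are hypothetical names, and the sketch assumes an agent simply walks the list from the first entry and stops at the first server that accepts the connection.

```python
from typing import Callable, Optional, Sequence


def connect_in_order(ms_list: Sequence[str],
                     try_connect: Callable[[str], bool]) -> Optional[str]:
    """Return the first management server that accepts a connection.

    ms_list is the ordered list received by the agent; its first entry is
    the master (preferred) node. try_connect is a hypothetical callback
    that returns True when a connection to the given address succeeds.
    """
    for address in ms_list:
        if try_connect(address):
            return address
    # No management server in the list was reachable.
    return None
```

For example, with the list `["A", "B", "C"]` and only `B` and `C` reachable, the sketch returns `"B"`.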

Feature Specification

The new CA framework introduced basic support for a comma-separated
list of management servers for agents, which makes an external LB
unnecessary.

This feature extends that support with LB sorting algorithms that
sort the management server list before it is sent to the agents.
This adds central intelligence to the management server and enhances
the Agent class to be algorithm aware and to run a background
mechanism that checks for, and falls back to, the preferred management
server (assumed to be the first in the list). This is supported for any
indirect agent, such as the KVM, CPVM and SSVM agents, and also
supports management server host migration during an upgrade
(when new hosts are used to set up the new mgmt server instead of
upgrading in place).

This FR introduces two new global settings:

- `indirect.agent.lb.algorithm`: The algorithm for the indirect agent LB.
- `indirect.agent.lb.check.interval`: The preferred host check interval
for the agent's background task that checks and switches to agent's
preferred host.

The `indirect.agent.lb.algorithm` setting supports the following algorithm options:

- static: use the list as provided.
- roundrobin: evenly spreads hosts across management servers based on
host's id.
- shuffle: (pseudo) randomly sorts the list (not recommended for production).
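
The three options can be illustrated with a short Python sketch. It is not the CloudStack implementation: the function names are hypothetical, and the round-robin variant assumes that the rotation offset is the host's rank among all known host ids, modulo the number of management servers.

```python
import random
from typing import List, Optional, Sequence


def sort_static(ms_list: Sequence[str]) -> List[str]:
    """static: return the list exactly as provided."""
    return list(ms_list)


def sort_roundrobin(ms_list: Sequence[str],
                    host_ids: Sequence[int],
                    host_id: int) -> List[str]:
    """roundrobin: rotate the list according to the host's id.

    Assumption: the pivot is the host's rank in the sorted id list,
    taken modulo the number of management servers.
    """
    if len(ms_list) < 2:
        return list(ms_list)
    ranked_ids = sorted(set(host_ids) | {host_id})
    pivot = ranked_ids.index(host_id) % len(ms_list)
    return list(ms_list[pivot:]) + list(ms_list[:pivot])


def sort_shuffle(ms_list: Sequence[str],
                 rng: Optional[random.Random] = None) -> List[str]:
    """shuffle: return a pseudo-random permutation of the list."""
    rng = rng or random.Random()
    shuffled = list(ms_list)
    rng.shuffle(shuffled)
    return shuffled
```

Under these assumptions, hosts with consecutive ranks receive lists that start at consecutive management servers, which is what spreads the agents evenly.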

Changes to the global settings `indirect.agent.lb.algorithm` and
`host` do not require a restart of the management server(s) or the
agents. A message-bus-based system reacts dynamically to changes in these
global settings and propagates them to all connected agents.

The comma-separated management server list is propagated to agents in
the following cases:
- Addition of a host (including SSVM and CPVM system VMs).
- Connection or reconnection by the agents to a management server.
- After the admin changes the 'host' and/or the
'indirect.agent.lb.algorithm' global settings.

On the agent side, the 'host' setting is saved in its properties file as:
`host=<comma separated addresses>@<algorithm name>`.
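
As an illustration of this format, the following sketch splits such a value into its address list and algorithm name and builds it back. The helper names are hypothetical, and falling back to `static` when no `@<algorithm name>` suffix is present is an assumption made for the example.

```python
from typing import List, Sequence, Tuple


def parse_host_property(value: str,
                        default_algorithm: str = "static") -> Tuple[List[str], str]:
    """Split '<comma separated addresses>@<algorithm name>' into its parts.

    Assumption: a value without '@<algorithm name>' uses default_algorithm.
    """
    addresses, _, algorithm = value.partition("@")
    hosts = [h.strip() for h in addresses.split(",") if h.strip()]
    algorithm = algorithm.strip() or default_algorithm
    return hosts, algorithm


def format_host_property(hosts: Sequence[str], algorithm: str) -> str:
    """Build the value stored under the 'host' key."""
    return ",".join(hosts) + "@" + algorithm
```

For instance, `parse_host_property("10.1.1.1,10.1.1.2@roundrobin")` yields the two addresses and the algorithm name `roundrobin`.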

The agent first connects to a management server and sends its current
management server list. The management server compares it with the list
it expects for that agent and, if the comparison fails, sends a
new/updated list for the agent to persist.

From the agent's perspective, the first address in the propagated list
is the preferred host. A new background task can be activated by
configuring `indirect.agent.lb.check.interval`, which is a cluster-level
global setting in CloudStack. Admins can also override it by
configuring 'host.lb.check.interval' in the `agent.properties` file.

Every time an agent receives an ms-host list and the algorithm, the
host-specific background check interval is sent as well, and the agent
dynamically reconfigures the background task without needing a restart.
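
One pass of the background check can be sketched as below. This is only an illustration: the function names and the `is_reachable` probe are hypothetical, and the sketch assumes that a value set in `agent.properties` takes precedence over the interval sent by the management server.

```python
from typing import Callable, Optional, Sequence


def effective_check_interval(propagated_interval: int,
                             local_override: Optional[int]) -> int:
    """Pick the interval for the background task.

    Assumption: 'host.lb.check.interval' from agent.properties, when set,
    wins over the value propagated by the management server.
    """
    return local_override if local_override is not None else propagated_interval


def preferred_host_to_switch_to(ms_list: Sequence[str],
                                connected_host: str,
                                is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the preferred host if the agent should switch to it, else None.

    The preferred host is the first address in the propagated list.
    is_reachable is a hypothetical probe supplied by the caller.
    """
    if not ms_list:
        return None
    preferred = ms_list[0]
    if connected_host == preferred:
        return None  # Already on the preferred host, nothing to do.
    return preferred if is_reachable(preferred) else None
```

A scheduler would call `preferred_host_to_switch_to` once per interval and reconnect only when it returns an address.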

Note: The 'static' and 'roundrobin' algorithms strictly check that the
list is in the order they expect. The 'shuffle' algorithm only checks
the content, not the order, of the comma-separated ms host addresses.
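
The difference between the two kinds of check can be sketched as follows. The function name is hypothetical and the sketch only illustrates the rule stated in the note, not the actual comparison code.

```python
from typing import Sequence


def is_list_up_to_date(agent_list: Sequence[str],
                       expected_list: Sequence[str],
                       algorithm: str) -> bool:
    """Compare the agent's list with the list the management server expects.

    'shuffle' ignores ordering and compares content only; 'static' and
    'roundrobin' require the same addresses in the same order.
    """
    if algorithm == "shuffle":
        return sorted(agent_list) == sorted(expected_list)
    return list(agent_list) == list(expected_list)
```

With this rule, an agent holding `B,A,C` passes the check for an expected `A,B,C` under 'shuffle' but fails it under 'static' or 'roundrobin'.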

Example

Suppose an environment with 3 management servers (A, B and C) and 3 KVM agents.

With 'host' = 'A,B,C', agents receive lists depending on the 'indirect.agent.lb.algorithm' value:

  • 'static': Each agent will receive the list: 'A,B,C'
  • 'roundrobin': First agent receives: 'A,B,C', second agent receives: 'B,C,A', third agent receives: 'C,A,B'
  • 'shuffle': Each agent will receive a list in random order.