Jira: CLOUDSTACK-10132 (ASF JIRA)

Feature Specification

A new global configuration is added: 'direct.agent.lb.algorithm'. Possible values of this global configuration are: 'static', 'roundrobin' and 'shuffle'. 

This feature makes use of two global configurations:

  • 'host': A list of comma separated management servers is accepted. Example value: '192.168.10.10,192.168.10.11'
  • 'direct.agent.lb.algorithm': Possible values: 'static', 'roundrobin' or 'shuffle'

Values from both global configurations are read on management server startup by a new utility and kept in memory.

An existing command-answer pattern that handles initial agent connection and startup will be modified to propagate a comma-separated list of management servers. This list will be provided by the utility and will be determined by the value of 'direct.agent.lb.algorithm' as follows:

  • 'static': The original 'host' list is sent to agents unmodified
  • 'roundrobin': The items are rotated one by one once the list is read
  • 'shuffle': The list is randomly reordered before it is sent to an agent

The new CA framework introduced basic support for a comma-separated
list of management servers for agents, which makes an external LB
unnecessary.

This extends that feature to implement LB sorting algorithms that
sort the management server list before it is sent to the agents.
This adds central intelligence in the management server and
enhances the Agent class to be algorithm aware and to run a
background mechanism that checks for, and falls back to, the preferred
management server (assumed to be the first in the list). This is
supported for any indirect agent, such as the KVM, CPVM and SSVM agents,
and provides support for management server host migration during upgrade
(when, instead of upgrading in place, new hosts are used to set up new
management servers).

This FR introduces two new global settings:

- `indirect.agent.lb.algorithm`: The algorithm for the indirect agent LB.
- `indirect.agent.lb.check.interval`: The preferred host check interval
for the agent's background task that checks and switches to agent's
preferred host.

The `indirect.agent.lb.algorithm` setting supports the following algorithm options:

- static: use the list as provided.
- roundrobin: evenly spreads hosts across management servers based on
host's id.
- shuffle: (pseudo) randomly sorts the list (not recommended for production).
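
The three options above can be illustrated with a minimal Python sketch. It is not the CloudStack implementation: the function name `sort_ms_list` is hypothetical, and rotating the list by the host id modulo the list length is only an assumption about how "based on host's id" might be realized.

```python
import random
from typing import List, Optional


def sort_ms_list(ms_list: List[str], algorithm: str, host_id: int = 0,
                 rng: Optional[random.Random] = None) -> List[str]:
    """Return the management server list ordered for one agent.

    Hypothetical helper; the input list is never modified.
    """
    if not ms_list:
        return []
    if algorithm == "static":
        # Use the list exactly as configured.
        return list(ms_list)
    if algorithm == "roundrobin":
        # Assumption: rotate by host id so consecutive host ids
        # get different preferred (first) management servers.
        offset = host_id % len(ms_list)
        return ms_list[offset:] + ms_list[:offset]
    if algorithm == "shuffle":
        # Pseudo-random order; pass a seeded rng for repeatable results.
        shuffled = list(ms_list)
        (rng or random).shuffle(shuffled)
        return shuffled
    raise ValueError(f"unknown algorithm: {algorithm!r}")
```

Under this assumption, hosts with ids 0, 1 and 2 and the list `A,B,C` would receive lists starting with A, B and C respectively, which spreads the preferred server evenly.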

Changes to the global settings `indirect.agent.lb.algorithm` and
`host` do not require restarting the management server(s) or the
agents. A message-bus-based system dynamically reacts to changes in these
global settings and propagates them to all connected agents.

The comma-separated management server list is propagated to agents in
the following cases:
- Addition of a host (including SSVM and CPVM system VMs).
- Connection or reconnection by the agents to a management server.
- After an admin changes the 'host' and/or the
'indirect.agent.lb.algorithm' global settings.

On the agent side, the 'host' setting is saved in its properties file as:
`host=<comma separated addresses>@<algorithm name>`.
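
As an illustration of this value format, the following Python sketch splits and rebuilds such a value. The function names are hypothetical, and falling back to 'static' when no `@<algorithm name>` suffix is present is an assumption rather than documented behaviour.

```python
from typing import List, Tuple


def parse_host_property(value: str,
                        default_algorithm: str = "static") -> Tuple[List[str], str]:
    """Split '<addr1>,<addr2>@<algorithm>' into (addresses, algorithm)."""
    addresses_part, _, algorithm = value.strip().partition("@")
    addresses = [a.strip() for a in addresses_part.split(",") if a.strip()]
    # Assumption: a value without '@<algorithm>' uses the default algorithm.
    algorithm = algorithm.strip() or default_algorithm
    return addresses, algorithm


def format_host_property(addresses: List[str], algorithm: str) -> str:
    """Build the value stored under the 'host' key."""
    return ",".join(addresses) + "@" + algorithm
```

For example, `parse_host_property("192.168.10.10,192.168.10.11@roundrobin")` yields the two addresses and the algorithm name `roundrobin`.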

First, the agent connects to the management server and sends its current
management server list. The management server compares it with the list
it expects, and if the comparison fails, a new/updated list is sent for
the agent to persist.

From the agent's perspective, the first address in the propagated list
is considered the preferred host. A new background task can be
activated by configuring `indirect.agent.lb.check.interval`, which is
a cluster-level global setting in CloudStack. Admins can also
override it by configuring 'host.lb.check.interval' in the
`agent.properties` file.
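
The override described above could be resolved as in the following Python sketch. The function name and the use of `None` for "no interval configured" are assumptions made for illustration; only the key 'host.lb.check.interval' comes from the text.

```python
from typing import Mapping, Optional


def effective_check_interval(agent_properties: Mapping[str, str],
                             propagated_interval: Optional[int]) -> Optional[int]:
    """Pick the preferred-host check interval for an agent.

    A local 'host.lb.check.interval' entry wins over the value
    propagated from the cluster-level global setting.
    """
    local = agent_properties.get("host.lb.check.interval")
    if local is not None and local.strip():
        return int(local)
    # No local override: use whatever the management server sent (may be None).
    return propagated_interval
```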

Every time an agent receives a management server host list and the
algorithm, the host-specific background check interval is also sent, and
the agent dynamically reconfigures the background task without needing
a restart.

Note: The 'static' and 'roundrobin' algorithms strictly check for the
order they expect, whereas the 'shuffle' algorithm checks only the
content, not the order, of the comma-separated management server host
addresses.

The agents will be modified to save this list as the configured 'host' value in the agent.properties file.
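
The comparison rule in the note on 'static', 'roundrobin' and 'shuffle' can be sketched in Python as follows. The function name `list_is_current` is hypothetical, and comparing sorted copies is just one way to check content while ignoring order.

```python
from typing import Sequence


def list_is_current(agent_list: Sequence[str],
                    expected_list: Sequence[str],
                    algorithm: str) -> bool:
    """Decide whether the list an agent reported still matches."""
    if algorithm == "shuffle":
        # Order is random by design, so only the content is compared.
        return sorted(agent_list) == sorted(expected_list)
    # 'static' and 'roundrobin' expect the exact order.
    return list(agent_list) == list(expected_list)
```

If this check returns false, the management server would send the new/updated list for the agent to persist.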

Example

Supposing an environment in which there are 3 management servers: A, B and C and 3 KVM agents.

...