
Originally published on the Cloud.com wiki. Added by David Nalley. Last modified October 2011.

Introduction

The User Dispersing Deployment Planner introduces a new planner into the existing deployment planner set. The goal of this planner is to make a best effort to deploy VMs belonging to the same account across different clusters or pods. The highlights of the feature include:

  • User Dispersing Planner
  • Admin ability to select the level (cluster or pod) at which the planner applies its deployment heuristic.
  • Planner selection as part of ServiceOffering.

Functionality Description

This section describes, at a high level, each piece of functionality introduced by this feature.

UserDispersingPlanner

This planner provides the ability to pick resources (cluster or pod) such that VMs belonging to an account are dispersed as much as possible. In many cases it is desirable that an account's VMs are not concentrated in a single pod or cluster: if that one cluster or pod goes down due to some issue, the outage affects that account completely. This planner addresses that need.

Allocation logic overview:

  • Currently CloudStack provides a FirstFitPlanner that picks clusters or pods in order of available aggregate capacity. So after a VM is deployed on one cluster, this planner picks the next cluster or pod with the best remaining aggregate capacity.
  • The user dispersing logic is applied on top of the existing planner: it lists clusters or pods by aggregate capacity by calling FirstFitPlanner, and then from this list chooses clusters or pods in ascending order of the number of VMs they host for the given account. This balances the account's VMs across the clusters or pods.
  • Once a cluster is chosen, the planner calls HostAllocator and StoragePoolAllocator to look for suitable hosts and storage pools, respectively, within the cluster. User dispersion is applied by the allocators as well, in the following manner:
    • HostAllocator: provides the list of suitable hosts in ascending order of the number of VMs they already run for the given account. So within a cluster, we choose the host with the fewest VMs for the account.
    • StoragePoolAllocator: provides the list of suitable pools in ascending order of the number of volumes they already hold for the given account. So within a cluster, we choose the storage pool with the fewest volumes for the account.
    • Note: Once a cluster is chosen, we always try to choose a host and storage pool within that cluster that serve the fewest VMs for the account. If no host has zero VMs for the account, we pick the one with fewer VMs than the others in the same cluster; we do not fall back to another cluster just to find a host with zero VMs for the account. The same applies to storage pools.
  • All other conditions implemented by the FirstFitPlanner also apply to this planner, since it extends the existing deployment logic.
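The allocator ordering described above can be sketched as follows. This is an illustrative Python sketch with assumed data shapes (dictionaries of per-account counts), not CloudStack's actual Java allocator code:

```python
# Illustrative sketch (assumed dict shapes, not CloudStack's Java
# allocators): suitable hosts are ordered by the number of VMs they
# already run for the account, and pools by the number of the account's
# volumes they already hold, both ascending.

def order_hosts(suitable_hosts, account_vms_per_host):
    return sorted(suitable_hosts, key=lambda h: account_vms_per_host.get(h, 0))

def order_pools(suitable_pools, account_volumes_per_pool):
    return sorted(suitable_pools, key=lambda p: account_volumes_per_pool.get(p, 0))

# No host in the chosen cluster has zero VMs for the account, so the one
# with the fewest (H3) comes first; we do not fall back to another cluster.
print(order_hosts(["H1", "H2", "H3"], {"H1": 2, "H2": 3, "H3": 1}))  # ['H3', 'H1', 'H2']
print(order_pools(["S1", "S2"], {"S1": 4}))                          # ['S2', 'S1']
```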

 

Some of these conditions are:

  • If a VM is being restarted, the planner first tries to redeploy it on its last host if last_host_id is specified. In this case the dispersing and cluster-availability logic is not applied. If deployment fails on the last host, the planner falls back to searching resources from the generic pool and applies the regular constraints.
  • If a VM's ROOT volume is in Ready state, the planner must choose the same cluster to deploy the VM, since that cluster's storage pool is ready.
  • If a deployment plan (cluster or pod) is given to the planner, these constraints are not applied and the planner proceeds to pick the remaining resources within that cluster or pod.

Ability to select pod or cluster to apply planner heuristics

Currently FirstFitPlanner looks directly at the clusters when listing by aggregate capacity. This ensures that the VM load is balanced across clusters. However, since clusters across all pods are considered together, VMs may or may not end up dispersed across pods.

However, admins need the ability to ensure that VMs get dispersed at the pod level as well, since there are networking/business implications at the pod level. This requires changing FirstFitPlanner to list pods (instead of clusters) by aggregate capacity and to choose the pod with the most available capacity first.

We plan to provide a configuration parameter that lets the admin select whether the planner should operate at the pod level or the cluster level.

If pod level is selected, the planner applies the capacity logic to the list of pods under a zone. The user dispersing planner then chooses first the pods that host fewer VMs for this account.

Under each pod, the capacity and user dispersing heuristics are reapplied to the clusters as well. Thus dispersion is done first across pods, and then across the clusters within a pod.
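The two-level, pod-then-cluster ordering can be sketched as below. The data shapes (capacity-ordered pod list, per-account VM counts) are assumptions for illustration, not CloudStack's actual interfaces:

```python
# Sketch of the pod-then-cluster ordering (assumed data shapes, not
# CloudStack's actual Java code): pods are reordered by the account's VM
# count, then the same heuristic is reapplied to the clusters in each pod.

def order_by_account_vms(resources, account_vm_counts):
    # Stable sort keeps the capacity ordering as the tiebreaker.
    return sorted(resources, key=lambda r: account_vm_counts.get(r, 0))

def candidate_clusters(pods_by_capacity, clusters_by_capacity, account_vm_counts):
    """Yield clusters in the order the planner would try them."""
    for pod in order_by_account_vms(pods_by_capacity, account_vm_counts):
        for cluster in order_by_account_vms(clusters_by_capacity[pod],
                                            account_vm_counts):
            yield cluster

# P2 has more free capacity but hosts more of the account's VMs than P1,
# so P1 is tried first; within P1, C1 (0 VMs) precedes C2 (1 VM).
order = list(candidate_clusters(
    ["P2", "P1"],
    {"P1": ["C1", "C2"], "P2": ["C3"]},
    {"P1": 1, "P2": 2, "C2": 1}))
print(order)  # ['C1', 'C2', 'C3']
```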

 

Planner selection as part of ServiceOffering

Since we will have multiple deployment planners with varied abilities for selecting a deployment destination, we need to give the admin the ability to select a planner per the requirements.

To enable this, we provide planner selection as part of the serviceOffering, and each deploymentPlanner should check the serviceOffering to see whether it should handle the VM deployment.

Create ServiceOffering Changes  

  • When a serviceOffering is created, provide a drop-down list of deployment planners. The admin selects one planner, and CloudStack uses it when a VM is deployed with that offering.
  • Planner selection cannot be updated as part of update serviceOffering: if VMs were already deployed using the original selection, CloudStack has no capability to re-allocate them according to the updated definition of the serviceOffering.

API Changes

CreateServiceOfferingCmd

This API creates a new serviceOffering.

Changes to this API:

  • New String parameter added: deploymentPlanner

    Possible values: FirstFitPlanner (default), UserDispersingPlanner

  • ServiceOfferingResponse: deploymentPlanner

    The response object includes the planner chosen for this serviceOffering, as a String.

UI Changes

There are UI changes to enable an admin to select a planner while creating a ServiceOffering.

The following changes are needed:

  • Create ServiceOffering should include a field labeled 'Deployment Planner to use'.
  • The value should be a drop-down list that currently contains: FirstFitPlanner (default), UserDispersingPlanner.
  • By default, the value should be 'FirstFitPlanner'.
  • On successfully creating the offering, the response page should show the deploymentPlanner value back to the user.

DB Changes  

`cloud`.`service_offering`

 

A new column 'deployment_planner' will be added to this table. The upgrade script will add:

 ALTER TABLE `cloud`.`service_offering` ADD COLUMN `deployment_planner` varchar(255) NOT NULL DEFAULT 'FirstFitPlanner' COMMENT 'Class name of the deployment planner to use';
 

`cloud`.`configuration`

 

We need to add a new configuration parameter to let the admin choose whether the planner heuristics should be applied at the pod level or the cluster level.

 INSERT IGNORE INTO configuration VALUES ('Advanced', 'DEFAULT', 'management-server', 'apply.planner.heuristics.to', 'cluster', 'Level at which deployment planner should apply heuristics to (cluster or pod)');
 

We also need to add weights to the planner heuristics:

 INSERT IGNORE INTO configuration VALUES ('Advanced', 'DEFAULT', 'management-server', 'user.vm.dispersion.weight', '1', 'Weight for user dispersion heuristic. Weight for capacity heuristic will be (1 - weight of user dispersion)');
 

Design

FirstFitPlanner

This planner will need some changes in order to implement this feature.

  • The planner needs to take into account the admin's selection of whether to apply the capacity heuristics at the pod level or the cluster level.
  • Add new functionality to list pods by aggregate capacity and plan using the list of pods instead of clusters.
  • The functionality to list pods or clusters by aggregate capacity needs to be refactored so that it can be reused by other planners.

UserDispersingPlanner  

This planner will extend the existing FirstFitPlanner to figure out the list of clusters/pods ordered by available aggregate capacity.

It will apply the user dispersing heuristic on top of this list.

  • The algorithm to figure out a deployment destination from a given cluster list is inherited from FirstFitPlanner.
  • Each planner should implement a canHandle() method that checks the serviceOffering to decide whether it is the correct planner for the deployment; alternatively, the VirtualMachineManager can call the required planner directly.
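A minimal sketch of serviceOffering-driven planner selection follows. The names here (can_handle, deployment_planner) only mirror the text; they are hypothetical, not CloudStack's real Java interfaces:

```python
# Hypothetical sketch of planner selection via the serviceOffering. The
# names (can_handle, deployment_planner) mirror the text above but are
# illustrative, not CloudStack's actual interfaces.

class FirstFitPlanner:
    name = "FirstFitPlanner"

    def can_handle(self, service_offering):
        # Handle the deployment only if the offering selected this planner
        # (FirstFitPlanner is the default when none is specified).
        return service_offering.get("deployment_planner", "FirstFitPlanner") == self.name

class UserDispersingPlanner(FirstFitPlanner):
    # Extends the first-fit logic, as described in this Design section.
    name = "UserDispersingPlanner"

offering = {"deployment_planner": "UserDispersingPlanner"}
planners = [FirstFitPlanner(), UserDispersingPlanner()]
chosen = next(p for p in planners if p.can_handle(offering))
print(chosen.name)  # UserDispersingPlanner
```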

The logic to reorder the list of clusters/pods to ensure user dispersing:

 

  • From the vm_instance table, find the list of cluster_ids or pod_ids of all VMs that belong to the given account_id and are in 'Running' state, in ascending order of VM count.
  • From the list of clusters/pods having capacity, move the ones NOT found in the above result set to the front of the list, so that clusters/pods with no VMs for this account are considered before those hosting at least one VM for the account.
  • If none of the clusters/pods moved ahead can deploy the VM, we fall back to the remaining clusters/pods having capacity, in ascending order of the number of VMs they host for the account.
  • This ensures that clusters with no VMs for the account are tried first, followed by the remaining clusters in ascending order of VM count.
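The reordering steps above can be sketched as follows; this is an illustrative Python sketch with assumed data shapes, not the actual implementation:

```python
# Sketch of the reordering described above (assumed data shapes): clusters
# with no running VMs for the account move to the front, keeping the
# capacity ordering among themselves; the rest follow in ascending order
# of per-account VM count.

def reorder(capacity_ordered, running_vm_counts):
    empty = [c for c in capacity_ordered if c not in running_vm_counts]
    rest = sorted((c for c in capacity_ordered if c in running_vm_counts),
                  key=lambda c: running_vm_counts[c])
    return empty + rest

# C1 and C3 host no VMs for the account, so they come first (in capacity
# order); C4 hosts fewer of the account's VMs than C2, so it precedes C2.
print(reorder(["C1", "C2", "C3", "C4"], {"C2": 3, "C4": 1}))
# ['C1', 'C3', 'C4', 'C2']
```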

 

 

Example: Suppose we have pods (P1, P2) and clusters (C1…C3) with hosts (H1…H5) and pools (P1…P4) as follows:

 

Then this is how we choose the resources:

  • List pods in order of available aggregate capacity. Say the list is: P2, P1
  • Get the number of VMs of this account in each pod and list the pods in ascending order. Say it is: P1 (1 VM), P2 (2 VMs)
  • Choose the pods in this order: P1 first, then P2.
  • Within each pod, list clusters by aggregate capacity. Say for P1: C2
  • List clusters by the number of VMs for this account, in ascending order. Say: C1 (0 VMs), C2 (1 VM)
  • We cannot choose cluster C1 now since it has no capacity, so we choose clusters in this order but only those having capacity.
  • Finally, from C2, choose the host with the fewest VMs for this account and the pool with the fewest volumes for this account.

 

  • If the admin selects to apply the heuristics directly at the cluster level instead of the pod level, we start by listing clusters.

 

Assigning weights to capacity and number of VMs:

 

  • As seen above, we have two parameters by which to order the pods or clusters: capacity and number of VMs.
  • We plan to provide a global configuration variable to set weights for these two parameters, so that capacity is not ignored completely in favor of the number of VMs while reordering the lists.

 

Thus the admin can set something like:

user.vm.dispersion.weight = 0.75

 

  • This means the weight for capacity is 0.25.
  • We apply the weights to the capacity and the number of VMs to generate a score by which we reorder the clusters or pods.

By default, user VM dispersion has 100% weight: user.vm.dispersion.weight = 1, and the weight for capacity is 0.

 

Example:

 

Suppose this is the cluster list in order of capacity:

 

Cluster | x = % full capacity | x * weight (0.25)
C1      | 0.35                | 0.0875
C2      | 0.45                | 0.1125
C3      | 0.65                | 0.1625

And this is the list in order of number of VMs for the given account:

 

Cluster | Number of VMs for this account | y = number / total VMs for this account (10) | y * weight (0.75)
C2      | 1                              | 0.1                                          | 0.075
C3      | 2                              | 0.2                                          | 0.15
C1      | 4                              | 0.4                                          | 0.3

 

Calculate the total weight per cluster as: total weight = (x * 0.25 + y * 0.75)

 

Cluster | Total weight
C1      | 0.3875
C2      | 0.1875
C3      | 0.3125

 

Then reorder the list by total weight, ascending. The final ordered list will be: C2, C3, C1
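The worked example can be recomputed to confirm the totals and the final order. A small sketch, assuming x is the fraction of capacity used and y the per-account VM count normalized by the account's total (10), as defined above:

```python
# Recomputing the worked example above. Assumes x = fraction of capacity
# used and y = per-account VM count / account's total VMs (10), with
# user.vm.dispersion.weight = 0.75 as set above.

capacity = {"C1": 0.35, "C2": 0.45, "C3": 0.65}   # x per cluster
account_vms = {"C1": 4, "C2": 1, "C3": 2}          # VMs for this account
total_account_vms = 10
w = 0.75                                           # dispersion weight; capacity weight = 1 - w

total = {c: capacity[c] * (1 - w) + (account_vms[c] / total_account_vms) * w
         for c in capacity}
order = sorted(total, key=total.get)               # lowest combined score is tried first

print({c: round(t, 4) for c, t in total.items()})  # C1: 0.3875, C2: 0.1875, C3: 0.3125
print(order)                                       # ['C2', 'C3', 'C1']
```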
