
Introduction:

This document describes the CPU/RAM overcommit feature.

In the current implementation, the CPU overcommit ratio is a global configuration value. This needs to change to provide more granular control over the overcommit parameters. Currently there is no provision for RAM overcommit.

This feature implements RAM overcommit and allows the RAM and CPU overcommit ratios to be specified on a per-cluster basis.

Use case:

Change the VM density on all the hosts in a given cluster. This can be done by specifying the CPU and RAM overcommit ratios.

- Each cluster (depending on the hypervisor platform, storage, or hardware configuration) can handle a different number of VMs per host/cluster. Trying to normalize them can be inefficient, as the ratio has to be set for the lowest common denominator. Hence, we provide finer granularity for better utilization of resources, irrespective of what the placement algorithm decides.

- This works even better when combined with dedicated resources. With dedicated resources, we may be able to specify that account A will use cluster X. If this account is paying for "gold" quality of service, those clusters could have a ratio of 1. If it is paying for "bronze" QoS, the cluster ratio could be 2.

Design description:

The admin can specify the CPU and RAM overcommit ratios when creating a cluster, or update the values after the cluster is created.

CloudStack will deploy VMs based on the overcommit ratios. If the overcommit ratio of a cluster is updated, only VMs deployed after the update are placed according to the updated ratios.
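The deployment rule above can be sketched as a simple capacity check. This is a minimal illustration, not CloudStack code: the `Cluster` and `Host` classes, their field names, and `can_deploy` are hypothetical, and it assumes a host's usable capacity is its physical capacity multiplied by the cluster's ratio.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    # Hypothetical model; a ratio of 1.0 means no overcommit.
    cpu_overcommit_ratio: float = 1.0
    ram_overcommit_ratio: float = 1.0


@dataclass
class Host:
    # Hypothetical model of one host's physical capacity and current allocation.
    total_cpu_mhz: int
    total_ram_mb: int
    used_cpu_mhz: int = 0
    used_ram_mb: int = 0


def can_deploy(cluster: Cluster, host: Host, vm_cpu_mhz: int, vm_ram_mb: int) -> bool:
    """Check a new VM against the host capacity scaled by the cluster ratios."""
    cpu_limit = host.total_cpu_mhz * cluster.cpu_overcommit_ratio
    ram_limit = host.total_ram_mb * cluster.ram_overcommit_ratio
    return (host.used_cpu_mhz + vm_cpu_mhz <= cpu_limit
            and host.used_ram_mb + vm_ram_mb <= ram_limit)


cluster = Cluster(cpu_overcommit_ratio=2.0, ram_overcommit_ratio=1.5)
host = Host(total_cpu_mhz=8000, total_ram_mb=16384,
            used_cpu_mhz=7000, used_ram_mb=16000)

fits_before = can_deploy(cluster, host, vm_cpu_mhz=2000, vm_ram_mb=4096)

# Updating the ratio only changes later checks; VMs already running stay in place.
cluster.ram_overcommit_ratio = 1.0
fits_after = can_deploy(cluster, host, vm_cpu_mhz=2000, vm_ram_mb=4096)
```

In this sketch the same VM fits while the RAM ratio is 1.5 and is rejected once the ratio is lowered to 1, while the host's existing allocation is left untouched.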

Supported hypervisors:

XenServer
KVM
VMware

APIs:

AddCluster: will be modified to include the CPU and RAM overcommit values in the cluster parameters.

UpdateOvercommitRatio: updates the overcommit ratios of a cluster.

| API name | API parameters | API response | Available only for root admin |
|---|---|---|---|
| AddCluster | Apart from the existing parameters, adds cpuovercommitratio and ramovercommitratio | Contains additional details of the CPU and RAM overcommit ratios | Yes |
| UpdateOvercommitRatio | cpuovercommitratio, ramovercommitratio | Returns the details of the updated cluster along with the overcommit ratios | Yes |

* All the parameters are optional, but at least one of the two overcommit parameters (cpuovercommitratio, ramovercommitratio) is required.
If a value is not provided, it defaults to 1, meaning no overcommitting of resources by default.
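The parameter rules above could be resolved roughly as follows. This is a sketch under stated assumptions, not the actual API implementation: the function names are hypothetical, the positivity check is an assumption (the spec does not state a valid range), and so is the choice to keep the current value for a ratio omitted from an update.

```python
from typing import Optional, Tuple

DEFAULT_RATIO = 1.0  # 1 means no overcommit


def _check(value: float) -> float:
    # Assumption: a ratio must be positive; the spec does not give a valid range.
    if value <= 0:
        raise ValueError("overcommit ratio must be positive")
    return float(value)


def ratios_for_new_cluster(cpu: Optional[float] = None,
                           ram: Optional[float] = None) -> Tuple[float, float]:
    """Resolve the ratios at cluster creation: a missing value defaults to 1."""
    return (DEFAULT_RATIO if cpu is None else _check(cpu),
            DEFAULT_RATIO if ram is None else _check(ram))


def ratios_for_update(current: Tuple[float, float],
                      cpu: Optional[float] = None,
                      ram: Optional[float] = None) -> Tuple[float, float]:
    """Resolve the ratios for an update: at least one of the two is required."""
    if cpu is None and ram is None:
        raise ValueError(
            "at least one of cpuovercommitratio or ramovercommitratio is required")
    # Assumption: a ratio omitted from an update keeps its current value.
    return (current[0] if cpu is None else _check(cpu),
            current[1] if ram is None else _check(ram))


created = ratios_for_new_cluster(cpu=2)
updated = ratios_for_update(created, ram=1.5)
```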

DB changes:

Add new columns to the cluster table to store the CPU and RAM overcommit ratios.

Upgrade scenario:

On upgrade, the existing cluster table will be updated with the new columns described above.
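One way to picture the upgrade step is the sketch below. It uses an in-memory SQLite database purely as a stand-in for the real database. The column names `cpu_overcommit_ratio` and `ram_overcommit_ratio`, the simplified `cluster` schema, and the choice to give existing clusters a default of 1 are all assumptions for illustration.

```python
import sqlite3

# In-memory stand-in for the pre-upgrade database (schema is simplified).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cluster (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO cluster (name) VALUES ('cluster-x')")

# Hypothetical column names; the default of 1.0 means existing clusters
# keep behaving as if there were no overcommit after the upgrade.
for column in ("cpu_overcommit_ratio", "ram_overcommit_ratio"):
    conn.execute(
        f"ALTER TABLE cluster ADD COLUMN {column} REAL NOT NULL DEFAULT 1.0"
    )

row = conn.execute(
    "SELECT name, cpu_overcommit_ratio, ram_overcommit_ratio FROM cluster"
).fetchone()
```

Rows that existed before the upgrade read back the default for both new columns, so no separate backfill step is needed in this sketch.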

Caveats

What should the behavior be if the admin changes the overcommit factor for a cluster in a way that conflicts with the current allocation? For example, assume Cluster X has an overcommit factor of 1.5x for memory and the admin wants to change it to 1x, i.e. no overcommit (or from 2x to 1.5x). Based on the older factor, CloudStack might already have assigned more VMs than the new factor allows. When the admin reduces the overcommit value:

1. If there is no conflict, there is no issue.

2a. If there is a conflict (i.e. the current allocation would conflict with the new value), should we reject the change?

2b. Or should we accept the change but not add any more VMs?

Decreasing the factor is currently allowed (say, from 2X to 1X). If the existing allocation already exceeds the new factor (say, 1.5X is allocated), no future allocation will be allowed, and the dashboard will start showing more than 100% allocated (150% in this example), which might confuse the admin. The admin would also start getting alerts that capacity is already exhausted. That is, we should accept the new value and allocate only if the system has enough capacity to deploy more VMs based on the new overcommit ratios.

If, however, the allocation so far is still within the new factor (say, 0.8X is currently allocated), allocation is still allowed and the dashboard shows 80% allocated. In this case everything is consistent, and we should allow the admin to change the factor.
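The two cases above can be worked through numerically. The helper below is a hypothetical sketch, assuming the dashboard percentage is the allocated amount divided by physical capacity times the new ratio, and that the change is always accepted (option 2b).

```python
from typing import Tuple


def allocation_after_ratio_change(physical: float, allocated: float,
                                  new_ratio: float) -> Tuple[float, float]:
    """Return (dashboard percent allocated, remaining headroom) under new_ratio."""
    capacity = physical * new_ratio
    percent = 100.0 * allocated / capacity
    # The change is accepted either way; headroom of 0 means no new VMs fit.
    headroom = max(0.0, capacity - allocated)
    return percent, headroom


# 100 GB of physical RAM, factor reduced to 1X.
over = allocation_after_ratio_change(physical=100.0, allocated=150.0, new_ratio=1.0)
within = allocation_after_ratio_change(physical=100.0, allocated=80.0, new_ratio=1.0)
```

With 1.5X already allocated, the sketch reports 150% and no headroom, so nothing further can be deployed. With 0.8X allocated it reports 80% and 20 GB of headroom, so allocation continues as before.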

Task Breakup

Discussions with community: 3 days.
Updating functional spec: 1 day.
Coding: 4 days.
Testing: 2 days.
