Introduction

Auto Scaling allows you to scale up or scale down back-end services or application virtual machines (guest VMs) based on conditions you define, and thereby ensures optimum use of virtual resources. Conditions for triggering a scale up or scale down action can vary from a simple use case, such as monitoring the CPU usage of a server, to a complex use case, such as monitoring a combination of a server's responsiveness and its CPU usage. Load balancers like NetScaler are well placed to monitor all aspects of a server’s health and work in unison with cloud orchestration products like CloudStack to initiate scale up or scale down actions. This feature is an extension to ELB support and is currently supported only in NetScaler 10.0.

Purpose

This is the functional specification for the LB rule based AutoScale feature of CloudStack.

Document History

S.No.	Version	Date	Author	Remarks
1	1	13 June 2012	Deepak	First draft of FS
2	1.1	02 July 2012	Ram Ganesh	Second draft. Revised various sections
3	1.2	10 July 2012	Vijay Venkat	Updated with list and delete commands and added account info in create commands
4	1.3	18 July 2012	Vijay Venkat	Introduced defaults and update APIs
5	1.4	8 Aug 2012	Vijay Venkat	Added recommended steps and more text to update commands

Glossary

Counter

Performance counters used for monitoring the health of the back-end services (guest VMs) that are load balanced by load balancers such as NetScaler. These can be either SNMP-based counters or counters supported by the load balancer. CloudStack could ship with a set of built-in popular counters.

Condition

Conditions express criteria for triggering an autoscale action. They use the counters defined above.

AutoScale Policy

An AutoScale Policy defines when to take an autoscale action by combining the conditions defined above with an autoscale action such as scaleup or scaledown.

AutoScale Vm Profile

An AutoScale Vm Profile is a container for the settings used while taking a scale up or scale down action, such as which template id and which service offering to use.

AutoScale Vm Group

An AutoScale Vm Group associates scaleup and scaledown policies with a load balancing rule. It also defines the size of the autoscaled VM group.

Feature Specifications

Auto Scaling allows you to scale up or scale down back-end services or application virtual machines (guest VMs) based on various conditions you define and thereby ensure optimum use of virtual resources.

Use cases

Add a NetScaler load balancer device to a zone configured in either Basic or Advanced mode.
Configure an Elastic Load Balancer rule.
Configure the AutoScale feature on the load balancer rule using either the AutoScale APIs or the CloudStack UI.

Architecture and Design description

Collector/Monitor: This module is responsible for gathering metrics from the guest VMs. A timer wakes up the collector/monitor periodically to collect metrics. The period of the timer is configurable. The monitoring framework in NetScaler is rich in protocol support, ranging from L3 to L7, with deep packet inspection capabilities.

Aggregator: This module aggregates the collected metric values using algorithms such as average, minimum, and maximum. In NetScaler this is part of the monitoring infrastructure.

Trigger/Alarm Generator: Timer-based triggers monitor the output of the aggregators, evaluate the autoscale conditions, and generate a trigger/alarm if a condition evaluates to true. In NetScaler this is part of the policy-based infrastructure responsible for expression evaluation and actions.

Trigger/Alarm Handler: The trigger/alarm handler acts on the trigger/alarm and initiates a scale up or scale down action, working in unison with cloud orchestration products such as CloudStack. It is a web service residing in the NetScaler management plane and acts as the glue between the CloudStack management server and the NetScaler data plane.

In the current implementation all four of the above modules reside inside the NetScaler system. The interaction between the Trigger Generator and the Handler is based on a standard JSON payload over HTTP.
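
The aggregate-then-trigger flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not NetScaler's implementation: the function names, the payload fields (`vmgroupid`, `action`), the fixed "greater than" comparison, and the sample values are all assumptions made for the example.

```python
import json
from statistics import mean

# Aggregation algorithms named in the text: average, minimum, maximum.
AGGREGATORS = {"average": mean, "minimum": min, "maximum": max}


def aggregate(samples, algorithm="average"):
    """Aggregator: reduce the metric values gathered by the collector."""
    return AGGREGATORS[algorithm](samples)


def generate_trigger(aggregated_value, threshold, action, vmgroup_id):
    """Trigger generator: return a JSON payload if the value exceeds the threshold.

    The payload layout is hypothetical; the text only says it is JSON over HTTP.
    """
    if aggregated_value > threshold:
        return json.dumps({"vmgroupid": vmgroup_id, "action": action})
    return None


# Hypothetical CPU usage samples (percent) collected from three guest VMs.
cpu_samples = [72.0, 88.0, 95.0]
payload = generate_trigger(aggregate(cpu_samples), 80.0, "scaleup", "group-1")
```

In this sketch the handler would receive `payload` and translate it into a CloudStack provisioning call; that step is omitted here.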

The following APIs are proposed for this feature. For every create API defined, corresponding delete and list commands will also be provided.

createCounter

Available to ROOT admin only

Request parameters

source: defines the source of the counter - SNMP | NetScaler
name: CloudStack defined name for the counter
value: the OID if the source is SNMP, or the counter name exposed by the load balancer if the source is NetScaler

New tables

counter(id,uuid,source,name,value)
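
As an illustration of the request parameters above, the following sketch builds the query string for a createCounter call. The helper name and the sample counter name and OID are assumptions for the example, and the management server URL and request authentication are deliberately left out.

```python
from urllib.parse import urlencode

ALLOWED_SOURCES = ("SNMP", "NetScaler")


def build_create_counter_query(source, name, value):
    """Return an unsigned query string for a createCounter call."""
    if source not in ALLOWED_SOURCES:
        raise ValueError("source must be one of %s" % (ALLOWED_SOURCES,))
    params = {
        "command": "createCounter",
        "source": source,
        "name": name,
        "value": value,
    }
    return urlencode(params)


# Hypothetical SNMP counter; the name and OID are sample values only.
query = build_create_counter_query("SNMP", "cpu_idle", "1.3.6.1.4.1.2021.11.11.0")
```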

deleteCounter

Available to ROOT admin only

Request parameters

id: counter id

listCounters

Available to all users

Request parameters

id: counter id
name: query based on name
source: query based on source
keyword: applied to search in name

createCondition

Request parameters

counterid: counter id
relationaloperator: relational operator - GT | GE | LT | LE | EQ
threshold: threshold value for the monitored counter
account
domainid

New tables

conditions(id,uuid,counter_id,threshold,relational_operator,account_id,domain_id)
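
A condition is essentially the comparison `counter value <relationaloperator> threshold`. A minimal sketch of that evaluation follows; the function name and the mapping table are assumptions for illustration, and the actual evaluation happens on the load balancer.

```python
import operator

# Map the relationaloperator values from the spec to Python comparisons.
RELATIONAL_OPERATORS = {
    "GT": operator.gt,
    "GE": operator.ge,
    "LT": operator.lt,
    "LE": operator.le,
    "EQ": operator.eq,
}


def condition_holds(counter_value, relational_operator, threshold):
    """Return True if `counter_value <op> threshold` is satisfied."""
    compare = RELATIONAL_OPERATORS[relational_operator]
    return compare(counter_value, threshold)
```

For example, `condition_holds(85, "GT", 80)` is true, while `condition_holds(80, "GT", 80)` is false because GT is a strict comparison.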

deleteCondition

Request parameters

id: condition id

listConditions

Request parameters

id: condition id
policyid: list the conditions of a policy
counterid: query based on the counter participating in the condition

createAutoScaleVmProfile

Request parameters

zoneid: zone id where VMs will be deployed
templateid: CS template id
serviceofferingid: service offering to be used while deploying a VM as part of a scale-up autoscale action (that is, during deployVirtualMachine)
otherdeployparams: any other VM deploy parameters to be used while deploying a VM as part of a scale-up autoscale action. Multiple parameters can be specified using the URL parameter convention
snmpcommunity: SNMP community to be used while gathering counter values using SNMP. Default: "public"
snmpport: SNMP agent port. Default: 161
destroyvmgraceperiod: time, in seconds, to wait before a VM is actually destroyed as part of a scale-down autoscale action. Default: 120 seconds
autoscaleuserid: CS username to be used while issuing scale up and scale down actions. This parameter also determines the account/domain for which the profile is created. An ApiKey and SharedKey should be generated for this user; if they are not, the call should error out. Default: caller

New tables

autoscale_vmprofiles(id,uuid,zone_id,domain_id,account_id,service_offering_id,template_id,other_deploy_params,snmp_community,snmp_port,destroy_vm_grace_period,autoscale_user_id)
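
The defaults and the otherdeployparams convention can be illustrated with a small sketch. The class and method names are hypothetical, and it assumes that "URL parameter convention" means `key=value` pairs joined by `&`.

```python
from dataclasses import dataclass
from urllib.parse import parse_qsl


@dataclass
class AutoScaleVmProfile:
    zoneid: str
    templateid: str
    serviceofferingid: str
    otherdeployparams: str = ""
    snmpcommunity: str = "public"      # default from the spec
    snmpport: int = 161                # default from the spec
    destroyvmgraceperiod: int = 120    # seconds, default from the spec

    def deploy_params(self):
        """Merge the fixed ids with the parsed otherdeployparams."""
        params = {
            "zoneid": self.zoneid,
            "templateid": self.templateid,
            "serviceofferingid": self.serviceofferingid,
        }
        # Assumed format: "key1=value1&key2=value2".
        params.update(parse_qsl(self.otherdeployparams))
        return params


# Hypothetical ids and extra deploy parameters.
profile = AutoScaleVmProfile(
    "zone-1", "tmpl-1", "so-1", otherdeployparams="keypair=web&networkids=net-1"
)
```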

updateAutoScaleVmProfile

Request parameters

id: autoscale profile id
templateid: CS template id
snmpcommunity: SNMP community to be used while gathering counter values using SNMP. Default: "public"
snmpport: SNMP agent port. Default: 161
destroyvmgraceperiod: time, in seconds, to wait before a VM is actually destroyed as part of a scale-down autoscale action. Default: 120 seconds
autoscaleuserid: CS username to be used while issuing scale up and scale down actions. This parameter also determines the account/domain for which the profile is created. Default: caller

NOTE: An autoscale vm profile can be updated only if all the VM Groups it is associated with are in the disabled state. disableAutoScaleVmGroup has to be called on all Vm Groups associated with this profile before issuing this command.

deleteAutoScaleVmProfile

Request parameters

id: autoscale profile id

listAutoScaleVmProfiles

Request parameters

id: autoscale profile id
templateid: template id
otherdeployparams

createAutoScalePolicy

Request parameters

action: actions - scaleup | scaledown
conditionids: array of conditions which need to be evaluated before an autoscale action is triggered
duration: duration, in seconds, for which the conditions should hold true before an autoscale action is invoked
quiettime: time, in seconds. This is the cool-down period after an action is initiated. It allows the fleet to reach a stable state before any further action can take place. Default: 300 seconds

New tables

autoscale_policies(id,uuid,zone_id,domain_id,account_id,duration,quiet_time,action)
autoscale_policy_condition_map(id,policy_id,condition_id)
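
The interplay of duration and quiettime can be sketched as a small state tracker. This is one possible reading, not the specified behaviour: the class name is hypothetical, and the sketch assumes that the duration window restarts after each action and that conditions continue to be evaluated during the quiet time.

```python
class PolicyEvaluator:
    """Track one policy's duration and quiet time (all times in seconds)."""

    def __init__(self, duration, quiettime=300):
        self.duration = duration
        self.quiettime = quiettime
        self.true_since = None      # when the conditions first became true
        self.last_action = None     # when the last action was initiated

    def should_act(self, conditions_true, now):
        """Return True when an autoscale action should be initiated at `now`."""
        if not conditions_true:
            self.true_since = None
            return False
        if self.true_since is None:
            self.true_since = now
        in_quiet_time = (
            self.last_action is not None
            and now - self.last_action < self.quiettime
        )
        if now - self.true_since >= self.duration and not in_quiet_time:
            self.last_action = now
            self.true_since = None  # assumed: the window restarts after an action
            return True
        return False
```

With `duration=60` and `quiettime=300`, conditions that stay true from t=0 first trigger an action at t=60, and no further action is possible before t=360.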

updateAutoScalePolicy

Request parameters

id: autoscale policy id
*conditionids: one or more condition ids which need to be evaluated before an autoscale action is triggered
duration: duration, in seconds, for which the conditions should hold true before an autoscale action is invoked
quiettime: time, in seconds. This is the cool-down period after an action is initiated. It allows the fleet to reach a stable state before any further action can take place.

NOTE: An autoscale policy can be updated only if all the VM Groups it is associated with are in the disabled state. disableAutoScaleVmGroup has to be called on all Vm Groups associated with this policy before issuing this command.
* The conditionids are always reset; incremental addition or deletion is not possible. To add one more condition to the list, pass the existing ids together with the new one. To delete one, remove it from the existing list and pass the remaining ids.
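
Because conditionids is always replaced as a whole, a caller has to compute the full list before each update. A minimal sketch, with hypothetical helper names and with comma-separated serialization of the list assumed:

```python
def with_condition_added(existing_ids, new_id):
    """Full conditionids list to pass when adding one condition."""
    if new_id in existing_ids:
        return list(existing_ids)
    return list(existing_ids) + [new_id]


def with_condition_removed(existing_ids, removed_id):
    """Full conditionids list to pass when removing one condition."""
    return [cid for cid in existing_ids if cid != removed_id]


def to_conditionids_param(ids):
    """Serialize the list for the request (comma-separated form is assumed)."""
    return ",".join(str(cid) for cid in ids)


# Hypothetical condition ids currently attached to a policy.
current = [11, 12]
added = to_conditionids_param(with_condition_added(current, 13))
removed = to_conditionids_param(with_condition_removed(current, 11))
```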

deleteAutoScalePolicy

Request parameters

id: autoscale policy id

listAutoScalePolicies

Request parameters

id: autoscale policy id
action: scale up or scale down.
vmgroupid: the autoscale policies of a vmgroup.
conditionid: any of the conditions in a policy. (all policies using a condition will be returned)

createAutoScaleVmGroup

Request parameters

lbruleid: Loadbalancer rule which is to be enabled for autoscale.
minmembers: minimum number of VM instances (on creation of the vm group, the fleet will grow until it reaches this minimum number of vm instances)
maxmembers: maximum number of VM instances (this is the maximum number of vm instances that the fleet can grow to. Once it reaches this limit it won't scale up even if the policies are hit)
vmprofileid: AutoScale Vm Profile id
interval: frequency, in seconds, at which the conditions need to be evaluated. Default: 30 seconds
scaleuppolicyids: scale up policy ids
scaledownpolicyids: scale down policy ids

New tables

autoscale_vmgroups(id,uuid,domain_id,account_id,lbrule_id,min_members,max_members,member_port,interval,profile_id,state)
autoscale_vmgroup_policy_map(id,policy_id,vmgroup_id)
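
The effect of minmembers and maxmembers on a scale action can be sketched as a simple clamp. The function name is hypothetical, and the sketch assumes each action changes the fleet by one VM, which the specification does not state.

```python
def next_member_count(current, action, minmembers, maxmembers):
    """Return the member count after one scaleup/scaledown action."""
    if action == "scaleup":
        target = current + 1   # assumed step size of one VM
    elif action == "scaledown":
        target = current - 1
    else:
        raise ValueError("action must be 'scaleup' or 'scaledown'")
    # Never go below minmembers or above maxmembers.
    return max(minmembers, min(maxmembers, target))
```

For a group with `minmembers=2` and `maxmembers=5`, a scaleup at 5 members leaves the count at 5, and a scaledown at 2 members leaves it at 2.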

updateAutoScaleVmGroup

Request parameters

id: autoscale vm group id
minMembers: minimum number of VM instances
maxMembers: maximum number of VM instances
interval: frequency, in seconds, at which the conditions need to be evaluated. Default: 30 seconds
scaleuppolicyids: scale up policy ids
scaledownpolicyids: scale down policy ids

NOTE: An autoscale vm group can be updated only if it is in the disabled state. disableAutoScaleVmGroup has to be called first before issuing this command.

deleteAutoScaleVmGroup

Request parameters

id: autoscale vm group id

listAutoScaleVmGroups

Request parameters

id: autoscale vm group id
lbruleid: to query vmgroups based on a load balancer rule
vmprofileid: to query vmgroups based on a vm profile
policyid: to query vmgroups based on a policy
zoneid: to query vmgroups based on a zone

disableAutoScaleVmGroup

Request parameters

id: autoscale vm group id

This API is used to disable autoscaling on an AutoScale Vm Group. By default, autoscaling is enabled on an AutoScale Vm Group on creation. A user could disable autoscaling on a Vm Group for two reasons:
1. To update the autoscale config. Updating the autoscale config here means updating policy parameters, adding or deleting conditions in a scaleup or scaledown policy, updating autoscale vm profile parameters, or updating the parameters of the vm group itself.
2. To perform a maintenance operation on the fleet. This lets the user bring members of the fleet down or up while autoscaling is disabled, meaning that no scale up or scale down operation will happen during that time.

enableAutoScaleVmGroup

Request parameters

id: autoscale vm group id

This command is used to enable autoscaling on an AutoScale Vm Group. By default, autoscaling is enabled on an AutoScale Vm Group during creation. A user could have disabled autoscaling on a Vm Group for several reasons; this API is called to re-enable autoscaling on the group. Once this is done, the fleet will scale up and down depending on the policies configured.
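
The disable, update, enable cycle described above can be sketched as a guard on the group's state. The class is purely illustrative; the state names "enabled" and "disabled" are assumptions, as is the choice of exception.

```python
class AutoScaleVmGroup:
    """Toy model of a VM group that only accepts updates while disabled."""

    def __init__(self, minmembers, maxmembers):
        self.minmembers = minmembers
        self.maxmembers = maxmembers
        self.state = "enabled"  # autoscaling is enabled on creation

    def disable(self):
        self.state = "disabled"

    def enable(self):
        self.state = "enabled"

    def update(self, minmembers=None, maxmembers=None):
        """Apply new limits; refuse unless the group has been disabled first."""
        if self.state != "disabled":
            raise RuntimeError("disable the VM group before updating it")
        if minmembers is not None:
            self.minmembers = minmembers
        if maxmembers is not None:
            self.maxmembers = maxmembers


group = AutoScaleVmGroup(minmembers=2, maxmembers=5)
```

A caller would invoke `group.disable()`, then `group.update(maxmembers=10)`, then `group.enable()`; calling `update` on an enabled group raises an error in this sketch.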

Event Types

COUNTER.CREATE
COUNTER.DELETE
CONDITION.CREATE
CONDITION.DELETE
AUTOSCALEPOLICY.CREATE
AUTOSCALEPOLICY.DELETE
AUTOSCALEPOLICY.UPDATE
AUTOSCALEVMPROFILE.CREATE
AUTOSCALEVMPROFILE.UPDATE
AUTOSCALEVMPROFILE.DELETE
AUTOSCALEVMGROUP.CREATE
AUTOSCALEVMGROUP.DELETE
AUTOSCALEVMGROUP.UPDATE
AUTOSCALEVMGROUP.ENABLE
AUTOSCALEVMGROUP.DISABLE

Operations not supported

No implicit delete is allowed. For example, if a counter is used in a condition, the user cannot delete that counter; the same restriction applies to all entities.
Only autoscaled VMs (VMs provisioned as a result of a load balancer device such as NetScaler initiating a provision call) can be assigned to an AutoScale-enabled LB rule. This means that autoscaled VMs and explicitly bound VMs cannot co-exist for a load balancer.
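
The "no implicit delete" rule amounts to a reference check before deletion. A minimal sketch for the counter/condition case, using in-memory dictionaries as stand-ins for the counter and conditions tables (the function name and data layout are assumptions):

```python
def delete_counter(counter_id, counters, conditions):
    """Delete a counter unless a condition still references it."""
    in_use = [
        cond_id
        for cond_id, cond in conditions.items()
        if cond["counter_id"] == counter_id
    ]
    if in_use:
        raise ValueError(
            "counter %s is still used by conditions %s" % (counter_id, in_use)
        )
    del counters[counter_id]


# Hypothetical rows: counter 1 is referenced by condition 10, counter 2 is unused.
counters = {1: {"name": "cpu"}, 2: {"name": "memory"}}
conditions = {10: {"counter_id": 1, "threshold": 80}}
```

Here `delete_counter(2, counters, conditions)` succeeds, while `delete_counter(1, counters, conditions)` is rejected until condition 10 is deleted.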

Quality Risks

Overload scenarios: Specifying a large value for the maxMembers parameter can result in a large number of VM instances being provisioned, all the more easily if the autoscale conditions are set at a low watermark.
Corner cases and boundary conditions: If NetScaler reboots or shuts down after issuing a provision call to CloudStack and before the call completes, the provisioned VM will not be part of an LB rule even though the intent was to assign it to one. Workaround: AutoScale-provisioned VMs will be named based on the LB rule name/id, so the VMs can be reconciled with their LB rule at any point in time.
Negative usage scenarios: An external* destroyVM CloudStack API call on an autoscaled VM will leave the NetScaler LB configuration in an inconsistent state. Specific configuration changes need to be made in NetScaler to recover from this inconsistent state.

external*: CloudStack API calls made outside the context of AutoScale

Logging and Debugging

All logs will go into the CloudStack management server logs.
The standard NetScaler log, ns.log, will also be updated with all autoscale action details, error conditions, etc.

Run-time environment requirements

To monitor SNMP-based counters, ensure that SNMP agents are installed in the guest VMs and that SNMP operations such as GET work with the configured SNMP community and port when tested with a standard SNMP manager.
If the ApiKey and SharedSecretKey are regenerated for autoscaleuserid, the user has to call disable and then enable on the VmGroups that user participates in.
Before setting up the management server (MS) for AutoScale, the global setting "EndpointeUrl" has to be set to the MS's API URL. In a multiple-server deployment, the IP of the load balancer in front of the management servers has to be used here. The load balancer providing AutoScale support must also have access to this IP.
If EndpointeUrl is updated, all the AutoScale Vm Groups in the system should be disabled and enabled again to reflect the change.
The following sequence of steps is recommended before you create the AutoScale config.
1. Prepare a template.
  - This is a very important step.
  - When a VM is deployed using this template and comes up, the app should be in a running state.
  - If this is not the case, AutoScale will consider the VM useless and will keep provisioning VMs unconditionally until the number of working app VMs reaches VmGroup.minMembers.
2. Deploy the template. Make sure the app comes up on the first boot and is ready to take traffic, and make a note of the time required to deploy the template; it can be used to choose the quiet time in the autoscale policy.
3. Repeat steps 1 and 2 until the template is right. Even if you are confident in your template, it is always recommended to deploy it once, because the first deploy usually takes time and you don't want that happening during autoscale.
4. Create a load balancer and assign the VM that was deployed in step 2 to it.
5. Delete the VM created in step 2 and the load balancer created in step 4.
6. Proceed with the AutoScale config.
7. Check the events page to observe how autoscaling is happening.

Performance considerations

The minMembers and maxMembers parameters of the AutoScale Vm Group should be properly configured to ensure resources are used optimally. Any misconfiguration could lead to wasteful provisioning of resources.
Autoscale conditions should be configured with due consideration to ensure that provision/de-provision calls are not issued too frequently.

Evaluation of possible security attacks against the feature

A DDoS attack on an LB rule in turn loads the guest VMs being load balanced. This could create artificial load and thereby trigger scale-up autoscale actions. Load balancer systems such as NetScaler have built-in capabilities to filter DDoS attacks.
Sophisticated L7 attacks could be used to induce artificial load conditions. Filters in NetScaler can be configured to weed out these misbehaving clients.

Interoperability and compatibility requirements

Load balancer: NetScaler 10.0 release supports this feature.

UI flow:

A basic idea of the UI.

On clicking the configure button, a configuration window pops up, allowing an admin to configure the AutoScale feature.
