Bug Reference

https://issues.apache.org/jira/browse/CLOUDSTACK-657

Branch

master, 4.2.0

Introduction

VMware Distributed Switch (vDS) is an aggregation of per-host virtual switches presented and controlled as a single distributed switch through vCenter Server at the datacenter level. vDS abstracts the configuration of individual virtual switches and enables centralized provisioning, administration, and monitoring.
vDS is an integral component of vCenter, so native vDS support makes sense for wider and larger deployments of CloudStack over vSphere.

Each standard vSwitch represents an independent point of configuration that needs to be managed and monitored. Managing the virtual networks required by instances in the cloud is tedious when those networks have to span a large number of hosts. Using a distributed vSwitch (vDS) simplifies configuration and monitoring.

Being standalone implementations, standard vSwitches provide no support for virtual machine mobility, so a separate component is needed to ensure that the network configurations on the source and destination virtual switches are consistent and allow the VM to operate without breaking connectivity or network policies. In particular, when a VM migrates across hosts, the peer switches must be kept in sync. With a distributed vSwitch, the vCenter Server updates the vSwitch modules on the hosts in the cluster accordingly during vMotion.

Purpose

This is the functional specification of the feature "CloudStack integration with VMware dvSwitch", tracked under Jira ID CLOUDSTACK-657.

References

Document History

Author                  Description        Date

Sateesh Chodapuneedi    Initial Revision   12/31/2012

Glossary

  • dvSwitch / vDS - VMware vNetwork Distributed Virtual Switch.
  • vSwitch - VMware vNetwork Standard Virtual Switch.
  • dvPort - Distributed Virtual Port (member of dvPortGroup).
  • dvPortGroup - Distributed Virtual Port Group

Feature Specifications

This feature enables CloudStack to configure and manage virtual networks over VMware dvSwitch instances in the datacenter of a managed cluster.

  1. CloudStack does the following:
    1. Create a dvPortGroup on the designated dvSwitch
    2. Modify a dvPortGroup on the designated dvSwitch
    3. Delete a dvPortGroup on the designated dvSwitch
  2. CloudStack does not do the following:
    1. Create a dvSwitch
    2. Add a host to a dvSwitch
    3. Dynamic migration of virtual adapters of existing VMs across different types of virtual switches in scenarios where multiple types of virtual switches (Cisco's Nexus 1000v, VMware standard vSwitch, and dvSwitch) co-exist. This is left to the administrator.
    4. Configuration of PVLAN
    5. Configuration of dvPort mirroring
    6. Configuration of user-defined network resource pools for Network I/O Control (NIOC)
  3. Quality risks (test guidelines)
    1. Functional
      1. Live migration of a VM
      2. Deployment of a virtual router
      3. Deployment of a VM
    2. Non-functional: performance, scalability, stability, overload scenarios, etc.
      1. A large number of VMs and isolated networks needs to be tested for performance-specific results.
    3. Negative usage scenarios - NA
  4. Audit events
    1. All virtual network orchestration events
    2. VM migration events
  5. Graceful failure and recovery scenarios
  6. Possible fallback or workaround routes if the feature does not work as expected, where such workarounds exist:
    1. If a guest network does not work correctly, or if CloudStack fails to create a guest network required by a VM, the administrator can (re)configure the dvPortGroup corresponding to that network in CloudStack.
    2. If guest network instantiation fails due to lack of network resources, the administrator is expected to ensure that more resources/capacity are employed. E.g., if the dvPorts in a dvPortGroup are exhausted, no more VMs can join that network; the administrator can then reconfigure the dvPortGroup to increase the number of dvPorts and accommodate more VMs in that guest network. This applies to vSphere 4.1 only; from vSphere 5.0 onwards, auto-expand support provisions additional dvPorts automatically as required.
    3. If many dvPortGroups are created and cleanup does not happen as expected, the administrator needs to check for unused dvPortGroups on the dvSwitch and clean them up using the vCenter UI.
  7. If the feature depends on other run-time environment requirements, a sanity checklist for support people:
    1. Basic sanity testing over the dvSwitch, e.g. a ping from one VM to another VM in the same network. Make sure VLAN configuration is done in that network and verify whether isolation is achieved.
    2. Verify that any configured traffic shaping policy is applied and working.
  8. Configuration characteristics:
    1. Configuration parameters or files introduced/changed
      1. New configuration parameter "vmware.use.dvswitch" of type Boolean. Possible values are "true" and "false"; the default is "false". This parameter acts as the umbrella parameter for all types of distributed virtual switches. If it is true, the VMware distributed virtual switch is used in the CloudStack deployment unless the vmware.use.nexus.vswitch parameter is also true. If no name is specified, the dvSwitch name defaults to dvSwitch0 for VMware dvSwitch, and the ethernet port profile name defaults to epp0 for Nexus 1000v dvSwitch.
      2. New configuration parameter "vmware.ports.per.dvportgroup" of type integer. The default value is 256. Each dvPortGroup created by CloudStack has this number of dvPorts.
  9. Branding parameters or files introduced/changed - NA
  10. Highlight parameters for performance tweaking - NA
  11. Deployment requirements (fresh install vs. upgrade), if any
    1. The VMware dvSwitch must already be created/configured in the vCenter datacenter.
    2. All host/cluster resources should be added to the dvSwitch before the cluster is added to a CloudStack pod.
  12. Interoperability and compatibility requirements:
    1. Hypervisors - VMware vSphere 4.1 or later
  13. List localization and internationalization specifications 
    1. UI changes in "Add Cluster" wizard. See the section "UI Flow".
  14. Explain the impact and possible upgrade/migration solution introduced by the feature 
  15. Explain performance & scalability implications when feature is used from small scale to large scale
    1. On vSphere 4.1, a dvPortGroup needs to be created with a specific number of dvPorts. In large-scale deployments, optimal use of dvPorts may not be possible due to this pre-allocation. On vSphere 5.0, the autoexpand feature automatically increments the number of dvPorts.
    2. Network switches (including the vSwitch in ESXi host) keep a distinct forwarding table for each VLAN; this could lead to an increased overhead in packet forwarding when a considerable number of isolated networks, each one with a significant number of virtual machines, is configured in a data center.
  16. Marketing specifications
    1. Supporting VMware dvSwitch gives CloudStack better monitoring and simpler administration of the virtual network infrastructure in the cloud; configuring and managing many standard vSwitches across large deployments is tedious.
    2. Seamless network vMotion support.
    3. Better traffic shaping and more efficient network bandwidth utilization.
  17. Levels or types of user communities of this feature (e.g. admin, user, etc.)
    1. admin - Administrators would be target audience for this feature as this is at infrastructure level.
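The interaction of the two global parameters, and the default switch names mentioned above, can be sketched as follows. This is an illustrative sketch only; the function name is hypothetical, and the real resolution logic lives in CloudStack's VMware resource layer.

```python
def resolve_default_vswitch(use_dvswitch: bool, use_nexus_vswitch: bool):
    """Sketch of how the global parameters select the zone-level default
    virtual switch type and name (function name is illustrative only)."""
    if not use_dvswitch:
        # vmware.use.dvswitch=false: the standard vSwitch is the only option
        return ("vmwaresvs", "vSwitch0")
    if use_nexus_vswitch:
        # umbrella and nexus flags both true: Cisco Nexus 1000v port profile
        return ("nexusdvs", "epp0")
    # umbrella flag true, nexus flag false: VMware dvSwitch
    return ("vmwaredvs", "dvSwitch0")
```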

Use cases

  1. There is a datacenter running vSphere clusters that use dvSwitches for virtual networking. Migrate those servers into the CloudStack cloud. CloudStack should be able to manage virtual networks over dvSwitches seamlessly.
  2. Virtual network orchestration during VM lifecycle operations in cloud should use the dvSwitch designated for specified traffic. This includes configuration/re-configuration of distributed virtual port groups associated with the VM over the designated dvSwitch.
  3. Live migration of VM within cluster. The traffic shaping policies and port statistics should be intact even after migration to another host within that cluster.
  4. Upgrade/Downgrade Network offering of an account. In this case CloudStack should reconfigure all networks (associated with that network offering) of that account.

Supportability characteristics

Logging

All virtual network orchestration activities involving the dvSwitch are logged at different log levels in the management server log file.

  • INFO (all the successful operations)
  • ERROR (all exceptions/failures)
  • DEBUG (all other checks)

Debugging/Monitoring

In addition to the management server logs, administrators can consult the following:

  • vCenter logs for analysis.
  • Warnings and Alerts associated with specific cluster in vCenter
  • dvPort status (to see if port is configured correctly and is active or not) displayed in network configuration screen of vSphere native/web client UI.

Architecture and Design description

CloudStack reads physical network traffic labels to determine the designated virtual switch to use for virtual network orchestration. Virtual switch type and name are also added to the existing list of custom cluster properties to enable a cluster-level override. A cluster-level option takes precedence over the virtual switch type or name specified in the zone-level physical traffic label.

  1. Highlight architectural patterns being used (queues, async/sync, state machines, etc) - N/A
  2. Traffic label format is ["Name of vSwitch/dvSwitch/EthernetPortProfile"[,"VLAN ID"[,"vSwitch Type"]]]
    1. Description
    2. All three fields are optional.
    3. Default values for the three fields are as follows:
      1. 1st field - The name of the virtual/distributed virtual switch at vCenter. The default value depends on the type of virtual switch:
        1. vSwitch0 if the type is "VMware vNetwork Standard virtual switch"
        2. dvSwitch0 if the type is "VMware vNetwork distributed virtual switch"
        3. epp0 if the type is "Cisco Nexus 1000v distributed virtual switch"
      2. 2nd field - The VLAN ID to be used for this traffic wherever applicable. Currently this field is used only for public traffic; for guest traffic it is ignored and may be left empty.
        1. By default an empty string is assumed, which translates to an untagged VLAN for that traffic type.
      3. 3rd field - The type of virtual switch, specified as a string. Possible valid values are vmwaredvs, vmwaresvs, and nexusdvs, which translate as follows:
        1. "vmwaresvs" represents "VMware vNetwork Standard virtual switch"
        2. "vmwaredvs" represents "VMware vNetwork distributed virtual switch"
        3. "nexusdvs" represents "Cisco Nexus 1000v distributed virtual switch"
        4. If nothing is specified (left empty), the zone-level default virtual switch (based on the values of the global parameters) is assumed. The global configuration parameters are:
          1. vmware.use.dvswitch - This must be true to enable any kind of distributed virtual switch (VMware DVS / Cisco Nexus 1000v DVS) in a CloudStack deployment. If it is false, the default virtual switch in that CloudStack deployment is the standard virtual switch only.
          2. vmware.use.nexus.vswitch - This parameter is ignored unless "vmware.use.dvswitch" is true. Set this to "true" to enable the Cisco Nexus 1000v distributed virtual switch in a CloudStack deployment.
    4. Per the format above, a few possible values for networkLabel:
      1. "" (empty string)
      2. dvSwitch0
      3. dvSwitch0,200
      4. dvSwitch0,300,vmwaredvs
      5. myEthernetPortProfile,,nexusdvs
      6. dvSwitch0,,vmwaredvs
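The label grammar above can be illustrated with a small parser. This is a sketch only; CloudStack's actual parsing is implemented in Java in the VMware resource code, and missing fields are resolved to the defaults described earlier.

```python
def parse_traffic_label(label: str):
    """Split a traffic label of the form
    ["switch name"[,"VLAN ID"[,"switch type"]]] into its three optional
    fields, returning None for any field that is absent or empty."""
    parts = (label or "").split(",")
    name = parts[0] if len(parts) > 0 and parts[0] else None
    vlan = parts[1] if len(parts) > 1 and parts[1] else None
    vswitch_type = parts[2] if len(parts) > 2 and parts[2] else None
    return name, vlan, vswitch_type
```

For example, "myEthernetPortProfile,,nexusdvs" yields a switch name and type with no VLAN, matching the sample labels listed above.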
  1. Main algorithms used
    1. Isolated network configuration using VLANs over the dvSwitch. CloudStack manages a dvPortGroup configured with a designated VLAN ID.
    2. The scenarios to be covered are:
      1. Adding a host/compute resource to a pod/cluster. Create the necessary CloudStack-managed virtual networks on the designated dvSwitch.
      2. Live migration
      3. All VM life cycle operations that might need instantiation of a guest network, etc.
      4. Network operations such as network creation, network destruction, etc.
  2. Port binding is static.
  3. Performance implications: improvements or risks introduced to capacity, response time, resource usage, and other relevant KPIs.
    1. On vSphere 5.0, "AutoExpand" support is used when configuring dvPorts per dvPortGroup. This ensures that unnecessary dvPorts are not pre-allocated.
  4. Packages that encapsulate the code changes:
    1. core package (VmwareManager, VmwareResource)
    2. server package
    3. vmware-base package (mo and util packages)
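The create/modify decision for a dvPortGroup during orchestration can be sketched as below. The inventory representation and the dvPortGroup name are hypothetical and for illustration only; the real code performs these operations through the vSphere SDK via the vmware-base mo classes.

```python
def ensure_dvportgroup(inventory: dict, name: str, vlan: int, num_ports: int) -> str:
    """Return which operation the orchestrator would perform for the
    dvPortGroup backing a network: "create", "modify", or "noop".
    `inventory` maps dvPortGroup names to their current settings."""
    desired = {"vlan": vlan, "ports": num_ports}
    current = inventory.get(name)
    if current is None:
        return "create"   # no dvPortGroup exists yet for this network
    if current != desired:
        return "modify"   # e.g. dvPort count raised, or VLAN reassigned
    return "noop"         # already consistent; reuse as-is
```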

Web Services APIs

Changes to existing web services APIs - AddClusterCmd.
The following optional parameters are added:

  1. 'guestvswitchtype' which can have values 'vmwaresvs' or 'vmwaredvs' or 'nexusdvs'.
  2. 'publicvswitchtype' which can have values 'vmwaresvs' or 'vmwaredvs' or 'nexusdvs'.
  3. 'guestvswitchname' - Name of vSwitch/dvSwitch to be used for guest traffic.
  4. 'publicvswitchname' - Name of vSwitch/dvSwitch to be used for public traffic.

New APIs introduced - N/A
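A minimal illustration of an AddClusterCmd request carrying the new optional parameters. All values below are hypothetical, and API authentication/signing (apiKey and signature) is omitted; only the construction of the query string is shown.

```python
from urllib.parse import urlencode

# Hypothetical parameter values; the new optional parameters are the
# four *vswitchtype / *vswitchname entries described above.
params = {
    "command": "addCluster",
    "hypervisor": "VMware",
    "guestvswitchtype": "vmwaredvs",
    "guestvswitchname": "dvSwitch0",
    "publicvswitchtype": "vmwaredvs",
    "publicvswitchname": "dvSwitch0",
}
query = urlencode(params)  # appended to the usual CloudStack API endpoint URL
```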

UI flow

Add cluster wizard

If the hypervisor is VMware and the global configuration parameter "vmware.use.dvswitch" is set to true, display the following list boxes and text boxes. Their display should be constrained by check boxes: unless the user activates the check box corresponding to a particular traffic type (guest or public), the associated list box and text box should be disabled.

  1. Check box
    1. Label - Choose to override zone wide traffic label for guest traffic for this cluster.
    2. Function - This check box controls below list box and text box.
  2. List Box
    1. Label - "Guest Virtual Switch Type"
    2. Default option - if "vmware.use.nexus.vswitch" is true, use "Cisco Nexus 1000v Distributed Virtual Switch"; otherwise use "VMware vNetwork Distributed Virtual Switch"
    3. List Box options are,
      1. "VMware vNetwork Standard Virtual Switch"
        Action to perform if this option is selected:-
        Add a parameter to parameter list of AddClusterCmd API call.
        Parameter name: "guestvswitchtype"
        Parameter value: "vmwaresvs"
      2. "VMware vNetwork Distributed Virtual Switch"
        Action to perform if this option is selected:-
        Add a parameter to parameter list of AddClusterCmd API call.
        Parameter name: "guestvswitchtype"
        Parameter value: "vmwaredvs"
      3. "Cisco Nexus 1000v Distributed Virtual Switch"
        Action to perform if this option is selected:-
        Add a parameter to parameter list of AddClusterCmd API call.
        Parameter name: "guestvswitchtype"
        Parameter value: "nexusdvs"
  3. Text Box
    1. Label - "Name of Virtual Switch for guest traffic"
    2. Action to perform if the text box is not empty:-
      Add a parameter to parameter list of AddClusterCmd API call.
      Parameter name: "guestvswitchname"
      Parameter value: textbox content
  4. Check box
    1. Label - Choose to override zone wide traffic label for public traffic for this cluster.
    2. Function - This check box controls below list box and text box.
  5. List Box
    1. Label - "Public Virtual Switch Type"
    2. Default option - if "vmware.use.nexus.vswitch" is true, use "Cisco Nexus 1000v Distributed Virtual Switch"; otherwise use "VMware vNetwork Distributed Virtual Switch"
    3. List Box options are,
      1. "VMware vNetwork Standard Virtual Switch"
        Action to perform if this option is selected:-
        Add a parameter to parameter list of AddClusterCmd API call.
        Parameter name: "publicvswitchtype"
        Parameter value: "vmwaresvs"
      2. "VMware vNetwork Distributed Virtual Switch"
        Action to perform if this option is selected:-
        Add a parameter to parameter list of AddClusterCmd API call.
        Parameter name: "publicvswitchtype"
        Parameter value: "vmwaredvs"
      3. "Cisco Nexus 1000v Distributed Virtual Switch"
        Action to perform if this option is selected:-
        Add a parameter to parameter list of AddClusterCmd API call.
        Parameter name: "publicvswitchtype"
        Parameter value: "nexusdvs"
  6. Text Box
    1. Label - "Name of Virtual Switch for public traffic"
    2. Action to perform if the text box is not empty:-
      Add a parameter to parameter list of AddClusterCmd API call.
      Parameter name: "publicvswitchname"
      Parameter value: textbox content

Open Issues

  1. What is the level of support for migration scenarios? Does an existing CloudStack deployment over standard vSwitch need to be migrated to dvSwitch?
  2. Management network over virtual switches other than the standard vSwitch. Does CloudStack have to support management/private network traffic over dvSwitch, or just guest and public traffic?

Appendix

Appendix A:

Appendix B:
