Bug Reference

https://issues.apache.org/jira/browse/CLOUDSTACK-1456

Branch

Not branched out yet.

Introduction

Purpose

Implement isolation in an advanced zone, focusing on shared networks. The targets are:
1. No user VM can reach any other user VM.
2. Every user VM can reach the DHCP server and the gateway.

The mechanism we chose to implement this feature is Private VLAN (PVLAN).

References

Document History

Glossary

Feature Specifications

  • The isolated port (I-port) in the private VLAN concept fits our requirement perfectly. Basically, we just need to connect every user VM to an I-port of the switch (vSwitch or Open vSwitch) and every DHCP server to a P-port of the switch; that is enough for both isolation and communication.
  • Open vSwitch (used by XenServer and KVM) doesn't have PVLAN support.
  • So we need extra effort to simulate PVLAN on Open vSwitch (OVS) for Xen and KVM.
    • We would modify the flow table to (concrete rules are given in the OVS section below):
    • 1. Tag all traffic leaving a user VM with the secondary isolated VLAN tag.
    • 2. Allow traffic tagged with the secondary isolated VLAN to reach the DHCP server by rewriting the tag to the primary VLAN tag.
    • 3. Keep the gateway unaware of PVLAN: the switch connected to the gateway translates every secondary VLAN to the primary VLAN when communicating with the gateway.

Assumptions / Pre-requisites

  • A PVLAN-capable switch (refer to http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a0080094830.shtml) is needed to connect to the hosts.
  • Only one switch should connect to the gateway; all other switches need to connect to this switch via trunk ports.
    • This is the ideal situation. Only the Cisco Catalyst 4500 has PVLAN promiscuous trunk mode, which can trunk both normal VLANs and PVLANs to a PVLAN-unaware switch.
    • For other PVLAN-capable Catalyst switches, you need to connect the switch to the upper switch using at least (number of PVLANs + 1) cables to achieve this.

Use cases

  • Once the feature is enabled for a given shared network, the user VMs in that network cannot reach each other, while communication with the DHCP server and the gateway remains unchanged. A quick sanity check is sketched below.
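A minimal sanity-check sketch, assuming two user VMs at the hypothetical addresses 10.1.1.11 and 10.1.1.12 and a gateway at 10.1.1.1 on a PVLAN-enabled shared network:

    # Run inside the user VM at 10.1.1.11; all addresses are hypothetical.
    ping -c 3 10.1.1.12    # another user VM: expect 100% packet loss
    ping -c 3 10.1.1.1     # the gateway: expect normal replies
    dhclient -v eth0       # renewing the DHCP lease should still succeed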

Architecture and Design description

OVS

  • For OVS, the flow table needs the following modifications:
    • 1. For each VM:
      • <a> Tag with the secondary isolated VLAN and resubmit through the flow table again (for DHCP-server-specific handling):
        • priority=50,dl_vlan=0xffff,dl_src=$vm_mac,actions=mod_vlan_vid:$sec_iso_vlan,resubmit:$trunk_port
      • <b> If no other rule in the flow table processes the packet, output it to the trunk port:
        • priority=60,dl_vlan=$sec_iso_vlan,dl_src=$vm_mac,actions=output:$trunk_port
    • 2. For each host that has a DHCP server:
      • <a> Deliver ARP requests for the DHCP server from other hosts:
        • priority=200,arp,dl_vlan=$sec_iso_vlan,nw_dst=$dhcp_ip,actions=strip_vlan,output:$dhcp_port
      • <b> Accept packets from outside (e.g. DNS):
        • priority=150,dl_vlan=$sec_iso_vlan,dl_dst=$dhcp_mac,actions=strip_vlan,output:$dhcp_port
      • <c> Accept DHCP requests from other hosts:
        • priority=100,udp,dl_vlan=$sec_iso_vlan,nw_dst=255.255.255.255,tp_dst=67,actions=strip_vlan,output:$dhcp_port
  • VM migration and host restarts invalidate these rules, so they need to be reprogrammed. A sketch of programming the rules with ovs-ofctl follows.
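As an illustration only, the rules above could be programmed with ovs-ofctl roughly as follows. The bridge name, port numbers, MAC and IP addresses are hypothetical placeholders for values CloudStack discovers at runtime:

    #!/bin/bash
    # Sketch: simulate PVLAN on an OVS bridge by programming the flows above.
    BRIDGE=cloudbr0            # hypothetical bridge name
    SEC_ISO_VLAN=101           # secondary isolated VLAN
    TRUNK_PORT=1               # OVS port number of the trunk NIC
    VM_MAC=02:00:0c:29:aa:01   # MAC of a user VM's vif
    DHCP_PORT=5                # OVS port number of the DHCP server's vif
    DHCP_IP=10.1.1.2
    DHCP_MAC=02:00:0c:29:bb:02

    # 1. Per-VM rules: tag untagged egress traffic and resubmit for
    #    DHCP-specific handling; otherwise send it out of the trunk port.
    ovs-ofctl add-flow $BRIDGE \
      "priority=50,dl_vlan=0xffff,dl_src=$VM_MAC,actions=mod_vlan_vid:$SEC_ISO_VLAN,resubmit:$TRUNK_PORT"
    ovs-ofctl add-flow $BRIDGE \
      "priority=60,dl_vlan=$SEC_ISO_VLAN,dl_src=$VM_MAC,actions=output:$TRUNK_PORT"

    # 2. Per-DHCP-host rules: strip the secondary tag and deliver ARP requests,
    #    unicast packets from outside, and broadcast DHCP requests to the server.
    ovs-ofctl add-flow $BRIDGE \
      "priority=200,arp,dl_vlan=$SEC_ISO_VLAN,nw_dst=$DHCP_IP,actions=strip_vlan,output:$DHCP_PORT"
    ovs-ofctl add-flow $BRIDGE \
      "priority=150,dl_vlan=$SEC_ISO_VLAN,dl_dst=$DHCP_MAC,actions=strip_vlan,output:$DHCP_PORT"
    ovs-ofctl add-flow $BRIDGE \
      "priority=100,udp,dl_vlan=$SEC_ISO_VLAN,nw_dst=255.255.255.255,tp_dst=67,actions=strip_vlan,output:$DHCP_PORT"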

VMWare

VMware offers two solutions that support distributed L2 switching. Both transparently plumb L2 switch ports, associate them with the vNICs of a VM, and maintain near-real-time state information of the network statistics on the vNICs:

  1. VMWare vNetwork Distributed Switch (vDS)
  2. Cisco Nexus 1000v (N1KV)

Both are L2 soft switches that have a management plane and a data plane. In vDS, the management plane is called the vDS, while the data plane is the vSS, or vNetwork Standard Switch; the vSS is a superset of the standard local vSwitch on each ESX host that the vDS manages. In the N1KV, the management plane is called the VSM (Virtual Supervisor Module, the switch supervisor) and the data plane is the VEM (Virtual Ethernet Module, the switch linecards), again a superset of the standard ESX local vSwitch.

In VMware, a network is essentially represented by a network PortGroup; in Cisco terminology, the same is called a PortProfile. As the names indicate, a PortGroup is a group of switch ports that share the same properties, while a PortProfile is the "profile", or set of properties, of a switch port; the same port profile can be applied to multiple switch ports.

Examples of properties are VLAN IDs, ACLs, network throttle rate, PVLAN IDs, type of switchport (trunk/access) and so on.

Thus, essentially, provisioning PVLANs on VMware clusters involves creating port groups/port profiles, associating them with switch ports on the vDS or the VSM, and then associating vNICs on VMs with the appropriate port profile. The prepareNetwork() and createPortProfile() functionality in HypervisorHostHelper will be modified accordingly.
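For illustration, a hypothetical Nexus 1000v port profile for user VMs on an isolated secondary VLAN might look like the following (the profile name and the VLAN IDs 100/101 are made up):

    port-profile type vethernet pvlan-isolated-101
      vmware port-group
      switchport mode private-vlan host
      switchport private-vlan host-association 100 101
      no shutdown
      state enabled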

Switch configuration

  • Though CloudStack doesn't control the switches, they must support Private VLAN for the whole setup to work. This requires certain Cisco Catalyst switches.
    • We would most likely need the Catalyst 4500 series for PVLAN promiscuous trunk support.
  • The topology of switches and router would be (see the configuration sketch after this list):
    • All PVLAN-aware L2 switches are connected to each other, and exactly one of them connects to the router.
    • All ports connected to hosts are configured in trunk mode, allowing the management VLAN, the primary VLAN (public VLAN), and the secondary isolated VLAN.
    • The switch port connected to the router is configured in PVLAN promiscuous trunk mode, which translates the secondary isolated VLAN to the primary VLAN for the router (which has no knowledge of PVLAN).
    • If your Catalyst switch supports PVLAN but not PVLAN promiscuous trunk mode (AFAIK, only the Catalyst 4500 series supports that mode), you need to:
      • 1. Configure one switch port as a trunk for the management network (management VLAN).
      • 2. For each PVLAN pair, connect one port of the Catalyst switch to the upper switch, set that Catalyst port to promiscuous mode for the PVLAN pair, and set the corresponding port on the upper switch to access mode, allowing only traffic of the pair's primary VLAN.
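A sketch of the corresponding Catalyst configuration, assuming primary VLAN 100, secondary isolated VLAN 101, management VLAN 50, a host-facing port Gi1/1, and a router-facing port Gi1/48 (all values hypothetical):

    ! Define the PVLAN pair.
    vlan 100
     private-vlan primary
     private-vlan association 101
    vlan 101
     private-vlan isolated

    ! Host-facing port: plain trunk carrying mgmt, primary, and secondary VLANs.
    interface GigabitEthernet1/1
     switchport mode trunk
     switchport trunk allowed vlan 50,100,101

    ! Router-facing port (Catalyst 4500 only): PVLAN promiscuous trunk that
    ! rewrites secondary VLAN 101 to primary VLAN 100 toward the router.
    interface GigabitEthernet1/48
     switchport mode private-vlan trunk promiscuous
     switchport private-vlan trunk allowed vlan 50,100
     switchport private-vlan mapping trunk 100 101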

Web Services APIs

PVLAN can be enabled on shared networks. Shared networks are created by admin users; end-user VMs are allowed to have NICs on shared networks.
Modify createNetworkCmd (for shared networks)

  • Add a new parameter, isolatedpvlan (see the example call below):
    • Not a required parameter. If the parameter is not null, PVLAN is enabled.
    • When the parameter is set, the network must be an advanced shared network.
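For illustration, a hypothetical CloudMonkey invocation creating a PVLAN-enabled shared network; all UUIDs, addresses, and VLAN IDs are made up, and the offering must be a shared network offering:

    # vlan is the primary VLAN; isolatedpvlan is the secondary isolated VLAN
    # and is what enables PVLAN for this network.
    cloudmonkey create network \
      zoneid=8d8a9d8a-0000-0000-0000-000000000000 \
      networkofferingid=c1f15b4d-0000-0000-0000-000000000000 \
      name=pvlan-shared-net displaytext=pvlan-shared-net \
      gateway=10.1.1.1 netmask=255.255.255.0 \
      startip=10.1.1.10 endip=10.1.1.100 \
      vlan=100 isolatedpvlan=101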

DB changes

Cisco Nexus 1000v specific changes

New functions to add primary and secondary VLANs to port groups, to be called from HypervisorHostHelper.java, will need to be added to VsmCommand.java and NetconfHelper.java.

Phased implementation for VMware

For VMware, the project will be carried out in two phases. In phase 1, PVLAN support will be implemented in CloudStack for VMware Distributed Virtual Switch configurations. In phase 2, PVLAN support will be implemented for provisioning port profiles on the Cisco Nexus 1000v.

UI flow

  • The admin creates a shared VLAN and is asked whether it is a PVLAN. If yes, she is asked for the secondary VLAN ID in addition to the primary VLAN ID.
    The pre-confirmation dialog asks her to make sure her physical infrastructure is configured in the same fashion.

IP Clearance

  • What dependencies will you be adding to the project?

Appendix

Appendix A:

Appendix B:
