The simulator changes make it easy to write tests for scenarios that were not possible earlier, especially negative/fault scenarios. It is now possible to write end-to-end tests for VM deployment retry logic, HA, migration, etc. This document uses some of these scenarios to explain the simulator changes.
First ensure that there is already a cluster with at least 2 hosts in it.
self.hosts = []
suitablecluster = None
clusters = Cluster.list(self.apiclient)
self.assertTrue(isinstance(clusters, list) and len(clusters) > 0, msg="No clusters found")
# pick the first cluster that has at least 2 routing hosts
for cluster in clusters:
    self.hosts = Host.list(self.apiclient, clusterid=cluster.id, type='Routing')
    if isinstance(self.hosts, list) and len(self.hosts) >= 2:
        suitablecluster = cluster
        break
self.assertTrue(isinstance(self.hosts, list) and len(self.hosts) >= 2, msg="At least 2 hosts required in cluster for VM HA test")
Tag the hosts in the cluster so that the HA-enabled VM can be deployed in this cluster.
# update host tags
for host in self.hosts:
    Host.update(self.apiclient, id=host.id, hosttags=self.testdata["service_offering"]["hasmall"]["hosttags"])
Deploy HA VM
# deploy ha vm
self.virtual_machine = VirtualMachine.create(
    self.apiclient,
    self.testdata["virtual_machine"],
    accountid=self.account.name,
    zoneid=self.zone.id,
    domainid=self.account.domainid,
    serviceofferingid=self.service_offering.id,
    templateid=self.template.id)
To simulate failure of the host on which the HA VM is running, the following mocks need to be created. The first call below creates a mock that makes the agent command 'PingCommand' return failure (result:fail) for the agent/resource identified by zoneid, podid, clusterid and hostid. Possible values for 'result' are fail and fault. To create a mock with a more generic scope, leave out hostid, clusterid, podid and zoneid, in that order. For example, to create a mock for all hosts in a cluster, specify only clusterid, podid and zoneid. All these mocks are persisted in the mock configuration table in the simulator DB.
self.mock_ping = SimulatorMock.create(
    apiclient=self.apiclient,
    command="PingCommand",
    zoneid=suitablecluster.zoneid,
    podid=suitablecluster.podid,
    clusterid=suitablecluster.id,
    hostid=self.virtual_machine.hostid,
    value="result:fail")
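The scoping rule described above can be sketched as a small helper. `mock_scope` is a hypothetical function, not part of Marvin; it only builds the keyword arguments for a mock, and it assumes (as one reading of the ordering rule) that a narrower id given without its enclosing ids is a mistake.

```python
def mock_scope(zoneid=None, podid=None, clusterid=None, hostid=None):
    """Return only the scope arguments that are set, rejecting gaps.

    Hypothetical helper (not a Marvin API). A scope is widened by dropping
    hostid, then clusterid, then podid, then zoneid, so a narrower id is
    only accepted when all of its enclosing ids are present.
    """
    levels = [("zoneid", zoneid), ("podid", podid),
              ("clusterid", clusterid), ("hostid", hostid)]
    scope = {}
    seen_gap = False
    for name, value in levels:
        if value is None:
            seen_gap = True
        elif seen_gap:
            raise ValueError("%s given without its enclosing scope" % name)
        else:
            scope[name] = value
    return scope


# Cluster-wide scope: every host in cluster 3 of pod 2 in zone 1.
cluster_scope = mock_scope(zoneid=1, podid=2, clusterid=3)
# Pod-wide scope: clusterid and hostid are left out.
pod_scope = mock_scope(zoneid=1, podid=2)
```

The resulting dictionary could then be passed on as `SimulatorMock.create(apiclient=..., command=..., value=..., **cluster_scope)`.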
After 3 ping failures, an investigation is started. First a 'CheckHealthCommand' is issued to check the health of the host for which 'PingCommand' failed. After that the various investigators are invoked to check whether the host is alive. The investigation stops as soon as an investigator is able to conclusively determine the state of the host. There is a simulator investigator which does this by issuing 'CheckOnHostCommand' from the other hosts (in 'Up' state) in the cluster. If the investigator returns the host status as 'Down', HA is triggered for the HA-enabled VMs.
self.mock_checkhealth = SimulatorMock.create(
    apiclient=self.apiclient,
    command="CheckHealthCommand",
    zoneid=suitablecluster.zoneid,
    podid=suitablecluster.podid,
    clusterid=suitablecluster.id,
    hostid=self.virtual_machine.hostid,
    value="result:fail")
# fail CheckOnHostCommand on every other host in the cluster
self.mock_checkonhost_list = []
for host in self.hosts:
    if host.id != self.virtual_machine.hostid:
        self.mock_checkonhost_list.append(SimulatorMock.create(
            apiclient=self.apiclient,
            command="CheckOnHostCommand",
            zoneid=suitablecluster.zoneid,
            podid=suitablecluster.podid,
            clusterid=suitablecluster.id,
            hostid=host.id,
            value="result:fail"))
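The "stop at the first conclusive investigator" behaviour described above can be illustrated with a toy model. This is only a sketch of the control flow, not the management server's code; the function `investigate` and the two stand-in investigators are hypothetical names chosen for illustration.

```python
def investigate(investigators, host):
    """Ask each investigator in turn and stop at the first conclusive answer.

    Each investigator is a (name, callable) pair; the callable returns
    "Up", "Down", or None when it cannot determine the host state.
    Returns (status, name), or (None, None) if nobody could decide.
    """
    for name, check in investigators:
        status = check(host)
        if status is not None:
            return status, name
    return None, None


# Toy stand-ins: the first cannot decide, the second reports the host Down.
def inconclusive(host):
    return None

def reports_down(host):
    return "Down"

def never_reached(host):
    raise AssertionError("should not be called after a conclusive answer")

result = investigate(
    [("first", inconclusive), ("second", reports_down), ("third", never_reached)],
    host="host-1")
```

In this model the third investigator is never consulted, which mirrors why the test has to keep earlier investigators from reaching a conclusion before the simulator investigator runs.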
The HA process is then triggered. As part of restarting the VM on another host, there is first a check to see whether the VM is alive, using 'CheckVirtualMachineCommand' and again the various investigators.
self.mock_checkvirtualmachine = SimulatorMock.create(
    apiclient=self.apiclient,
    command="CheckVirtualMachineCommand",
    zoneid=suitablecluster.zoneid,
    podid=suitablecluster.podid,
    clusterid=suitablecluster.id,
    hostid=self.virtual_machine.hostid,
    value="result:fail")
The following mock prevents the UserVmDomRInvestigator from determining the host state. Note that clusterid and hostid are not passed for this mock, which means its scope is the entire pod.
self.mock_pingtest = SimulatorMock.create(
    apiclient=self.apiclient,
    command="PingTestCommand",
    zoneid=suitablecluster.zoneid,
    podid=suitablecluster.podid,
    value="result:fail")
In the actual test there is a wait for HA to happen. Then there is a validation to see that the HA VM has moved to another host in the cluster.
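The wait-then-validate step can be sketched as a generic polling helper. `wait_until` is a hypothetical function written for this illustration (the actual test may wait differently), and the timeout and interval values are arbitrary assumptions.

```python
import time


def wait_until(predicate, timeout=300, interval=10,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns a truthy value or timeout expires.

    Returns the truthy value, or None on timeout. sleep and clock are
    injectable so the helper can be exercised without real delays.
    """
    deadline = clock() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if clock() >= deadline:
            return None
        sleep(interval)


# Example with a fake check that succeeds on the third poll.
polls = {"n": 0}

def vm_moved():
    polls["n"] += 1
    return "host-2" if polls["n"] >= 3 else None

new_host = wait_until(vm_moved, timeout=60, interval=1, sleep=lambda s: None)
```

In a real test the predicate would list the VM through the API and compare its current hostid with the id of the host that was failed; that part is omitted here because it needs a running simulator.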
In this case the count parameter is not used while creating the mocks. It makes a mock active only 'count' times: every time the mock executes successfully, count is decremented by 1. This parameter can be used to verify that the mock actually got executed as expected and that the test failure is due to the mock and not due to some other issue. For an example, take a look at test/integration/smoke/test_deploy_vm.py, where there are tests for VM deployment retry logic.
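The count semantics can be modelled in a few lines. `CountedMock` is a hypothetical class used only to illustrate the decrement behaviour described above; it is not how the simulator stores or evaluates mocks.

```python
class CountedMock:
    """Toy model of a mock that stays active for a limited number of hits."""

    def __init__(self, command, value, count=None):
        self.command = command
        self.value = value
        self.count = count  # None models a mock created without count

    def apply(self):
        """Return the mocked value while active, otherwise None."""
        if self.count is None:
            return self.value          # no count: always active
        if self.count <= 0:
            return None                # exhausted: command behaves normally
        self.count -= 1                # one successful execution consumed
        return self.value


mock = CountedMock("PingCommand", "result:fail", count=2)
outcomes = [mock.apply() for _ in range(3)]
```

After the run, a remaining count of 0 tells the test that the mock was hit exactly as many times as expected; a non-zero remainder would suggest the failure came from somewhere else.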
As part of cleanup, clear all the mocks created during setup so that subsequent tests are not impacted by them. The mocks are cleaned up by simply updating the 'removed' field.
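A cleanup step along these lines might look like the sketch below. It assumes each mock object exposes a `delete(apiclient)` method, as Marvin resource objects typically do; `cleanup_mocks` itself is a hypothetical helper, and the fake class exists only so the example runs without a simulator.

```python
def cleanup_mocks(apiclient, mocks):
    """Try to delete every mock and collect failures instead of stopping.

    Assumes each mock has a delete(apiclient) method. Returns a list of
    (mock, exception) pairs for the mocks that could not be removed.
    """
    errors = []
    for mock in mocks:
        try:
            mock.delete(apiclient)
        except Exception as exc:  # keep going so later mocks are still cleared
            errors.append((mock, exc))
    return errors


class FakeMock:
    """Stand-in for a mock object, used only in this example."""

    def __init__(self, fail=False):
        self.fail = fail
        self.removed = False

    def delete(self, apiclient):
        if self.fail:
            raise RuntimeError("delete failed")
        self.removed = True


fakes = [FakeMock(), FakeMock(fail=True), FakeMock()]
failures = cleanup_mocks(apiclient=None, mocks=fakes)
```

Collecting errors rather than raising on the first one keeps a single failed delete from leaving the remaining mocks active for subsequent tests.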