...

As part of container cluster creation, the container service shall be responsible for setting up the control plane of the chosen container orchestrator.

Design

API changes

The following APIs shall be introduced with the container service:

  • createContainerCluster
    • name: name of container cluster
    • description: description of container cluster
    • zoneid: uuid of the zone in which container cluster will be provisioned
    • serviceofferingid: service offering with which the cluster VMs shall be provisioned
    • cluster: size of the cluster, i.e. the number of VMs to be provisioned
    • accountname: account for which container cluster shall be created
    • domainid: domain of the account for which container cluster shall be created
    • networkid: uuid of the network into which the container cluster VMs will be provisioned. If not specified, the container service shall provision a new isolated network using the default isolated network offering with the source NAT service.
  • deleteContainerCluster
    • id: uuid of container cluster
  • startContainerCluster
    • id: uuid of container cluster
  • stopContainerCluster
    • id: uuid of container cluster
  • listContainerCluster
    • id: uuid of container cluster
  • listContainerClusterCACert
    • id: uuid of container cluster
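
As a rough illustration of how a client might assemble a createContainerCluster request from the parameters listed above, the Python sketch below builds an unsigned query string. The helper name `build_create_cluster_query` and the choice of which parameters are treated as mandatory are assumptions made for this example; the `command` and `response` fields follow the usual CloudStack query convention. Request signing and authentication are left out.

```python
from urllib.parse import urlencode


def build_create_cluster_query(name, zoneid, serviceofferingid, cluster,
                               description=None, accountname=None,
                               domainid=None, networkid=None):
    """Build an unsigned query string for createContainerCluster.

    Hypothetical helper: which parameters are mandatory is an assumption.
    """
    if int(cluster) < 1:
        raise ValueError("cluster size must be at least 1")

    params = {
        "command": "createContainerCluster",
        "response": "json",
        "name": name,
        "zoneid": zoneid,
        "serviceofferingid": serviceofferingid,
        "cluster": str(cluster),
    }
    optional = {
        "description": description,
        "accountname": accountname,
        "domainid": domainid,
        # When networkid is omitted, the design says the container service
        # provisions a new isolated network itself.
        "networkid": networkid,
    }
    for key, value in optional.items():
        if value is not None:
            params[key] = value

    # Sort the keys so the output is deterministic.
    return urlencode(sorted(params.items()))
```

Leaving `networkid` out of the call leaves it out of the query string, which corresponds to the case where the container service creates the network.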

A new response, 'containerclusterresponse', shall be added with the details below:

  • name
  • description
  • zoneid
  • serviceofferingid
  • networkid
  • clustersize
  • endpoint: URL of the container cluster manager API server endpoint
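
To make the shape of the response concrete, here is a minimal Python sketch that maps an already decoded response object onto the fields listed above. The class name `ContainerClusterResponse`, the `from_dict` helper, and the assumption that `clustersize` is an integer are hypothetical; only the field names come from the list.

```python
from dataclasses import dataclass, fields


@dataclass
class ContainerClusterResponse:
    """Client-side view of the response fields (illustrative only)."""
    name: str
    description: str
    zoneid: str
    serviceofferingid: str
    networkid: str
    clustersize: int
    endpoint: str

    @classmethod
    def from_dict(cls, data):
        """Build an instance from a decoded response, ignoring extra keys."""
        known = {f.name for f in fields(cls)}
        missing = known - set(data)
        if missing:
            raise KeyError(f"missing response fields: {sorted(missing)}")
        values = {key: data[key] for key in known}
        # Assumption: clustersize may arrive as a string and is an integer.
        values["clustersize"] = int(values["clustersize"])
        return cls(**values)
```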

 

Life cycle operations

Each life cycle operation is a workflow that provisions or deletes multiple CloudStack resources, so atomicity cannot be achieved. There is no guarantee that the workflow will succeed, because there is no 2PC-like model in which resources are reserved first and then provisioned. There is also no guarantee that a rollback will succeed. For example, while provisioning a cluster of 10 VMs, the deployment may run out of capacity after 5 VMs have been provisioned. In that case the provisioned VMs can be destroyed as a rollback. However, deleting a provisioned VM may be temporarily impossible, for example when its host is disconnected.

The following approach is used:

  • Do a best-effort rollback of a life cycle operation in case of failure.
  • If the rollback fails, rely on reconciliation mechanisms that ensure eventual consistency.
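
The two-step approach can be sketched in Python as follows. This is an illustrative model only: `FakeCloud`, `create_cluster`, `reconcile`, and the `pending_cleanup` set are hypothetical names standing in for the real provisioning layer and for whatever bookkeeping the container service uses. It reproduces the example of a 10 VM cluster that runs out of capacity after 5 VMs, with one VM that cannot be destroyed during the rollback.

```python
class ProvisioningError(Exception):
    """Raised when a VM cannot be deployed or destroyed."""


class FakeCloud:
    """In-memory stand-in for the VM provisioning layer (hypothetical)."""

    def __init__(self, capacity, undeletable=()):
        self.capacity = capacity
        self.undeletable = set(undeletable)  # e.g. VMs on disconnected hosts
        self.vms = set()
        self._next_id = 0

    def deploy_vm(self):
        if len(self.vms) >= self.capacity:
            raise ProvisioningError("insufficient capacity")
        self._next_id += 1
        vm_id = f"vm-{self._next_id}"
        self.vms.add(vm_id)
        return vm_id

    def destroy_vm(self, vm_id):
        if vm_id in self.undeletable:
            raise ProvisioningError(f"cannot destroy {vm_id} right now")
        self.vms.discard(vm_id)


def create_cluster(cloud, size, pending_cleanup):
    """Provision `size` VMs; on failure, roll back on a best-effort basis."""
    provisioned = []
    try:
        for _ in range(size):
            provisioned.append(cloud.deploy_vm())
        return provisioned
    except ProvisioningError:
        for vm_id in provisioned:
            try:
                cloud.destroy_vm(vm_id)
            except ProvisioningError:
                # Rollback failed for this VM; leave it for reconciliation.
                pending_cleanup.add(vm_id)
        return None


def reconcile(cloud, pending_cleanup):
    """Retry cleanup of leftover VMs; intended to be run periodically."""
    for vm_id in list(pending_cleanup):
        try:
            cloud.destroy_vm(vm_id)
            pending_cleanup.discard(vm_id)
        except ProvisioningError:
            pass  # still not deletable; try again on the next pass
```

Keeping the failed rollbacks in a separate set means the create operation can return right away, while repeated `reconcile` passes bring the system back to a consistent state once the VMs become deletable.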

 

The state machine below captures how the state of a container cluster changes as it goes through the various life cycle operations. Not all states are necessarily visible to the end user.

...