...

Design

API changes

The following APIs shall be introduced with the container service:

  • createContainerCluster
    • name: name of the container cluster
    • description: description of the container cluster
    • zoneid: UUID of the zone in which the container cluster will be provisioned
    • serviceofferingid: service offering with which the cluster VMs shall be provisioned
    • cluster: size of the cluster, i.e. the number of VMs to be provisioned
    • accountname: account for which the container cluster shall be created
    • domainid: domain of the account for which the container cluster shall be created
    • networkid: UUID of the network into which the container cluster VMs will be provisioned. If not specified, the container service shall provision a new isolated network using the default isolated network offering with source NAT service.
  • deleteContainerCluster
    • id: UUID of the container cluster
  • startContainerCluster
    • id: UUID of the container cluster
  • stopContainerCluster
    • id: UUID of the container cluster
  • listContainerCluster
    • id: UUID of the container cluster
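As a sketch of how a client might invoke the proposed createContainerCluster command through CloudStack's usual query-style HTTP API (the management server URL and UUID values below are hypothetical placeholders, and signing of the request is omitted):

```python
from urllib.parse import urlencode

def build_create_cluster_request(base_url, **params):
    """Build the (unsigned) request URL for the proposed createContainerCluster command."""
    query = {"command": "createContainerCluster", "response": "json"}
    query.update(params)
    return base_url + "?" + urlencode(query)

# Hypothetical values for illustration only.
url = build_create_cluster_request(
    "http://mgmt-server:8080/client/api",
    name="k8s-cluster-1",
    zoneid="8c1b4b3e-0000-0000-0000-000000000000",
    serviceofferingid="a5b6c7d8-0000-0000-0000-000000000000",
    cluster=3,
)
```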

A new response, 'containerclusterresponse', shall be added with the below details:

  • name
  • description
  • zoneid
  • serviceofferingid
  • networkid
  • clustersize
  • endpoint: URL of the container cluster manager API server endpoint

 

Life cycle operations

Each life cycle operation is a workflow that results in provisioning or deleting multiple CloudStack resources. It is not possible to achieve atomicity: there is no guarantee that a life cycle operation's workflow will succeed, because there is no 2PC-like model of resource reservation followed by provisioning. There is also no guarantee that a rollback will succeed. For example, while provisioning a cluster of 10 VMs, the deployment may run out of capacity after provisioning 5 VMs. In that case the provisioned VMs can be destroyed as a rollback, but there are cases where deleting a provisioned VM is temporarily not possible, e.g. disconnected hosts.
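The non-atomic provision-then-best-effort-rollback flow above can be sketched as follows (the deploy_vm/destroy_vm helpers are hypothetical stand-ins for the actual CloudStack orchestration calls):

```python
def provision_cluster(size, deploy_vm, destroy_vm):
    """Provision `size` VMs one by one; on failure, attempt a best-effort rollback."""
    provisioned = []
    try:
        for _ in range(size):
            provisioned.append(deploy_vm())
        return provisioned
    except Exception:
        # Best-effort rollback: destroying a VM can itself fail temporarily
        # (e.g. disconnected host); such leftovers are picked up later by
        # the garbage collector rather than blocking here.
        for vm in provisioned:
            try:
                destroy_vm(vm)
            except Exception:
                pass
        raise
```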

...

The state machine below reflects how the container cluster state transitions for each of the life cycle operations.

 

The state machine captures the state of the container cluster as it goes through the various life cycle operations. Not all states are necessarily visible to the end user.
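Using only the states and transitions this document names (the full FSM has more transitions, so this is an illustrative subset, not the actual implementation), the cluster FSM can be sketched as a transition table:

```python
# Transitions explicitly named in this design; event names are illustrative.
TRANSITIONS = {
    ("Starting", "start_failed"): "Expunging",
    ("Stopped", "delete"): "Expunging",
    ("Alert", "delete"): "Expunging",
    ("Expunging", "cleanup_done"): "Destroyed",
    ("Expunging", "cleanup_stuck"): "Expunge",  # garbage collector retries from here
    ("Running", "out_of_sync"): "Alert",
}

def transition(state, event):
    """Return the next cluster state, or raise on an illegal transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event} in state {state}")
```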

Garbage collection

Garbage collection shall be implemented as a background task to clean up the resources of a container cluster. The following are the cases where cluster resources are freed up:

  • Starting the container cluster fails, resulting in clean-up of the provisioned resources (Starting → Expunging → Destroyed)
  • Deleting the container cluster (Stopped → Expunging → Destroyed, and Alert → Expunging → Destroyed)

If there are failures in cleaning up resources and clean-up cannot proceed, the container cluster is moved from the 'Expunging' state to the 'Expunge' state. The garbage collector will periodically loop through the list of container clusters in the 'Expunge' state and try to free the resources held by them.
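One pass of that garbage collection task could look like the following sketch (the release_resources callback is a hypothetical stand-in for destroying the cluster's VMs, network, etc.):

```python
def run_gc_pass(clusters, release_resources):
    """One GC pass: try to clean up every cluster stuck in the 'Expunge' state."""
    for cluster in clusters:
        if cluster["state"] != "Expunge":
            continue
        try:
            release_resources(cluster)
            cluster["state"] = "Destroyed"
        except Exception:
            # Clean-up still failing (e.g. disconnected host): leave the
            # cluster in 'Expunge' and retry on the next pass.
            pass
```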

OPEN QUESTION

Should we implement rollback on failure of container cluster creation, or do a lazy clean-up, i.e. mark the container cluster as being in the 'Expunging' state and let the garbage collector do the clean-up? It is just a matter of when to do it; both flows may use the same clean-up module.

Cluster state synchronization

The state of the container cluster is the 'desired state' of the cluster as intended by the user, or the system's logical view of the container cluster. However, there are various scenarios where the desired state of the container cluster is not in sync with the state that can be inferred from the actual physical infrastructure. For example, consider a container cluster in the 'Running' state with a cluster size of 10 VMs, all running. Due to host failures, some of the VMs may get stopped at a later point. The desired state of the container cluster is still a cluster of 10 running, operationally ready VMs (w.r.t. container provisioning), but the resource layer state is different. So we need a mechanism to ensure that:

  • the cluster is in its desired state at the resource/infrastructure layer, which could mean provisioning new VMs or deleting VMs in the cluster to restore the desired state
  • conversely, when reconciliation cannot happen, the state of the cluster is reflected accordingly, so that it can be recovered at a later point

The following mechanism will be implemented:

  • A state 'Alert' will be maintained that indicates a container cluster is not in its desired state.
  • A state synchronization background task will run periodically to infer whether each cluster is in its desired state; if not, the cluster will be marked as being in the 'Alert' state.
  • A recovery action will try to recover the cluster.

State transitions in the FSM where a container cluster ends up in the 'Alert' state:

  • failure in the middle of a scale in/out, resulting in a cluster size (number of VMs) not equal to the expected size
  • failure in stopping a cluster, leaving some VMs in the running state
  • a difference in states detected by the state synchronization thread

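The synchronization check itself reduces to comparing the desired state against what the infrastructure reports. A minimal sketch, assuming illustrative field names rather than actual CCS code:

```python
def sync_cluster_state(cluster, running_vm_count):
    """Mark the cluster 'Alert' if the infrastructure diverges from the desired state."""
    if cluster["state"] == "Running" and running_vm_count != cluster["node_count"]:
        cluster["state"] = "Alert"
    elif cluster["state"] == "Stopped" and running_vm_count != 0:
        cluster["state"] = "Alert"
    return cluster["state"]
```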
Out-of-band changes

From a layering perspective, CCS is layered on top of CloudStack functionality. There is no way to control the life cycle of the individual resources that are part of a container cluster. For example, a user can delete VMs that are part of a container cluster.

OPEN QUESTION: Are there no hooks to restrict these actions?

The only design option is for cluster state synchronization to figure out missing entities (in case of destroyed VMs) or conflicting states (a user can stop a VM that CCS expects to be running) and put the cluster into the 'Alert' state.

Policies can be defined on how to recover the cluster.

Re-use the cloud DB vs. keep a separate DB

Separate DB:

  • Pros:
    • Clean separation. There is no specific advantage w.r.t. data integrity in keeping the CCS DB as part of the 'cloud' DB.
    • Perceived to have no side effects on the 'cloud' DB (although the CCS plug-in can still modify the 'cloud' DB).
    • Avoids possible side effects on the CCS DB during CloudStack DB upgrades.
  • Cons:
    • A new ORM must be added to access the CCS DB.
    • The CloudStack ORM is tied to the 'cloud' DB and is difficult to switch.

Reuse the 'cloud' DB and extend the schema:

  • Pros:
    • Easiest path; leverages the existing ORM.
    • Can use foreign keys, delete cascades etc. for cross-table references where possible.
  • Cons:
    • Side effects on upgrades.

 

Handling out-of-band changes:

CCS will keep the book-keeping tables below to store the CloudStack resources provisioned for and used by a container cluster.

Note that there are no foreign keys or delete cascades: CCS should not lose its book-keeping data on a resource even if the resource is deleted from the CloudStack DB. CCS code needs to code defensively and verify that an entity exists in the CloudStack tables before using it.

 

 

CREATE TABLE IF NOT EXISTS `cloud`.`container_cluster` (
    `id` bigint unsigned NOT NULL auto_increment COMMENT 'id',
    `uuid` varchar(40),
    `name` varchar(255) NOT NULL,
    `description` varchar(4096) COMMENT 'display text for this container cluster',
    `zone_id` bigint unsigned NOT NULL COMMENT 'zone id',
    `service_offering_id` bigint unsigned COMMENT 'service offering id for the cluster VM',
    `template_id` bigint unsigned COMMENT 'vm_template.id',
    `network_id` bigint unsigned COMMENT 'network this container cluster uses',
    `node_count` bigint NOT NULL default '0',
    `account_id` bigint unsigned NOT NULL COMMENT 'owner of this cluster',
    `domain_id` bigint unsigned NOT NULL COMMENT 'owner of this cluster',
    `state` char(32) NOT NULL COMMENT 'current state of this cluster',
    `key_pair` varchar(40),
    `cores` bigint unsigned NOT NULL COMMENT 'number of cores',
    `memory` bigint unsigned NOT NULL COMMENT 'total memory',
    `endpoint` varchar(255) COMMENT 'url endpoint of the container cluster manager api access',
    `console_endpoint` varchar(255) COMMENT 'url for the container cluster manager dashboard',
    `created` datetime NOT NULL COMMENT 'date created',
    `removed` datetime COMMENT 'date removed if not null',
    `gc` tinyint unsigned NOT NULL DEFAULT 1 COMMENT 'gc this container cluster or not',
    -- CONSTRAINT `fk_cluster__zone_id` FOREIGN KEY `fk_cluster__zone_id` (`zone_id`) REFERENCES `data_center` (`id`) ON DELETE CASCADE,
    -- CONSTRAINT `fk_cluster__service_offering_id` FOREIGN KEY `fk_cluster__service_offering_id` (`service_offering_id`) REFERENCES `service_offering`(`id`) ON DELETE CASCADE,
    -- CONSTRAINT `fk_cluster__template_id` FOREIGN KEY `fk_cluster__template_id`(`template_id`) REFERENCES `vm_template`(`id`) ON DELETE CASCADE,
    -- CONSTRAINT `fk_cluster__network_id` FOREIGN KEY `fk_cluster__network_id`(`network_id`) REFERENCES `networks`(`id`) ON DELETE CASCADE,
    PRIMARY KEY(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

 

CREATE TABLE IF NOT EXISTS `cloud`.`container_cluster_vm_map` (
    `id` bigint unsigned NOT NULL auto_increment COMMENT 'id',
    `cluster_id` bigint unsigned NOT NULL COMMENT 'cluster id',
    `vm_id` bigint unsigned NOT NULL COMMENT 'vm id',
    -- CONSTRAINT `container_cluster_vm_map_cluster__id` FOREIGN KEY `container_cluster_vm_map_cluster__id`(`cluster_id`) REFERENCES `container_cluster`(`id`) ON DELETE CASCADE,
    PRIMARY KEY(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE IF NOT EXISTS `cloud`.`container_cluster_details` (
    `id` bigint unsigned NOT NULL auto_increment COMMENT 'id',
    `cluster_id` bigint unsigned NOT NULL COMMENT 'cluster id',
    `username` varchar(255) NOT NULL,
    `password` varchar(255) NOT NULL,
    `registry_username` varchar(255),
    `registry_password` varchar(255),
    `registry_url` varchar(255),
    `registry_email` varchar(255),
    `network_cleanup` tinyint unsigned NOT NULL DEFAULT 1 COMMENT 'true if the network needs to be cleaned up on deletion of the container cluster. Should be false if the user specified the network for the cluster',
    PRIMARY KEY(`id`),
    CONSTRAINT `container_cluster_details_cluster__id` FOREIGN KEY `container_cluster_details_cluster__id`(`cluster_id`) REFERENCES `container_cluster`(`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

 

 
