Container technologies are gaining considerable momentum and changing the way applications are traditionally deployed in public and private clouds. Growing interest in microservices-based architectures is also fostering adoption of container technologies. Just as cloud orchestration platforms enabled provisioning of VMs and adjunct services, container orchestration platforms like Kubernetes, Docker Swarm and Mesos are emerging to enable orchestration of containers. Container orchestration platforms can typically run anywhere and be used to provision containers. A popular choice has been to run containers on IaaS-provisioned VMs. AWS and GCE provide native functionality to launch containers, abstracting away the underlying consumption of VMs. There have been a couple of efforts to provision container orchestration platforms on top of CloudStack, but they are not out-of-the-box solutions. Given the momentum of container technologies, microservices etc., it makes sense to provide native functionality in CloudStack that is available out of the box for users.
The purpose of this document is to present the functional requirements for native CloudStack functionality to provision containers, and to detail the design through which this functionality will be achieved.
The CloudStack container service shall introduce the notion of a container cluster. A 'container cluster' shall be a first-class CloudStack entity that is a composite of existing CloudStack entities like virtual machines, networks, network rules etc. The container service shall stitch together the container cluster resources and deploy the chosen cluster manager (Kubernetes, Mesos, Docker Swarm etc.) to provide CloudStack users a container service similar to AWS ECS, Google Container Engine etc.
The container service shall provide the following container cluster life-cycle operations.
As part of container cluster creation, the container service shall be responsible for setting up the control plane of the chosen container orchestrator.
The following APIs shall be introduced with the container service:
A new response, 'containerclusterresponse', shall be added with the details below.
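As a rough illustration, the response might carry the fields below. This is a sketch only; the field names are assumptions inferred from the container_cluster table defined later in this document, not the actual response class.

```java
// Hypothetical shape of 'containerclusterresponse'; field names are
// inferred from the container_cluster schema later in this document.
public class ContainerClusterResponse {
    public String id;               // uuid of the container cluster
    public String name;
    public String description;
    public String zoneId;
    public String serviceOfferingId; // service offering used for cluster VMs
    public String templateId;
    public String networkId;
    public long nodeCount;          // number of node VMs in the cluster
    public String state;            // e.g. 'Running', 'Stopped', 'Alert'
    public String endpoint;         // URL of the container manager API
    public String consoleEndpoint;  // URL of the container manager dashboard
}
```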
Each life-cycle operation is a workflow that results in provisioning or deleting multiple CloudStack resources, so atomicity is not achievable. There is no guarantee that the workflow of a life-cycle operation will succeed, due to the lack of a 2PC-like model of resource reservation followed by provisioning. There is also no guarantee that a rollback will succeed. For example, while provisioning a cluster of 10 VMs, the deployment may run out of capacity after provisioning 5 VMs; as a rollback, the provisioned VMs can be destroyed, but there are cases where deleting a provisioned VM is temporarily impossible (e.g. disconnected hosts).
The approach below is followed.
Do a best effort rollback for a life cycle operation in case of failure
In case rollback fails, have reconciliation mechanisms that will ensure eventual consistency
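The best-effort rollback described above can be sketched roughly as below. The provisioning and destroy calls here are hypothetical placeholders, not real CloudStack APIs; the point is that a destroy may itself fail, in which case the leftover resources are handed to the reconciliation mechanism rather than blocking the workflow.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch (assumption) of best-effort rollback for a provisioning workflow.
public class BestEffortProvisioner {

    static List<Long> provisioned = new ArrayList<>();

    // Returns true if the whole cluster was provisioned, false if a failure
    // occurred and a best-effort rollback was performed instead.
    static boolean provisionCluster(int clusterSize, int capacity) {
        provisioned.clear();
        for (long vmId = 1; vmId <= clusterSize; vmId++) {
            if (vmId > capacity) {   // deployment ran out of capacity
                rollback();
                return false;
            }
            provisioned.add(vmId);   // VM provisioned successfully
        }
        return true;
    }

    // Best-effort rollback: destroy what we can; anything that cannot be
    // destroyed now stays recorded and is reconciled later by the GC.
    static void rollback() {
        provisioned.removeIf(vmId -> destroyVm(vmId));
    }

    static boolean destroyVm(long vmId) {
        return true; // placeholder: assume destroy succeeds here
    }
}
```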
The state machine below reflects how the container cluster state transitions for each life-cycle operation.
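A fragment of the FSM could be sketched as below. The state names and transitions shown are assumptions for illustration only; the state machine referenced above is authoritative.

```java
// Sketch of the container cluster FSM; state names are assumed, not
// taken from the actual implementation.
public enum ContainerClusterState {
    Created, Starting, Running, Stopping, Stopped,
    Scaling, Alert, Expunging, Expunge, Destroyed;

    // Example transition: a successful start moves Starting -> Running.
    public ContainerClusterState onStartSucceeded() {
        return this == Starting ? Running : this;
    }

    // Example transition: a failed start moves Starting -> Alert.
    public ContainerClusterState onStartFailed() {
        return this == Starting ? Alert : this;
    }
}
```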
Garbage collection shall be implemented as a background task to clean up the resources of a container cluster. The following are the cases in which cluster resources are freed up.
If there are failures in cleaning up resources and the clean-up cannot proceed, the container cluster is moved from the 'Expunging' state to the 'Expunge' state. The garbage collector periodically loops through the list of container clusters in the 'Expunge' state and tries to free the resources held by each of them.
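A single pass of this garbage collector might look roughly as follows. The cluster record type and the cleanup call are hypothetical placeholders for the real DAO and service layer; clusters whose cleanup fails simply remain in 'Expunge' and are retried on the next pass.

```java
import java.util.List;

// Sketch (assumption) of one garbage-collection pass over clusters
// stuck in the 'Expunge' state.
public class ContainerClusterGc {

    static class Cluster {
        long id;
        String state;
        Cluster(long id, String state) { this.id = id; this.state = state; }
    }

    // Try to free resources of every cluster in 'Expunge' state; on
    // success the cluster moves to 'Destroyed', on failure it stays in
    // 'Expunge' to be retried by the next pass.
    static void runGcPass(List<Cluster> clusters) {
        for (Cluster c : clusters) {
            if ("Expunge".equals(c.state) && cleanupResources(c)) {
                c.state = "Destroyed";
            }
        }
    }

    static boolean cleanupResources(Cluster c) {
        // placeholder: pretend clusters with odd ids fail cleanup this pass
        return c.id % 2 == 0;
    }
}
```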
The state of the container cluster is the 'desired state' of the cluster as intended by the user, or the system's logical view of the container cluster. However, there are various scenarios where the desired state of the container cluster is not in sync with the state that can be inferred from the actual physical infrastructure. For example, consider a container cluster in the 'Running' state with a cluster size of 10 VMs, all running. Due to host failures, some of those VMs may get stopped at a later point. The desired state of the container cluster is still a cluster of 10 running, operationally ready VMs (w.r.t. container provisioning), but the state at the resource layer is different. So we need a mechanism to ensure:
The following mechanism will be implemented.
State transitions in the FSM where a container cluster ends up in the 'Alert' state:
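The reconciliation check described above can be sketched as a comparison of desired versus observed state. The names below are illustrative assumptions, not the actual CloudStack implementation: a 'Running' cluster with fewer node VMs actually running than desired is flagged 'Alert' so recovery can be attempted.

```java
// Sketch (assumption) of the desired-vs-actual reconciliation check.
public class ClusterReconciler {

    // Returns the state the cluster should transition to, given its
    // desired node count and the number of node VMs actually running.
    static String reconcile(String desiredState, long desiredNodes, long runningNodes) {
        if ("Running".equals(desiredState) && runningNodes < desiredNodes) {
            return "Alert";   // infrastructure drifted from the desired state
        }
        return desiredState;  // desired and actual states agree
    }
}
```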
CREATE TABLE IF NOT EXISTS `cloud`.`container_cluster` (
  `id` bigint unsigned NOT NULL auto_increment COMMENT 'id',
  `uuid` varchar(40),
  `name` varchar(255) NOT NULL,
  `description` varchar(4096) COMMENT 'display text for this container cluster',
  `zone_id` bigint unsigned NOT NULL COMMENT 'zone id',
  `service_offering_id` bigint unsigned COMMENT 'service offering id for the cluster VM',
  `template_id` bigint unsigned COMMENT 'vm_template.id',
  `network_id` bigint unsigned COMMENT 'network this container cluster uses',
  `node_count` bigint NOT NULL default '0',
  `account_id` bigint unsigned NOT NULL COMMENT 'owner of this cluster',
  `domain_id` bigint unsigned NOT NULL COMMENT 'owner of this cluster',
  `state` char(32) NOT NULL COMMENT 'current state of this cluster',
  `key_pair` varchar(40),
  `cores` bigint unsigned NOT NULL COMMENT 'number of cores',
  `memory` bigint unsigned NOT NULL COMMENT 'total memory',
  `endpoint` varchar(255) COMMENT 'url endpoint of the container cluster manager api access',
  `console_endpoint` varchar(255) COMMENT 'url for the container cluster manager dashboard',
  `created` datetime NOT NULL COMMENT 'date created',
  `removed` datetime COMMENT 'date removed if not null',
  `gc` tinyint unsigned NOT NULL DEFAULT 1 COMMENT 'gc this container cluster or not',
  CONSTRAINT `fk_cluster__zone_id` FOREIGN KEY `fk_cluster__zone_id` (`zone_id`) REFERENCES `data_center` (`id`) ON DELETE CASCADE,
  CONSTRAINT `fk_cluster__service_offering_id` FOREIGN KEY `fk_cluster__service_offering_id` (`service_offering_id`) REFERENCES `service_offering`(`id`) ON DELETE CASCADE,
  CONSTRAINT `fk_cluster__template_id` FOREIGN KEY `fk_cluster__template_id`(`template_id`) REFERENCES `vm_template`(`id`) ON DELETE CASCADE,
  CONSTRAINT `fk_cluster__network_id` FOREIGN KEY `fk_cluster__network_id`(`network_id`) REFERENCES `networks`(`id`) ON DELETE CASCADE,
  PRIMARY KEY(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE IF NOT EXISTS `cloud`.`container_cluster_vm_map` (
  `id` bigint unsigned NOT NULL auto_increment COMMENT 'id',
  `cluster_id` bigint unsigned NOT NULL COMMENT 'cluster id',
  `vm_id` bigint unsigned NOT NULL COMMENT 'vm id',
  PRIMARY KEY(`id`),
  CONSTRAINT `container_cluster_vm_map_cluster__id` FOREIGN KEY `container_cluster_vm_map_cluster__id`(`cluster_id`) REFERENCES `container_cluster`(`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE IF NOT EXISTS `cloud`.`container_cluster_details` (
  `id` bigint unsigned NOT NULL auto_increment COMMENT 'id',
  `cluster_id` bigint unsigned NOT NULL COMMENT 'cluster id',
  `username` varchar(255) NOT NULL,
  `password` varchar(255) NOT NULL,
  `registry_username` varchar(255),
  `registry_password` varchar(255),
  `registry_url` varchar(255),
  `registry_email` varchar(255),
  `network_cleanup` tinyint unsigned NOT NULL DEFAULT 1 COMMENT 'true if network needs to be cleaned up on deletion of container cluster. Should be false if user specified network for the cluster',
  PRIMARY KEY(`id`),
  CONSTRAINT `container_cluster_details_cluster__id` FOREIGN KEY `container_cluster_details_cluster__id`(`cluster_id`) REFERENCES `container_cluster`(`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;