...

The purpose of this document is to present the functional requirements for supporting native functionality in CloudStack to provision containers, and to detail the design aspects of how that functionality will be achieved.

Scope

The scope of this proposal is limited to using Kubernetes as the container cluster manager.

Functional specification

Container Cluster

...

Garbage collection

Garbage collection shall be implemented as a background task that cleans up the resources of a container cluster. Cluster resources are freed up in the following cases:

  • Starting a container cluster fails, resulting in cleanup of the provisioned resources (Starting → Expunging → Destroyed).
  • Deleting a container cluster (Stopped → Expunging → Destroyed and Alert → Expunging → Destroyed).

If there are failures in cleaning up resources and the cleanup cannot proceed, the container cluster is moved from 'Expunging' state to 'Expunge' state. The garbage collector will periodically loop through the list of container clusters in 'Expunge' state and try to free the resources held by each container cluster.
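The retry behaviour described above can be illustrated with a minimal sketch. It is not the CloudStack implementation: `ContainerCluster`, `gc_pass` and `release_resources` are hypothetical names, and the sketch assumes that a successful retry ends in 'Destroyed' and that the `gc` flag from the schema decides whether a cluster is eligible for collection.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class ContainerCluster:
    # Hypothetical in-memory stand-in for a row of the container_cluster table.
    id: int
    state: str
    gc: bool = True


def gc_pass(clusters: Iterable[ContainerCluster],
            release_resources: Callable[[ContainerCluster], None]) -> List[int]:
    """Run one garbage-collection pass and return the ids that were cleaned up."""
    cleaned = []
    for cluster in clusters:
        # Only clusters parked in 'Expunge' with gc enabled are retried
        # (using the gc flag as a filter is an assumption of this sketch).
        if cluster.state != "Expunge" or not cluster.gc:
            continue
        try:
            release_resources(cluster)
        except Exception:
            # Cleanup still cannot proceed: leave the cluster in 'Expunge'
            # so that the next periodic pass retries it.
            continue
        cluster.state = "Destroyed"  # assumption: a successful retry ends here
        cleaned.append(cluster.id)
    return cleaned
```

A background task would call `gc_pass` at a fixed interval. Because failed clusters stay in 'Expunge', the pass is safe to repeat.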

Cluster state synchronization

The state of the container cluster is the 'desired state' of the cluster as intended by the user, or the system's logical view of the container cluster. However, there are various scenarios where the desired state of the container cluster is not in sync with the state that can be inferred from the actual physical infrastructure. For example, consider a container cluster in 'Running' state with a cluster size of 10 VMs, all in running state. Due to host failures, some of the VMs may get stopped at a later point. The desired state of the container cluster is still a cluster with 10 VMs running and operationally ready (with respect to container provisioning), but the state at the resource layer is different. So we need a mechanism to ensure that:

  • The cluster is in the desired state at the resource/infrastructure layer. This could mean provisioning new VMs or deleting VMs in the cluster to reach the desired state of the container cluster.
  • Conversely, when reconciliation cannot happen, the state of the cluster reflects this, so that recovery can be attempted at a later point.

The following mechanism will be implemented:

  • A state 'Alert' will be maintained that indicates the container cluster is not in its desired state.
  • A state synchronization background task will run periodically to infer whether the cluster is in its desired state. If not, the cluster will be marked as being in 'Alert' state.
  • A recovery action will try to recover the cluster.

State transitions in the FSM where a container cluster ends up in 'Alert' state:

  • Failure in the middle of a scale in/out, resulting in a cluster size (number of VMs) not equal to the expected size.
  • Failure in stopping a cluster, leaving some VMs in running state.
  • A difference in states detected by the state synchronization thread.
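As an illustration of the last case, the check made by the state synchronization task could look roughly like the sketch below. The function name `sync_state` and its parameters are hypothetical. The sketch only covers a cluster whose desired state is 'Running' and only compares the desired size with the number of VMs actually running; the real task may inspect more than that.

```python
from typing import Iterable


def sync_state(state: str, node_count: int, vm_states: Iterable[str]) -> str:
    """Return the state the cluster should be marked with after one sync check."""
    if state != "Running":
        # This sketch only reconciles clusters that are expected to be running.
        return state
    running = sum(1 for s in vm_states if s == "Running")
    # A mismatch between the desired size and the running VMs means the
    # cluster is no longer in its desired state.
    return "Running" if running == node_count else "Alert"
```

For the example above, a cluster of size 10 with only 8 VMs still running would be marked 'Alert', and the recovery action would then be responsible for bringing it back to the desired size.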

Provisioning Kubernetes container cluster manager

A CoreOS template shall be used to provision the container cluster VMs. Setting up a cluster VM as a Kubernetes master or node is done through a cloud-config script in CoreOS. CloudStack shall pass the necessary cloud-config script as base64-encoded user data. Cloud-con
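The encoding step for the user data can be sketched as follows. The helper `encode_user_data` and the sample cloud-config are hypothetical and purely illustrative; the actual master and node scripts are not reproduced here.

```python
import base64


def encode_user_data(cloud_config: str) -> str:
    """Base64-encode a cloud-config document for use as VM user data."""
    return base64.b64encode(cloud_config.encode("utf-8")).decode("ascii")


# Hypothetical minimal cloud-config, used only to demonstrate the encoding.
cloud_config = "#cloud-config\nhostname: k8s-node-1\n"
user_data = encode_user_data(cloud_config)
```

The resulting string is plain ASCII, so it can be passed as a request parameter without further escaping beyond normal URL encoding.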

Schema changes

 

CREATE TABLE IF NOT EXISTS `cloud`.`container_cluster` (
    `id` bigint unsigned NOT NULL auto_increment COMMENT 'id',
    `uuid` varchar(40),
    `name` varchar(255) NOT NULL,
    `description` varchar(4096) COMMENT 'display text for this container cluster',
    `zone_id` bigint unsigned NOT NULL COMMENT 'zone id',
    `service_offering_id` bigint unsigned COMMENT 'service offering id for the cluster VM',
    `template_id` bigint unsigned COMMENT 'vm_template.id',
    `network_id` bigint unsigned COMMENT 'network this container cluster uses',
    `node_count` bigint NOT NULL default '0',
    `account_id` bigint unsigned NOT NULL COMMENT 'owner of this cluster',
    `domain_id` bigint unsigned NOT NULL COMMENT 'owner of this cluster',
    `state` char(32) NOT NULL COMMENT 'current state of this cluster',
    `key_pair` varchar(40),
    `cores` bigint unsigned NOT NULL COMMENT 'number of cores',
    `memory` bigint unsigned NOT NULL COMMENT 'total memory',
    `endpoint` varchar(255) COMMENT 'url endpoint of the container cluster manager api access',
    `console_endpoint` varchar(255) COMMENT 'url for the container cluster manager dashboard',
    `created` datetime NOT NULL COMMENT 'date created',
    `removed` datetime COMMENT 'date removed if not null',
    `gc` tinyint unsigned NOT NULL DEFAULT 1 COMMENT 'gc this container cluster or not',
    CONSTRAINT `fk_cluster__zone_id` FOREIGN KEY `fk_cluster__zone_id` (`zone_id`) REFERENCES `data_center` (`id`) ON DELETE CASCADE,
    CONSTRAINT `fk_cluster__service_offering_id` FOREIGN KEY `fk_cluster__service_offering_id` (`service_offering_id`) REFERENCES `service_offering`(`id`) ON DELETE CASCADE,
    CONSTRAINT `fk_cluster__template_id` FOREIGN KEY `fk_cluster__template_id`(`template_id`) REFERENCES `vm_template`(`id`) ON DELETE CASCADE,
    CONSTRAINT `fk_cluster__network_id` FOREIGN KEY `fk_cluster__network_id`(`network_id`) REFERENCES `networks`(`id`) ON DELETE CASCADE,
    PRIMARY KEY(`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


CREATE TABLE IF NOT EXISTS `cloud`.`container_cluster_vm_map` (
    `id` bigint unsigned NOT NULL auto_increment COMMENT 'id',
    `cluster_id` bigint unsigned NOT NULL COMMENT 'cluster id',
    `vm_id` bigint unsigned NOT NULL COMMENT 'vm id',
    PRIMARY KEY(`id`),
    CONSTRAINT `container_cluster_vm_map_cluster__id` FOREIGN KEY `container_cluster_vm_map_cluster__id`(`cluster_id`) REFERENCES `container_cluster`(`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


CREATE TABLE IF NOT EXISTS `cloud`.`container_cluster_details` (
    `id` bigint unsigned NOT NULL auto_increment COMMENT 'id',
    `cluster_id` bigint unsigned NOT NULL COMMENT 'cluster id',
    `username` varchar(255) NOT NULL,
    `password` varchar(255) NOT NULL,
    `registry_username` varchar(255),
    `registry_password` varchar(255),
    `registry_url` varchar(255),
    `registry_email` varchar(255),
    `network_cleanup` tinyint unsigned NOT NULL DEFAULT 1 COMMENT 'true if network needs to be cleaned up on deletion of container cluster. Should be false if user specified network for the cluster',
    PRIMARY KEY(`id`),
    CONSTRAINT `container_cluster_details_cluster__id` FOREIGN KEY `container_cluster_details_cluster__id`(`cluster_id`) REFERENCES `container_cluster`(`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

 

...