CloudStack Chimp

Abstract

This design document proposes a new CloudStack upgrade and DB migration command-line tool, "CloudChimp" or "cloudchimp". A similar DB-related attempt was made in the past, circa 4.2.

The aim is to build a new upgrade and DB migration tool that supports both development and production workflows, allowing a user to do a deterministic database upgrade from version A to version B. Currently, the database migration model employed by CloudStack is brittle and often breaks on upgrade. Furthermore, the supported upgrade paths are limited and do not cover all potential user scenarios. Because several major and minor versions are branched and developed in parallel, some upgrade paths only become possible or available in future releases.

High Level Goals and Requirements

The same tool is to be used during development and during operational deployment. Right now the DatabaseCreator (Java) does upgrades or DB deployment for developers, while cloudstack-setup-databases (Python) does it for users.
The database migration and upgrade logic needs to be separated from the management server.
The tool would allow upgrades from any version A to any version B such that A < B, with the limitation that operations are not reversible (we'll need to explore whether rollbacks can be supported, but given CloudStack's dependence on systemvm versions etc., it might be a hard problem). The assumption here is that the sysadmin backs up their database before using the tool.
The tool provides a dry-run mode to show the high-level operations it will perform without actually executing them.
The tool maintains a schema change log (sort of like source control) that is decoupled from the CloudStack version.
The tool would allow developers to write idempotent upgrade paths and typed SQL queries (or have some type/syntax checking built in so that changes can be verified). This means that if a path or SQL query is run again, it should result in the same final state.
Right now upgrading systemvms is an issue: the upgrade paths break if a systemvm is not found or not properly set up, which leaves the database in a broken state. We need to see how this can be fixed; this tool would aim to remedy this use case.
Good to have feature - Right now after upgrading CloudStack, there is no sanity checking process; we might want to identify what basic sanity checks we should do and then implement some basic data integrity checking that the tool can do for us (open for discussion and scope).
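To make the idempotency and change-log goals above more concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module purely as a stand-in for the real CloudStack database, and the table name schema_changelog, the change id format and the helper apply_change are all hypothetical names invented for illustration, not part of any existing CloudStack code.

```python
import sqlite3

# Hypothetical change log table; not an existing CloudStack table.
CHANGELOG_DDL = """
CREATE TABLE IF NOT EXISTS schema_changelog (
    change_id TEXT PRIMARY KEY,
    applied_at TEXT DEFAULT CURRENT_TIMESTAMP
)
"""


def apply_change(conn, change_id, statements):
    """Apply a change once; return True if applied, False if already logged."""
    conn.execute(CHANGELOG_DDL)
    row = conn.execute(
        "SELECT 1 FROM schema_changelog WHERE change_id = ?", (change_id,)
    ).fetchone()
    if row is not None:
        return False  # already applied, so running it again is a no-op
    # The context manager commits on success and rolls back an open
    # transaction on error.
    with conn:
        for sql in statements:
            conn.execute(sql)
        conn.execute(
            "INSERT INTO schema_changelog (change_id) VALUES (?)", (change_id,)
        )
    return True


def demo():
    """Apply the same (made-up) change twice against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    change = ["CREATE TABLE example_vm (id INTEGER PRIMARY KEY, name TEXT)"]
    first = apply_change(conn, "0001-create-example-vm", change)
    second = apply_change(conn, "0001-create-example-vm", change)
    logged = conn.execute("SELECT COUNT(*) FROM schema_changelog").fetchone()[0]
    conn.close()
    return first, second, logged
```

Because the log is keyed by change id rather than by CloudStack version, running the same change a second time leaves the schema and the log unchanged. A real implementation would also have to account for the fact that MySQL DDL statements commit implicitly, so rollback of a partially applied change cannot be taken for granted.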

Methodology

The bridge between the legacy upgrade infra and the new/future versions
We can start using this tool in 4.7 or above; that version becomes the pivot or bridge point. We might decide not to support older legacy DB upgrades at all, and have users upgrade to this pivot version, after which they can use the new tool (this of course needs some discussion and experimentation).
This needs discussion. My current thinking is that we refactor this tool out into its own repository, where it is maintained on a single branch (master, so no branching) like cloudmonkey, and with each CloudStack release we release this tool. For each branch where we want this tool, we can use git subtree or a submodule to bring this repo into the CloudStack repo, so developers don't have to use a separate repo or change their development pattern too much.

Development

There would be no explicit upgrade path mapping. Right now, for each upgrade path an UpgradeXtoY Java class implements the path, and a map of the paths going from version A to B is maintained separately.
The tool can view the upgrade paths (UpgradeXtoY classes) as edges and the versions (String version) as nodes, so the upgrade graph becomes a directed acyclic graph (DAG) in which a path between version A and B can be found using a shortest-path algorithm (such as Dijkstra's).
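The graph idea can be sketched as follows. This is only an illustration: the version strings and edges are made up, and because the edges here carry no weights the sketch uses a plain breadth-first search, which returns the path with the fewest upgrade steps (the same result Dijkstra's algorithm gives when every edge has equal weight). If upgrade steps were given costs, Dijkstra's would be the natural replacement.

```python
from collections import deque

# Hypothetical upgrade edges: each key can be upgraded directly to the
# versions listed for it. These are not the real CloudStack upgrade paths.
UPGRADE_EDGES = {
    "4.5.0": ["4.5.1", "4.6.0"],
    "4.5.1": ["4.5.2"],
    "4.5.2": ["4.6.0"],
    "4.6.0": ["4.7.0"],
    "4.7.0": [],
}


def find_upgrade_path(edges, start, target):
    """Return the shortest list of versions from start to target, or None."""
    if start == target:
        return [start]
    previous = {start: None}  # maps each visited version to its predecessor
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, []):
            if nxt in previous:
                continue  # already reached by a path that is at least as short
            previous[nxt] = current
            if nxt == target:
                # Walk the predecessor links back to the start version.
                path = [nxt]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None  # no upgrade path exists (for example, a downgrade)
```

For example, find_upgrade_path(UPGRADE_EDGES, "4.5.0", "4.7.0") skips the 4.5.x minor releases and goes through 4.6.0, while asking for a path from "4.7.0" back to "4.5.0" yields None, matching the A < B restriction.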

Usage

List of cmd line arguments to support:
- CloudStack DB name, user, password; DB host ip and port
- Dry run option
- Verbose option (shows sql queries or operations)
- Upgrade related options etc.
List of operations:
- init: deploy a fresh db
- create: create a new migration, scaffolding for developers
- log: prints the db schema change log
- migrate: migrate from X to Y (with --dry-run option)
- status: show the status of the current CloudStack DB etc.
- check: perform some sanity checks? (good to have)
- upgradesystemvm (good to have)
- adhoc: run adhoc sql queries? (good to have?)
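As a rough sketch of how the arguments and operations listed above could map onto a command line, here is one possible layout using Python's standard argparse module. Every flag name and default value below (--db-name, --to-version, port 3306 and so on) is an assumption made for illustration; the actual interface is open for discussion, and the good-to-have operations are left out.

```python
import argparse


def build_parser():
    """Build a hypothetical cloudchimp argument parser."""
    parser = argparse.ArgumentParser(prog="cloudchimp")
    # Connection options; the flag names and defaults are assumptions.
    parser.add_argument("--db-name", default="cloud")
    parser.add_argument("--db-user", default="cloud")
    parser.add_argument("--db-password", default=None)
    parser.add_argument("--db-host", default="localhost")
    parser.add_argument("--db-port", type=int, default=3306)
    parser.add_argument("--verbose", action="store_true",
                        help="show sql queries or operations")

    # One sub-command per operation from the list above.
    sub = parser.add_subparsers(dest="operation", required=True)
    sub.add_parser("init", help="deploy a fresh db")
    create = sub.add_parser("create", help="scaffold a new migration")
    create.add_argument("name")
    sub.add_parser("log", help="print the db schema change log")
    migrate = sub.add_parser("migrate", help="migrate from X to Y")
    migrate.add_argument("--from-version")
    migrate.add_argument("--to-version", required=True)
    migrate.add_argument("--dry-run", action="store_true")
    sub.add_parser("status", help="show the current db status")
    sub.add_parser("check", help="run sanity checks")
    return parser
```

With this layout, a dry run against a remote database would look like: cloudchimp --db-host 10.0.0.5 migrate --to-version 4.7.0 --dry-run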
Typical upgrade/migration process:

- Operator shuts down all CloudStack mgmt and usage servers
- Backs up the database
- Installs or upgrades the latest tool
- Specifies the CloudStack version or change/sql number they want to upgrade to and runs the tool in dry run mode. The tool performs checks to figure out things such as whether systemvms are properly set up, and summarizes the upgrade results.
- If all goes well, operator upgrades the db and cloudstack mgmt server among other packages.
- The operator or user goes to the next meetup or CCC conf or on ML to share their experience or report issues
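
The dry-run step in the process above could look something like the following sketch. It assumes pre-checks are supplied as (name, callable) pairs and upgrade steps as (description, sql) pairs; these shapes, and the function name dry_run_upgrade, are hypothetical and only meant to show that a dry run evaluates the checks and reports the plan without executing any SQL.

```python
def dry_run_upgrade(checks, steps):
    """Run the pre-checks and describe the steps without executing any of them.

    checks: list of (name, zero-argument callable returning True or False)
    steps:  list of (description, sql) pairs; the sql is never executed here
    """
    failures = [name for name, check in checks if not check()]
    return {
        "ok": not failures,
        "failed_checks": failures,
        "planned_steps": [description for description, _sql in steps],
    }


# Made-up example input: one passing check, one failing check, one step.
EXAMPLE_CHECKS = [
    ("database reachable", lambda: True),
    ("systemvm template registered", lambda: False),
]
EXAMPLE_STEPS = [
    ("add example column", "ALTER TABLE example_vm ADD COLUMN note TEXT"),
]
```

Calling dry_run_upgrade(EXAMPLE_CHECKS, EXAMPLE_STEPS) reports the failed systemvm check and the planned step, so the operator can fix the problem before any change touches the database.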