Repository model analysis
We use different object repositories to store the actual information that is needed for the provisioning server. This analysis takes the object repository analysis as a basis and defines what data goes in what repository.
Roles
The first step in this analysis is to identify the roles and use cases that are related to the information that needs to be stored. As can be seen below, some of the roles have multiple operations they can perform and access control might restrict some of those rights in some cases.
- The release manager is responsible for managing the store repository. He:
  - adds artifacts;
  - creates licenses;
  - associates the two.
- The gateway operator manages a set of gateways designated to him. He:
  - creates gateways;
  - associates licenses to gateways.
- A field service engineer physically goes to gateways that are not connected to the provisioning system directly to deploy software on them.
- The remote gateway operator manages a set of gateways too, but only gets access to a limited set of licenses.
- The license manager is a role that needs to be further defined (Fred will come up with a user story that will have to be incorporated into this scenario).
Scenarios
All scenarios are based on the fact that you work with repositories. Repositories can be checked out, optionally modified, and committed back. Only data that has been committed is propagated through the system; you cannot base anything on locally modified working copies. When committing changes, you can get conflicts. This happens if someone else committed something while you were editing. You then have to either check out the latest version and re-apply your changes, or do a three-way merge before trying again. A notification system that sends out messages about who is editing or committing what might help you detect such conflicts before they occur. In a highly distributed system, however, we cannot prevent them.
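The checkout, commit and conflict cycle described above can be sketched as follows. This is a minimal illustration that assumes a single integer version per repository; the names `Repository`, `WorkingCopy` and `ConflictError` are hypothetical and not part of the actual system.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Raised when the repository changed after the working copy was checked out."""


@dataclass
class WorkingCopy:
    base_version: int  # repository version this copy was checked out from
    data: dict         # local, uncommitted state


@dataclass
class Repository:
    version: int = 0
    data: dict = field(default_factory=dict)

    def checkout(self) -> WorkingCopy:
        # Hand out a copy; local edits stay invisible until committed.
        return WorkingCopy(self.version, dict(self.data))

    def commit(self, copy: WorkingCopy) -> int:
        # Reject the commit if someone else committed in the meantime.
        if copy.base_version != self.version:
            raise ConflictError(
                f"based on version {copy.base_version}, repository is at {self.version}"
            )
        self.data = dict(copy.data)
        self.version += 1
        copy.base_version = self.version
        return self.version


repo = Repository()
alice = repo.checkout()
bob = repo.checkout()

alice.data["gateway-1"] = "license-A"
repo.commit(alice)  # succeeds, the repository moves to version 1

bob.data["gateway-2"] = "license-B"
try:
    repo.commit(bob)  # bob's copy is still based on version 0
    conflict = False
except ConflictError:
    conflict = True
    # Simplest recovery: check out the latest version and re-apply the change.
    bob = repo.checkout()
    bob.data["gateway-2"] = "license-B"
    repo.commit(bob)
```

The sketch only shows the "check out again and re-apply" recovery; a three-way merge would additionally need the common base version of the data.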
Looking at the topology of the system, we have three types of nodes:
- clients, that have a UI and are used by humans to manipulate data in one or more repositories;
- servers, that contain repositories and can replicate data, where servers can also be relay servers;
- gateways, containing the management agent and the installed software components.
On-line scenario
The on-line scenario has the simplest network topology. Everything is on-line. You have two roles: release manager and gateway operator. The release manager is responsible for the store, linking bundles to licenses. He can see a list of gateways in the system to determine the impact of his edits on these gateways before he actually commits them. In fact, he can see two statuses here: the status of a gateway compared to what is approved, and the status compared to what is currently on the gateway (where the latter is derived from the audit log information). The gateway operator creates gateways and links them to licenses. Furthermore, at some point, the gateway operator will detect that updates exist for some gateways (by looking at both the store and the gateway operator repository). Per gateway, he can now approve such a set of bundles. This set will be tagged with a version for that gateway and will eventually be installed by the management agent.
By giving each role its own repository, both can work independently of each other. By recording the approval of sets of bundles in a separate repository, you can distribute that information without having to distribute the other repositories. On the other hand, for the on-line scenario, as long as everything is on the same server anyway, you could still put everything in one repository. You will then get more conflicts, for example when a release manager is reorganizing the store whilst the gateway operator is adding gateways and approving versions for others. Although some of these changes do not logically conflict, you still have to spend the effort of actually performing the merges. By splitting up the repositories, you reduce the number of merges considerably.
Off-line scenario
The off-line scenario has a network topology where the gateways are never directly linked to the internet (and therefore not to the (relay) server either). This scenario introduces the role of the field service engineer (FSE), who goes to a gateway and connects to it locally. The roles of both the release manager and the (remote) gateway operator are no different from the on-line scenario, because neither is disconnected from the network. Off-line applies to the gateways and the FSE role only.
For the FSE to provision software to a gateway, he needs to have at least a versioned set of bundles for that gateway on his computer. The management agent then decides whether this is indeed a newer version (another FSE might have beaten him to it) and requests it if it is. From the management agent's point of view, versions can only go up.
The FSE has three other options when he's connected to a gateway:
- Roll back to an older version. This means the gateway is put in a "pinned" state where it no longer updates. Instead it stays at a fixed version (the pinned version). Of course, this version has to be available either from the cache of the gateway or from the FSE's computer.
- Mark a version as "bad" for this gateway. This is a variation on the option above, but it does not "pin" the gateway to a specific version. Instead it just marks a version as bad, which means that this version will no longer be deployed. Effectively, that filters out such versions from the list of "available versions" for that gateway. In other words, if you have versions 1, 2 and 3, and you mark 3 as bad, the gateway will go to 2. However, as soon as 4 becomes available, it will go to 4. Marking versions as bad will also be reflected in the audit log.
- Put the gateway in an "unmanaged" state, where the FSE can deploy whatever he wants. Until you change the state, it will remain in that "unmanaged" state. It's the responsibility of the FSE to make sure that the software that is on the gateway eventually ends up in the system again (because he might have created some bundles on the fly, bundles that don't exist anywhere, not even in source form).
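The "pinned" and "bad" options above boil down to a rule for choosing the version a gateway should run. The following sketch illustrates that rule under some assumptions: versions are plain integers, the function name `target_version` is hypothetical, and the "unmanaged" state is left out because the system does not choose a version in that state.

```python
from typing import Optional, Set


def target_version(
    available: Set[int], bad: Set[int], pinned: Optional[int] = None
) -> Optional[int]:
    """Pick the version a managed gateway should run, or None if there is none."""
    if pinned is not None:
        # A pinned gateway stays at its fixed version and ignores updates.
        return pinned
    # Versions marked as bad are filtered out of the available versions.
    candidates = available - bad
    return max(candidates) if candidates else None


# Versions 1, 2 and 3 exist and 3 is marked bad: the gateway goes to 2.
after_marking_bad = target_version({1, 2, 3}, bad={3})

# As soon as version 4 becomes available, the gateway goes to 4.
after_new_release = target_version({1, 2, 3, 4}, bad={3})

# A gateway pinned to version 2 stays there, whatever else is available.
while_pinned = target_version({1, 2, 3, 4}, bad=set(), pinned=2)
```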
The off-line scenario dictates that you must be able to take a copy of a repository with you that contains a versioned (and approved) set of bundles for a gateway. You don't need to take everything with you. By having the store and gateway operator repositories as separate entities, you can skip copying those. Also, by having the repository with the versioned sets, you don't need to do calculations anymore when a gateway connects. The interaction is simple: the gateway checks whether you have a newer version than it has, and if you do, the gateway gets that set of bundles and tries to install it.
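As a rough illustration of this interaction, the copy of the deployment repository that the FSE carries can be thought of as a lookup from gateway to its latest approved set. The structure below is an assumption made for the example (integer versions, the `bundles_to_install` helper and the bundle names are all hypothetical), not the actual repository format.

```python
from typing import Dict, List, Tuple

# gateway ID -> (approved version, bundles in that version); hypothetical layout
DeploymentCopy = Dict[str, Tuple[int, List[str]]]


def bundles_to_install(
    gateway_id: str, installed_version: int, copy: DeploymentCopy
) -> Tuple[int, List[str]]:
    """Return the version the gateway ends up at and the bundles it must fetch."""
    entry = copy.get(gateway_id)
    if entry is None:
        # The FSE carries nothing for this gateway.
        return installed_version, []
    version, bundles = entry
    if version <= installed_version:
        # Not newer: versions only go up, so nothing is requested.
        return installed_version, []
    return version, list(bundles)


fse_copy: DeploymentCopy = {
    "gateway-42": (3, ["org.example.log-1.1.0", "org.example.http-2.0.0"]),
}

newer = bundles_to_install("gateway-42", 2, fse_copy)
up_to_date = bundles_to_install("gateway-42", 3, fse_copy)
```

No store or gateway operator data is consulted here, which is the point of keeping the versioned sets in a repository of their own.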
Region scenario
The region scenario has a network topology where software is managed centrally and gateways are managed per region, meaning each region has its own set of gateways.
For the release manager this makes little difference compared to a scenario with only one region, since the release manager does not manage gateways anyway. The only thing he does is look at gateways and see if the actions he performs cause changes to these gateways. In the regional scenario this means he should have an overview of all gateways that are managed by the different regions.
For the gateway operator, the difference is that he sees and manages only his own gateways. Gateways for other regions are (probably) invisible to him (or at least he cannot modify them).
One way to implement the region scenario is to split the gateway operator repository into regions. That way each region can operate independently of the rest. This solves the problem of having to "hide" gateways from certain gateway operators and, again, it reduces the number of merges and merge conflicts you might have.
Multi-store scenario
This is a scenario where a gateway gets software (in the form of licenses) from more than one store. For the release manager of each individual store, this makes little difference. The only thing you might miss, as a release manager, is the ability to fully see what's installed on a gateway (if you're not allowed to see read-only copies of the other stores).
The gateway operator now has more than one store to choose licenses from. Links to licenses in the gateway operator repository will have to be tuples linking to both the unique identification of the store repository and the name of the license.
Models
When we combine these scenarios we get the following entity model:
The next step is to partition these entities into one or more repositories. The two extreme scenarios we have are:
- all entities go into one single repository;
- every entity (and relation) has its own repository.
Downsides of the first alternative are:
- everybody's working on the same repository, which means we run the risk of getting more merge conflicts;
- one single repository is harder to distribute: effectively, everybody will need to be connected to the same server.
Downside of the second alternative:
- as soon as you have to update more than one repository at the same time, you need transaction support over multiple repositories, and we don't have that.
So we need to find a balance between these two alternatives, which leads to the following partitioning of the repositories:
- The "bundle repository", which we map onto an OBR (OSGi Bundle Repository). Note that for other artifacts, we probably link to another type of external repository here.
- The "store repository", which links bundles (or other artifacts) to licenses. For bundles and other artifacts, only the metadata is stored, combined with a link to the actual data.
- The "gateway operator repository", which links licenses to gateways. Licenses are stored as references here (a reference to a license in a store). Gateways are created here.
- The "deployment repository", which links gateways to versioned sets of bundles.
You can have multiple OBRs, multiple stores, multiple gateway operator repositories and multiple deployment repositories. Gateways are globally unique and can only exist in one repository.
Store repository
The store is where you create new bundles (or other artifacts) and licenses. The store implementation can contain different grouping mechanisms.
For bundles or artifacts in general, we only store the metadata in the store. This data links to some URL where the actual bundle or artifact can be found (which might be inside an OBR or anywhere else that can be addressed by a URL).
Licenses are created in the store.
Associations between bundles and licenses can be either static or dynamic. The latter means that as soon as a new version of an existing bundle is added, it automatically gets associated with the licenses that previously had an association with an older version of that bundle.
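The difference between the two kinds of association can be illustrated with a small sketch. It assumes that a dynamic association simply moves to the newest version of the bundle; whether the older association is replaced or kept alongside is not specified here, so that is a choice made for the example. The names `Store`, `Association` and `add_bundle_version` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Version = Tuple[int, int, int]  # (major, minor, micro)


@dataclass
class Association:
    bundle: str
    version: Version
    dynamic: bool  # True: follows new versions, False: stays at this version


class Store:
    def __init__(self) -> None:
        self.licenses: Dict[str, List[Association]] = {}

    def associate(
        self, license_name: str, bundle: str, version: Version, dynamic: bool = False
    ) -> None:
        self.licenses.setdefault(license_name, []).append(
            Association(bundle, version, dynamic)
        )

    def add_bundle_version(self, bundle: str, version: Version) -> None:
        # Dynamic associations with an older version move to the new one;
        # static associations are left untouched.
        for associations in self.licenses.values():
            for assoc in associations:
                if assoc.dynamic and assoc.bundle == bundle and assoc.version < version:
                    assoc.version = version


store = Store()
store.associate("basic", "org.example.log", (1, 0, 0), dynamic=True)
store.associate("legacy", "org.example.log", (1, 0, 0), dynamic=False)
store.add_bundle_version("org.example.log", (1, 1, 0))
```

After the last call, the "basic" license refers to version 1.1.0 of the bundle, while the "legacy" license still refers to 1.0.0.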
Gateway operator repository
The gateway operator repository is where you create new gateways and link them to existing licenses. You can even link to licenses from different stores.
Gateways are created here. Each gateway includes its unique identification, together with other relevant metadata.
You link to existing licenses in stores. In the normal case, there will be only one store, but you might have more than one store. In any case, the unique identification of a license is the tuple (repository ID, license ID).
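A license reference of this kind could be modelled as a small immutable tuple type. The sketch below is only an illustration; `LicenseRef`, the store names and the `exists` helper are hypothetical.

```python
from typing import Dict, NamedTuple, Set


class LicenseRef(NamedTuple):
    repository_id: str  # unique identification of the store repository
    license_id: str     # name of the license within that store


# License names known to each store (hypothetical data).
stores: Dict[str, Set[str]] = {
    "store-a": {"basic", "premium"},
    "store-b": {"basic"},
}


def exists(ref: LicenseRef) -> bool:
    """Check that the referenced license is present in the referenced store."""
    return ref.license_id in stores.get(ref.repository_id, set())


# Two licenses with the same name from different stores stay distinct.
gateway_licenses: Dict[str, Set[LicenseRef]] = {
    "gateway-1": {LicenseRef("store-a", "basic"), LicenseRef("store-b", "basic")},
}
```

Because the repository ID is part of the key, the same license name can be used in several stores without ambiguity.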
Deployment repository
The deployment repository is where you store the latest approved set of bundles for each gateway. This set is labeled with a version. The set also refers to the versions of all repositories that were involved in its creation (usually a store and a license repository, but there might be more than one store involved). A version gets a name that conforms to the following pattern: "major.minor.micro_goversion". Major, minor and micro are determined by first establishing which bundles have changed and then picking the "biggest" version change for any one bundle. The "goversion" is the version of the gateway operator store that stores this approved version; it was added to avoid having to go through all older gateway operator repositories one by one.
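One possible reading of this naming rule is sketched below. Several details are assumptions made for the example and are not specified above: the "biggest" change is taken to be the most significant version segment that differs for any bundle, the corresponding segment of the deployment version is incremented and the lower segments are reset to zero, and an added or removed bundle is treated as a major change. The function names are hypothetical.

```python
from typing import Dict, Tuple

Version = Tuple[int, int, int]  # (major, minor, micro)
UNCHANGED = 3


def change_level(old: Version, new: Version) -> int:
    """Return 0 for a major change, 1 for minor, 2 for micro, 3 for none."""
    for index in range(3):
        if old[index] != new[index]:
            return index
    return UNCHANGED


def next_deployment_version(
    current: Version,
    old_bundles: Dict[str, Version],
    new_bundles: Dict[str, Version],
    goversion: int,
) -> str:
    level = UNCHANGED
    for name in set(old_bundles) | set(new_bundles):
        if name not in old_bundles or name not in new_bundles:
            # Assumption: adding or removing a bundle counts as a major change.
            level = 0
        else:
            level = min(level, change_level(old_bundles[name], new_bundles[name]))
    parts = list(current)
    if level != UNCHANGED:
        parts[level] += 1
        for lower in range(level + 1, 3):
            parts[lower] = 0
    return "{}.{}.{}_{}".format(parts[0], parts[1], parts[2], goversion)


old = {"org.example.log": (1, 0, 0), "org.example.http": (2, 1, 0)}
new = {"org.example.log": (1, 0, 1), "org.example.http": (2, 2, 0)}

# The log bundle has a micro change and the http bundle a minor change,
# so the minor segment of the deployment version is bumped.
name = next_deployment_version((1, 2, 3), old, new, goversion=17)
```

With these inputs the resulting name is "1.3.0_17".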