Object Repository Analysis

The object repository generalizes the kind of repository we need to implement the store, license and version repositories. Because these three are so similar, they are analyzed together.

Problem

An object repository contains versioned objects. We can split this analysis into two parts or layers:

  1. The repository itself, which just deals with storing and retrieving versioned objects.
  2. The manipulation of these objects.

The repository has the following use cases:

  • Checkout, which checks out a specific version, or the latest one if no version is specified.
  • Update, which updates an existing object to the latest version, or to a specific one if a version is specified.
  • Commit, which commits the changed object to a new version, possibly dealing with conflicts in the process.
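The three use cases above can be sketched as a small in-memory repository. This is only an illustration under stated assumptions: the names InMemoryRepository and WorkingCopy are hypothetical, versions are numbered from 1, a conflict is simply reported as an exception, and update replaces the working copy instead of merging local changes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not an existing API: a repository that keeps every
// committed version of one object in memory.
final class InMemoryRepository<T> {

    /** A checked-out object plus the version it was based on. */
    static final class WorkingCopy<V> {
        private final V object;
        private final long baseVersion;

        WorkingCopy(V object, long baseVersion) {
            this.object = object;
            this.baseVersion = baseVersion;
        }

        V object() { return object; }
        long baseVersion() { return baseVersion; }
    }

    // Index i holds version i + 1.
    private final List<T> versions = new ArrayList<>();

    InMemoryRepository(T initial) {
        versions.add(initial);
    }

    synchronized long latestVersion() {
        return versions.size();
    }

    /** Checkout of the latest version. */
    synchronized WorkingCopy<T> checkout() {
        return checkout(latestVersion());
    }

    /** Checkout of a specific version. */
    synchronized WorkingCopy<T> checkout(long version) {
        if (version < 1 || version > versions.size()) {
            throw new IllegalArgumentException("No such version: " + version);
        }
        return new WorkingCopy<>(versions.get((int) version - 1), version);
    }

    /** Update: replaces a stale working copy with the latest version (no merge). */
    synchronized WorkingCopy<T> update(WorkingCopy<T> stale) {
        return checkout();
    }

    /** Commit: stores a new version, or fails if the working copy is out of date. */
    synchronized long commit(WorkingCopy<T> base, T changed) {
        if (base.baseVersion() != latestVersion()) {
            throw new IllegalStateException("Conflict: based on version "
                    + base.baseVersion() + ", latest is " + latestVersion());
        }
        versions.add(changed);
        return latestVersion();
    }
}
```

A client that hits the conflict would call update and then retry the commit. A real implementation would likely need a smarter conflict strategy than rejecting the commit outright.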

Which manipulations are actually required depends on who is using the repository:

  • A client will do the most complex manipulations:
    • filtering the graph to show only relevant subsets
    • querying the graph to find certain objects
    • editing the graph
    • access control that restricts visibility and/or editability of the graph
  • A relay server will only get part of the graph
  • A server, or anything that talks to management agents, will do specific queries
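To make the client-side manipulations more concrete, here is a minimal sketch of filtering, querying and a visibility check on an in-memory set of objects. Everything in it is an assumption made for illustration: the GraphView and Node types, the name, type and owner fields, and the use of plain java.util.function.Predicate instead of a query language. Editing is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch: a read-only view on the objects in a repository.
final class GraphView {

    /** Hypothetical repository object; the fields are illustrative only. */
    static final class Node {
        final String name;
        final String type;
        final String owner;

        Node(String name, String type, String owner) {
            this.name = name;
            this.type = type;
            this.owner = owner;
        }
    }

    private final List<Node> nodes;
    // Access control: only nodes passing this check are ever exposed.
    private final Predicate<Node> visible;

    GraphView(List<Node> nodes, Predicate<Node> visible) {
        this.nodes = new ArrayList<>(nodes);
        this.visible = visible;
    }

    /** Filtering: returns the visible nodes that match the given criterion. */
    List<Node> filter(Predicate<Node> relevant) {
        List<Node> result = new ArrayList<>();
        for (Node node : nodes) {
            if (visible.test(node) && relevant.test(node)) {
                result.add(node);
            }
        }
        return result;
    }

    /** Querying: finds a single visible node by name, if there is one. */
    Optional<Node> find(String name) {
        for (Node node : nodes) {
            if (visible.test(node) && node.name.equals(name)) {
                return Optional.of(node);
            }
        }
        return Optional.empty();
    }
}
```

Because the visibility check is applied inside every operation, callers cannot accidentally bypass it. The same view would work on a partial graph, such as the one a relay server receives.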

Context

Possible solutions

For the repository, it probably makes sense to have both stream-based and object-based access (similar to SAX and DOM). In general we assume the whole graph fits in memory. If it does not, stream-based access should be enough. For stream-based access, we could use XStream.
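The SAX and DOM comparison can be illustrated with the XML parsers that ship with the JDK. This sketch does not use XStream, and the XML layout (a repository element containing object elements with a name attribute) is purely hypothetical. It only shows the difference between reacting to events as they stream past and loading the whole tree before asking it questions.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch comparing the two access styles on the same input.
final class AccessStyles {

    // Made-up sample data, not an actual repository format.
    static final String SAMPLE =
            "<repository><object name=\"a\"/><object name=\"b\"/></repository>";

    /** Stream-based (SAX): handles each element as it is read, keeps no tree. */
    static List<String> namesByStream(String xml) {
        final List<String> names = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes attributes) {
                            if ("object".equals(qName)) {
                                names.add(attributes.getValue("name"));
                            }
                        }
                    });
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return names;
    }

    /** Object-based (DOM): loads the whole document, then navigates it. */
    static List<String> namesByTree(String xml) {
        List<String> names = new ArrayList<>();
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = document.getElementsByTagName("object");
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("name"));
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return names;
    }
}
```

Both methods return the same names for the same input. The stream variant never holds more than one element at a time, which is why it remains usable when the graph does not fit in memory.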

For the manipulation, we need a system that can run queries on streams, and one that can run them on object graphs in memory. We have several options here:

  • JXPath, which can operate on XML or object graphs.
  • JoSQL, a SQL dialect that operates on object graphs.
  • LDAP queries, as included in OSGi.

Also, we need to define where to do the actual access control. We have two choices here:

  1. on the server
  2. on the client

Doing it on the client seems to be the most straightforward, and is analogous to what we've been doing in the past. On the other hand, if we assume we don't trust the client, we might want to do everything on the server. Doing everything on the server does mean you're sending incomplete graphs to the client, which might complicate manipulations.

Discussion

Using XStream makes a lot of sense. We can use it both to stream over the network, and to store revisions on disk.

For manipulation, LDAP queries are the minimum we should support. We might want to augment that with something like JXPath though. The downside of these technologies is that they provide no compile-time safety: queries can only be debugged at runtime, which is something we should try to avoid whenever possible.

Let's start by doing access control on the client. If we really run into problems we can always move that code to the server.

Conclusion

  • Use XStream as a repository implementation.
  • Start with LDAP or JXPath for querying.