A page to capture thoughts and notes about the Tuscany SCA distributed runtime. Feel free to help in developing the ideas here. I've put the source for the diagrams on this page in my sandbox here
While the nature of a distributed runtime implies that more than one runtime of more than one type will be involved in a running SCA application, it seems sensible to work with distributing one type of runtime (java) before branching out.
Distributed SCA Runtimes - Notes
The assembly model specification [2] deals briefly with distributed runtimes in its discussion of SCA Domains:
"An SCA Domain represents a complete runtime configuration, potentially distributed over a series of interconnected runtime nodes."
The assembly spec, however, is not prescriptive about how an SCA Domain should be mapped and supported across multiple runtime nodes. Here I believe the term runtime node (or just node) is used to describe a process running an SCA runtime in which components can be run, e.g. the Java or C++ runtimes that Tuscany is developing.
Furthermore, the SCA specifications do not talk about the form of repository in which the metadata of the SCA Domain is held - although clearly the metadata must live somewhere. The Domain repository could take a variety of forms, from a simple shared set of directories to a sophisticated distributed database system. SCA runtimes should be able to interact with a variety of Domain repositories.
Motivation
Some use cases that provide motivation for distributing a domain
- Represent the widely distributed nature of a typical SOA so that SCA presents a cross enterprise description of assembled components
- Policy matching where components require particular resources and hence particular, and separate, nodes
- HA/Load balancing/Performance scenarios where a single component appears on multiple nodes
- Load balancing/Performance scenarios where the domain is spread across multiple nodes (same as 1 & 2 I believe)
- Dynamic wiring/Registry based service location, i.e. the SCA binding is called upon to automatically locate services based on registry entries (overlaps with all of the above)
Terminology
SCADomain, Composite, Component, Service, Reference - as described in the SCA specifications. Note that a Domain may span multiple runtime nodes. A Composite may also span multiple runtime nodes.
Distributed Domain
An SCA Domain (complete runtime configuration) that is "distributed over a series of interconnected runtime nodes".
Runtime
The logical container for one or more SCA Domains containing components. A runtime groups together one or more (distributed) runtime nodes.
Node
Provides an environment inside which SCA component instances execute. It's an operating system process, separate from other Nodes. Its form may be as simple as a single Java VM or it may take a more scalable and reliable form, such as a compute cluster.
Each node must be capable of supporting at least:
- one implementation type
- one binding type (which may be restricted to binding.sca)
A runtime node must be able to expose the service endpoints required by the components it runs. It must be able to support the client technology for the references of associated components.
Domain Node
The part of a Distributed Domain that runs on a Node.
Component Instance
The running component that services requests. A single component definition in a SCDL file may give rise to one or more component instances depending on how the component is scoped (using the @Scope annotation).
Scoping The Distribution Problem
There are many existing technologies that deal with managing compute nodes and job scheduling. So it's probably safe to start by ignoring the issue of how the system picks processors on which runtime nodes will run (1). So the runtime management of the nodes themselves is out of scope.
There are also many technologies that provide scalable, robust and/or high performance service hosting solutions. So we can also ignore the issue of how component instances are actually constructed as the runtime representation of components deployed to a runtime (3). For example, if a JVM clustering solution is chosen to implement a node then we assume that local method calls within that cluster will be handled by the clustering technology and no special action is required. If a higher level clustering technology is in operation then integration with the runtime is required. In this case, where each node in the cluster runs part of the domain and a component can be mapped to multiple nodes, the most natural integration point is the SCA binding, which must interact with the clustering technology in order to locate target component services.
So the initial area of consideration is how the components of a domain are associated with runtime nodes (2).
Cardinality
In the non-distributed case a single runtime node loads all contributions and runs all components.
In the distributed case, a Domain may span many nodes.
Each component must be associated (through specific configuration or some matching algorithm) with one or more nodes. If the same component appears in more than one node, the runtime is responsible for deciding how messages are routed to the correct node.
There is no restriction on the number of component instances a node can create, or indeed how these component instances are run. For example, all component instances could run in a single VM or be distributed across a number of VMs to provide improved performance or failover.
Some questions have been raised about cardinality
Should load balancing or HA type scenarios be able to be described in the topology by allowing components to be assigned to more than one node?
Answer: Yes. However, this may well be best handled by a "layered runtime" approach, where a whole cluster of nodes is presented to the rest of the distributed runtime as a single node.
Should a runtime be able to run more than one domain?
Answer: Yes. Multiple Domains can run on the same runtime. It is up to the runtime implementation to ensure that appropriate partitioning is achieved, since SCA Domains are intended to be isolated (for example, a reference in one domain cannot directly reference a service in another domain through its SCA component & service names).
Scenario - Distributed Calculator
Scenario - Web Application Cluster
A more specific scenario where a distributed domain is used to support an application within a web application configured as a cluster.
Managing The Distributed Domain
The logical view of how the different parts of the solution communicate is:
Messages - the application messages that flow between configured components. Messages will flow over bindings described explicitly in the assembly model or across the default binding used when no explicit binding is specified.
Configuration - In the distributed domain, configuration is shared across the nodes with which the domain is associated. This includes information about contributed resources, running components and their endpoints, and domain configuration items such as base URLs.
Events - as the domain runs, interesting events will occur. For example, a node fails and is restarted, meaning that a set of endpoints changes.
Components Of The Solution
Based on the calculator scenario we can imagine the following.
Interfaces
Node
- start(nodeUri)
- stop()
- joinDomain(domainUri)
- domainNodeConfigurationChange(domainUri)
ServiceDiscovery
- findServiceEndpoint(domainUri, serviceName) url
- registerServiceEndpoint(domainUri, serviceName, url)
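As a rough illustration of the two ServiceDiscovery operations listed above, here is a minimal in-memory sketch in Java. The Java signatures, the use of Optional for a missing endpoint, and the class name InMemoryServiceDiscovery are assumptions made for this example; they are not taken from the Tuscany code base.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical Java rendering of the ServiceDiscovery operations listed above.
interface ServiceDiscovery {
    // Returns the endpoint URL registered for the service, if there is one.
    Optional<String> findServiceEndpoint(String domainUri, String serviceName);

    // Records (or replaces) the endpoint URL for the service.
    void registerServiceEndpoint(String domainUri, String serviceName, String url);
}

// In-memory implementation (an assumption; suitable only for single-process experiments).
class InMemoryServiceDiscovery implements ServiceDiscovery {
    // domainUri -> (serviceName -> endpoint URL)
    private final Map<String, Map<String, String>> endpoints = new ConcurrentHashMap<>();

    @Override
    public Optional<String> findServiceEndpoint(String domainUri, String serviceName) {
        Map<String, String> services = endpoints.get(domainUri);
        if (services == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(services.get(serviceName));
    }

    @Override
    public void registerServiceEndpoint(String domainUri, String serviceName, String url) {
        endpoints.computeIfAbsent(domainUri, d -> new ConcurrentHashMap<>())
                 .put(serviceName, url);
    }
}
```

Keying first by domain keeps domains isolated from each other, so the same service name registered in two domains does not collide.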
DomainNode
- createDomainNode(domainUri, nodeUri)
- startDomainNode(domainUri, nodeUri)
- stopDomainNode(domainUri, nodeUri)
BaseUriMap
- setBaseUri(domainUri, nodeUri, protocol, uri)
- getBaseUri(domainUri, nodeUri, protocol) uri
ComponentMap
- addComponent(componentName, domainUri, nodeUri)
- removeComponent(componentName, domainUri, nodeUri)
- getComponents(domainUri, nodeUri) list of component names
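The ComponentMap operations above could be backed by something as simple as a nested map. The following Java sketch assumes string identifiers throughout and a hypothetical class name, InMemoryComponentMap; it only shows the bookkeeping, not how a real domain would persist or share it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory version of the ComponentMap operations listed above.
class InMemoryComponentMap {
    // domainUri -> (nodeUri -> component names, in insertion order)
    private final Map<String, Map<String, Set<String>>> assignments = new HashMap<>();

    public synchronized void addComponent(String componentName, String domainUri, String nodeUri) {
        assignments.computeIfAbsent(domainUri, d -> new HashMap<>())
                   .computeIfAbsent(nodeUri, n -> new LinkedHashSet<>())
                   .add(componentName);
    }

    public synchronized void removeComponent(String componentName, String domainUri, String nodeUri) {
        Map<String, Set<String>> nodes = assignments.get(domainUri);
        if (nodes == null) {
            return;
        }
        Set<String> components = nodes.get(nodeUri);
        if (components != null) {
            components.remove(componentName);
        }
    }

    // Returns a copy so that callers cannot modify the internal state.
    public synchronized List<String> getComponents(String domainUri, String nodeUri) {
        Map<String, Set<String>> nodes = assignments.get(domainUri);
        if (nodes == null || nodes.get(nodeUri) == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(nodes.get(nodeUri));
    }
}
```

Nothing here prevents the same component name being added to several nodes, which matches the cardinality discussion earlier on this page.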
ContributionManager
- addContribution(domainUri, contributionUri)
- removeContribution(domainUri, contributionUri)
ComponentManager - a version is already defined in host embedded
- startComponent(domainUri, componentUri)
- stopComponent(domainUri, componentUri)
Distributed Domain
- registerNode(domainUri, nodeUri, nodeManagementUrl)
- getDomainNodeConfiguration(domainUri, nodeUri) url
Event (I expect there is some specified interface we can use here)
- logEvent(domainUri,event)
- getEventLog(domainUri)
Walkthrough
1. Running a node
- Run a node exe giving it a node uri and a domain to join.
- The node exe will embed and start a Tuscany runtime.
- Node service is exposed
- Node will discover where the distributed domain is running
- in a file based scenario configuration is available locally
- discovery can be hardcoded if required.
- Create and start a domain node
2. Running the distributed domain
- Start the distributed domain exe giving
- Note the distributed domain may only exist in configuration files on disc and in this case no separate exe is required
- Gather together the domain configuration
- base uris
- Added contributions
- Components added to nodes
- Nodes will join the domain as they are started
- Provide domain node configuration to a node on request
- If the configuration changes notify each (affected) node
3. Node initial configuration
- Requests configuration for this domain node
- Configuration is supplied in the form of
- base uris
- contributions to load
- components to activate
- Contributions are loaded
- Gives rise to endpoints being registered with the distributed domain
4. Starting a domain node
- Domain node is activated
- currently gives rise to all domain components starting
5. Starting a component
- Start a named component
6. Stopping a component
- Stop a named component
7. Stopping a domain node
- Domain node is stopped
- all running components are stopped
8. Updating node configuration
- Distributed domain notifies all (affected) nodes
- Node retrieves domain node configuration (updates)
- New/Updated/Removed contribution
- Added/Removed components
- Currently incremental domain updates are not fully supported so we will have to go with wholesale reconfiguration
- Domain node is stopped
- Contributions are reprocessed
- Domain is restarted
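The wholesale reconfiguration described in step 8 can be sketched as a stop, reprocess, restart sequence. The class below is a hypothetical stand-in written for this page: the name ReconfigurableDomainNode, the string contribution URIs and the in-memory log are all assumptions, and real contribution processing is replaced by simply recording the new list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a domain node that can only be reconfigured wholesale.
class ReconfigurableDomainNode {
    private boolean running;
    private List<String> contributions = new ArrayList<>();
    private final List<String> log = new ArrayList<>();

    void start() {
        running = true;
        log.add("start");
    }

    void stop() {
        running = false;
        log.add("stop");
    }

    // Stand-in for contribution processing: just remembers the URIs.
    private void loadContributions(List<String> contributionUris) {
        contributions = new ArrayList<>(contributionUris);
        log.add("load " + contributionUris);
    }

    // Wholesale reconfiguration: stop the node if needed, reprocess everything, restart.
    void reconfigure(List<String> newContributionUris) {
        if (running) {
            stop();
        }
        loadContributions(newContributionUris);
        start();
    }

    boolean isRunning() { return running; }
    List<String> contributions() { return new ArrayList<>(contributions); }
    List<String> log() { return new ArrayList<>(log); }
}
```

Any requests that are in flight while the node is stopped would be lost in this scheme, which is the cost of not yet supporting incremental updates.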
9. Choosing a component instance
Assigning components to nodes defines the endpoints for a component's services. The distributed domain uses this information to create default bindings for the cross node wires. If a component is assigned to multiple nodes then the runtime is responsible for selecting the appropriate node based on scope, the conversational status of the target component and also unspecified goals such as load balancing.
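As one possible selection policy for a component assigned to several nodes, here is a small round-robin sketch in Java. It deliberately ignores scope and conversational state, which a real runtime would have to honour, and the class name RoundRobinNodeSelector is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical selector: rotates through the nodes that host a given component.
class RoundRobinNodeSelector {
    private final List<String> nodeUris;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinNodeSelector(List<String> nodeUris) {
        if (nodeUris.isEmpty()) {
            throw new IllegalArgumentException("component must be assigned to at least one node");
        }
        this.nodeUris = new ArrayList<>(nodeUris);
    }

    // Returns the node that should receive the next message.
    String selectNode() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), nodeUris.size());
        return nodeUris.get(index);
    }
}
```

A conversational component would need a sticky policy instead, so that every message in a conversation reaches the node holding that conversation's state.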
10. Events and Stats
- The hierarchy of components in the distributed domain
- The components running on a node
- Events/logs for the distributed domain or for a node
11. Node failure
- A failed node is restarted and reconfigures itself from the distributed domain
- any inflight requests are lost
- any ongoing conversations are lost unless they have been persisted by the runtime
- endpoints are re-registered when node restarts
- A failed node can be restarted in a different place
- Base uri configuration must be adjusted to take account of new location.
- endpoints are re-registered when node restarts
12. Distributed domain failure
- Nodes remain running in isolation.
- periodically trying to rediscover new distributed domain
- Restart distributed domain
- nodes should eventually rediscover it
SCA Binding
The SCABinding is the default binding used within an SCA assembly. In the single VM runtime case it implies local connections. In the distributed runtime case it hides all of the complexity of ensuring that components wired across runtime nodes are able to communicate.
When a message oriented binding is used here we benefit from the abstract nature of the endpoints, i.e. queues can be created given runtimeId/ServiceID and messages can be targeted at these queues without knowledge of where the message consumers are physically located.
When a point to point protocol is used a physical endpoint is required. So a registry mapping SCA bound services to their endpoints is required to allow the SCA binding to find the appropriate target. This registry can either be static, i.e. derived from the base urls given in the domain topology configuration, or dynamic in nature, i.e. set up at runtime.
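To make the static registry option concrete, the sketch below derives an endpoint from a node's base URL and the component and service names. The URL layout (base URL, then component name, then service name) is purely an assumption for illustration, as are the class name StaticEndpointRegistry and the service name used in the test; the page does not define how endpoint URLs are composed.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical static registry: endpoints are derived from topology configuration only.
class StaticEndpointRegistry {
    private final Map<String, String> nodeBaseUrls = new HashMap<>();   // node name -> base URL
    private final Map<String, String> componentNodes = new HashMap<>(); // component name -> node name

    void setBaseUrl(String nodeName, String baseUrl) {
        // Normalise so that joining path segments never produces "//".
        String trimmed = baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
        nodeBaseUrls.put(nodeName, trimmed);
    }

    void assignComponent(String componentName, String nodeName) {
        componentNodes.put(componentName, nodeName);
    }

    // Assumed layout: <base URL>/<component name>/<service name>.
    String endpointFor(String componentName, String serviceName) {
        String nodeName = componentNodes.get(componentName);
        if (nodeName == null) {
            throw new IllegalStateException("component not assigned to a node: " + componentName);
        }
        String baseUrl = nodeBaseUrls.get(nodeName);
        if (baseUrl == null) {
            throw new IllegalStateException("no base URL configured for node: " + nodeName);
        }
        return baseUrl + "/" + componentName + "/" + serviceName;
    }
}
```

A dynamic registry would expose the same lookup but be populated by nodes registering their endpoints as they start, rather than from configuration.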
Within the same domain/runtime multiple technologies may be required to implement the SCA binding as messages pass between different runtime node implementations.
Modelling The Distributed Domain
Using information from the SCA Assembly specification and the implied requirements of a distributed runtime we can determine what data is required to configure and control the distributed SCADomain.
SCADomain
  Name (DomainA)
  BaseURI
  Domain Level Composite
    Component (ComponentA)
      implementation
        composite
      Service
      Reference
  Installed Contributions
    Initial Package
    Contribution (file system, jar, zip etc)
      URI (ContributionA)
      /META-INF/
        sca-contribution.xml
          deployable (composite QName)
          import (namespace, location)
          export (namespace)
        sca-contribution-generated.xml
          deployable (composite QName)
          import (namespace, location)
          export (namespace)
        deployables
          *.composite
      *.composite
        URI
        Component (ComponentA)
          Service
          Reference
      Other Resources
        URI
    Dependent Contributions
      Contribution snapshot
    Deployment-time Composites
      *.composite

Over and above the contributed information we need to associate components with runtime nodes.

Runtime
  name (runtimeA)
  Node
    name (nodeA)
    DomainA
      scheme http://localhost:8080/acbd
      scheme https://localhost:442/abcd
      ComponentA
We know how SCDL is used to represent the application composites. We can view the runtime node configuration as a new set of components, interfaces, services and references. In SCA terms we can consider that each node implements a system composite that provides the service interfaces required to manage the node, for example.
<composite xmlns="http://www.osoa.org/xmlns/sca/1.0" name="nodeA">
    <component name="ComponentRegistry">
        <implementation.java class="org.apache.tuscany.sca.distributed.node.impl.DefaultComponentRegistry"/>
    </component>
</composite>
Having this means that we can expose our local component registry using any bindings that Tuscany supports. Imagine that our component registry has an interface that allows us to
getComponentNode
setComponentNode
etc.
Then we might choose to initialise the registry with the following type of information.
<runtime>
    <node name="nodeA">
        <schema name="http" baseURL="http://localhost:80" />
        <schema name="https" baseURL="https://localhost:443" />
        <component name="CalculatorServiceComponent" />
    </node>
    <node name="nodeB">
        <schema name="http" baseURL="http://localhost:81"/>
        <schema name="https" baseURL="https://localhost:444" />
        <component name="AddServiceComponent"/>
    </node>
    <node name="nodeC">
        <schema name="http" baseURL="http://localhost:81"/>
        <schema name="https" baseURL="https://localhost:444" />
        <component name="SubtractServiceComponent"/>
    </node>
</runtime>
Of course we can read this configuration locally from a file, have it delivered via a service interface or retrieve it via a reference.
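For the file based case, a minimal sketch of reading the runtime configuration shown above into a component to node lookup might look like the following. It uses only the JDK's DOM parser; the class name TopologyReader and the choice to return a map of component name to node names are assumptions for this example and do not reflect how Tuscany actually loads .topology files.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader for the <runtime>/<node>/<component> configuration shown above.
class TopologyReader {
    // Returns component name -> names of the nodes the component is assigned to.
    static Map<String, List<String>> componentToNodes(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, List<String>> result = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("node");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element node = (Element) nodes.item(i);
                NodeList components = node.getElementsByTagName("component");
                for (int j = 0; j < components.getLength(); j++) {
                    Element component = (Element) components.item(j);
                    result.computeIfAbsent(component.getAttribute("name"), c -> new ArrayList<>())
                          .add(node.getAttribute("name"));
                }
            }
            return result;
        } catch (Exception e) {
            // Wrap parser and I/O failures so callers do not need checked exceptions.
            throw new IllegalArgumentException("could not read runtime configuration", e);
        }
    }
}
```

Returning a list of nodes per component leaves room for the case where the same component is assigned to more than one node.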
To Do
SCA Binding
Currently the code uses JMS to implement the default remote SCA binding. The remote SCA binding is used when the system finds that two components that are wired together locally are deployed to separate Nodes. As an alternative it would be good to support web services here also and have this fit in with the new SCA binding mechanism that Simon Nash has been working on.
To make a web services SCA binding work we need an EndpointLookup interface so that components out there in the distributed domain can locate other components that they are wired to.
Node
Currently each node runs in isolation and starts a local SCA domain configured from .topology and .composite files. Now implement Node interfaces so that this information can be provided remotely and so that the node can expose remotely accessible management interfaces, for example.
The domain management interface that Ant has recently added may help us shape this. Also Sebastien's work to allow local domains to be modified more dynamically should help make this work.
Domain Node
Provide the link between the node and the domain node in order to deliver configuration updates, control messages and events
Distributed Domain
Provide some centralized control by implementing the distributed domain interfaces
WebApps
It would be useful to have some simple query application along the lines of what is currently in distribution/webapp
References
1 http://www.mail-archive.com/tuscany-dev%40ws.apache.org/msg16971.html
2 http://www.osoa.org/display/Main/Service+Component+Architecture+Specifications
3 http://www.mail-archive.com/tuscany-dev@ws.apache.org/msg18613.html