ID | IEP-17 |
Author | Denis Mekhanikov |
Sponsor | |
Created | |
Status | DRAFT |
The current service deployment procedure depends on an internal replicated cache. Each service deployment is a distributed transaction on this cache. This procedure has proved to be deadlock-prone on an unstable topology.
Also, the current implementation does not pass service deployment results back to the deploying node. IgniteServices#deploy* methods return even before Service#init() starts executing, so there is no way to know whether a service was deployed successfully. Even if Service#init() throws an exception, the deploying side still considers the service successfully deployed.
It is also impossible to have data-free server nodes that are only responsible for running services and compute tasks, because the system cache is always present on all server nodes.
Currently, when a service implementation or configuration changes, existing instances cannot be redeployed without manual undeployment. GridServiceProcessor has access only to the serialized representation of services, so it cannot tell whether anything has changed since the previous deployment.
This section contains a description of the proposed service deployment protocol.
To make the service deployment process more reliable on an unstable topology and to avoid the stuck deployments that are possible in the current architecture, service deployment should be based on the distribution of custom discovery messages.
Deployment starts by sending a custom discovery message that notifies all nodes in the cluster about the ongoing deployment. This message contains the serialized service instance and its configuration. It is delivered first to the coordinator node, which calculates the service deployment assignments and adds them to the message. During the subsequent round trip of this message, every node saves the service deployment assignments to local storage, and the nodes chosen to deploy the service do so asynchronously in a dedicated thread pool.
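The assignment step performed by the coordinator can be illustrated with a minimal sketch. The proposal does not specify the assignment algorithm, so the round-robin strategy below is an assumption, and the class name ServiceAssignmentSketch and the method assign are hypothetical. The totalCount and maxPerNode parameters are modeled on the limits of the same name in the service configuration, simplified here to strictly positive values.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of assignment calculation; not the actual Ignite implementation. */
final class ServiceAssignmentSketch {
    private ServiceAssignmentSketch() {
    }

    /**
     * Distributes up to totalCount service instances over the given nodes in
     * round-robin order, never placing more than maxPerNode instances on one node.
     * Assumes node IDs are unique.
     *
     * @return node ID to number of instances assigned to that node.
     */
    static Map<String, Integer> assign(List<String> nodeIds, int totalCount, int maxPerNode) {
        Map<String, Integer> assignments = new LinkedHashMap<>();

        for (String nodeId : nodeIds)
            assignments.put(nodeId, 0);

        if (nodeIds.isEmpty() || maxPerNode <= 0)
            return assignments;

        // The cluster capacity bounds the number of instances that can be placed.
        long capacity = (long) nodeIds.size() * maxPerNode;
        int toPlace = (int) Math.min(totalCount, capacity);

        // Round-robin keeps per-node counts within one of each other,
        // so the per-node limit holds whenever toPlace <= capacity.
        for (int i = 0; i < toPlace; i++)
            assignments.merge(nodeIds.get(i % nodeIds.size()), 1, Integer::sum);

        return assignments;
    }

    public static void main(String[] args) {
        // Four instances over three nodes, at most two per node.
        System.out.println(assign(List.of("node-A", "node-B", "node-C"), 4, 2));
    }
}
```

In the protocol described above, the coordinator would attach a map like this to the discovery message, and each node would look up its own ID to decide how many instances to start.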
Once a node finishes the deployment procedure and the execution of Service#init(), it connects to the coordinator using the communication SPI and sends it the deployment result: either an acknowledgement of successful deployment or a serialized exception.
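A deployment result that carries either an acknowledgement or a serialized exception might look like the sketch below. The class DeploymentResultSketch and its fields are hypothetical, and plain Java serialization stands in for whatever marshaller the real implementation would use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical result message sent from a deploying node to the coordinator. */
final class DeploymentResultSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final String nodeId;
    final String serviceName;

    /** Serialized exception thrown during deployment, or null on success. */
    private final byte[] errorBytes;

    private DeploymentResultSketch(String nodeId, String serviceName, byte[] errorBytes) {
        this.nodeId = nodeId;
        this.serviceName = serviceName;
        this.errorBytes = errorBytes;
    }

    /** Acknowledgement of a successful deployment. */
    static DeploymentResultSketch success(String nodeId, String serviceName) {
        return new DeploymentResultSketch(nodeId, serviceName, null);
    }

    /** Failure result carrying the exception in serialized form. */
    static DeploymentResultSketch failure(String nodeId, String serviceName, Throwable err) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();

        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(err);
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        return new DeploymentResultSketch(nodeId, serviceName, bos.toByteArray());
    }

    boolean isSuccess() {
        return errorBytes == null;
    }

    /** Deserializes the deployment error, or returns null if the deployment succeeded. */
    Throwable error() {
        if (errorBytes == null)
            return null;

        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(errorBytes))) {
            return (Throwable) ois.readObject();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        catch (ClassNotFoundException e) {
            throw new IllegalStateException("Exception class is not available on this node", e);
        }
    }
}
```

Keeping the exception as bytes means the coordinator can forward it without having the exception class on its own classpath; only the node that finally reports the error needs to deserialize it.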
Once all deployment results are collected, the coordinator sends another discovery message notifying all nodes about the successful deployment. At this moment the deployment futures are completed and control returns from the IgniteServices#deploy* methods. Service#execute() also starts running when the successful deployment message arrives.
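The collection of results can be sketched as a small tracker that completes a future once every assigned node has reported. The class DeploymentTrackerSketch is hypothetical. For brevity, the sketch completes the future directly, whereas in the proposed protocol the completion would be triggered by the second discovery message. Failing the future when any node reports an error is an assumption, since error handling is not yet specified here.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

/** Hypothetical coordinator-side tracker of per-node deployment results. */
final class DeploymentTrackerSketch {
    /** Nodes that were assigned the service and have not reported yet. */
    private final Set<String> pending;

    /** Errors reported so far, keyed by node ID. */
    private final Map<String, Throwable> errors = new LinkedHashMap<>();

    private final CompletableFuture<Void> fut = new CompletableFuture<>();

    DeploymentTrackerSketch(Set<String> assignedNodeIds) {
        pending = new HashSet<>(assignedNodeIds);

        // Nothing to wait for if no node was assigned.
        if (pending.isEmpty())
            fut.complete(null);
    }

    /**
     * Records the result from one node.
     *
     * @param err Deployment error, or null if the node deployed successfully.
     */
    synchronized void onResult(String nodeId, Throwable err) {
        // Ignore duplicates and results from nodes that were not assigned.
        if (!pending.remove(nodeId))
            return;

        if (err != null)
            errors.put(nodeId, err);

        if (!pending.isEmpty())
            return;

        if (errors.isEmpty())
            fut.complete(null);
        else {
            // Assumption: any single failure fails the whole deployment.
            IllegalStateException ex =
                new IllegalStateException("Service deployment failed on nodes: " + errors.keySet());

            errors.values().forEach(ex::addSuppressed);

            fut.completeExceptionally(ex);
        }
    }

    /** Future that completes when all assigned nodes have reported. */
    CompletableFuture<Void> future() {
        return fut;
    }
}
```

A caller blocked on future().join() would return only after the last node reports, which mirrors the intent that deploy methods return once the outcome of the deployment is known.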
There are three types of errors that should be handled correctly.
TBD
These changes will break compatibility with previous versions of Apache Ignite completely.
Service grid redesign: http://apache-ignite-developers.2346864.n4.nabble.com/Service-grid-redesign-td28521.html
Service versioning: http://apache-ignite-developers.2346864.n4.nabble.com/Service-versioning-td20858.html