ID | IEP-42 |
Author | |
Sponsor | |
Created | |
Status | DRAFT |
Currently, the thin client protocol does not support compute grid functionality.
As a first step, we can implement the ability to run compute tasks that are already deployed to the grid, addressed by task name. In this case, to run a custom task, users must manually deploy the task class to the servers. The same functionality is already implemented for binary-rest clients (GridClientCompute). This IEP describes the steps to achieve this goal; it does not cover class deployment or other compute grid functionality (the ability to run Callable, Runnable, or other jobs).
Currently, limitations in the thin client protocol prevent the execution of long-running async tasks from being implemented. The protocol allows requests in one direction only (from client to server), but to inform the client about task completion efficiently we need some kind of server-initiated message to the client.
A new ability to notify clients from the server side should be added. A notification is a one-way message from server to client. The notification format should be compatible with the existing response message format (from server to client), so that the client can distinguish notifications from responses.
Proposed notification format:
Notification | |
---|---|
long | Resource ID (task ID, continuous query ID, etc) |
short | Message flags. Bitwise OR of the following options: 0x0001 (error flag), 0x0004 (notification flag, should always be set for notifications) |
short | Notification operation code |
int | Error code (if error flag is set) |
string | Error message (if error flag is set) |
... | Notification payload (if error flag isn't set) |
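The layout above can be illustrated with a minimal decoding sketch. It assumes little-endian byte order, assumes a string is written as an int length followed by UTF-8 bytes, and assumes any outer message-length prefix has already been stripped; the class name `NotificationHeaderSketch` is hypothetical and not part of the proposal.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical client-side view of the proposed notification header. */
final class NotificationHeaderSketch {
    static final short FLAG_ERROR = 0x0001;
    static final short FLAG_NOTIFICATION = 0x0004;

    final long resourceId;
    final short flags;
    final short opCode;
    final int errorCode;       // 0 unless the error flag is set
    final String errorMessage; // null unless the error flag is set

    private NotificationHeaderSketch(long resourceId, short flags, short opCode,
        int errorCode, String errorMessage) {
        this.resourceId = resourceId;
        this.flags = flags;
        this.opCode = opCode;
        this.errorCode = errorCode;
        this.errorMessage = errorMessage;
    }

    boolean isError() {
        return (flags & FLAG_ERROR) != 0;
    }

    /** Reads the header; afterwards the buffer is positioned at the payload (if any). */
    static NotificationHeaderSketch decode(ByteBuffer buf) {
        buf.order(ByteOrder.LITTLE_ENDIAN); // Assumption: little-endian wire format.

        long resId = buf.getLong();
        short flags = buf.getShort();

        if ((flags & FLAG_NOTIFICATION) == 0)
            throw new IllegalArgumentException("Not a notification: flags=" + flags);

        short op = buf.getShort();
        int errCode = 0;
        String errMsg = null;

        if ((flags & FLAG_ERROR) != 0) {
            errCode = buf.getInt();

            // Assumption: string = int length + UTF-8 bytes.
            byte[] bytes = new byte[buf.getInt()];
            buf.get(bytes);
            errMsg = new String(bytes, StandardCharsets.UTF_8);
        }

        return new NotificationHeaderSketch(resId, flags, op, errCode, errMsg);
    }
}
```

A client would check the notification flag first and only then interpret the remaining fields, since the same leading fields also appear in ordinary responses.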
A new request type should be added to start a new task:
Name | Code |
---|---|
OP_COMPUTE_TASK_EXECUTE | 6000 |
To notify the client about task completion, the following notification code should be used:
Name | Code |
---|---|
OP_COMPUTE_TASK_FINISHED | 6001 |
To cancel the task, an existing request type should be used:
Name | Code |
---|---|
OP_RESOURCE_CLOSE | 0 |
Request | |
---|---|
int | Number of nodes N selected to execute the task. If this value is 0, no node IDs are listed in the message and all server nodes in the cluster are selected for task execution. |
UUID | Node ID #1 |
... | |
UUID | Node ID #N |
byte | Task flags. Bitwise OR of: 0x01 (no-failover flag), 0x02 (no-result-caching flag) |
long | Task timeout |
string | Task name |
object | Task argument |
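As an illustration of the request layout, the following sketch writes the fields in the listed order. It assumes little-endian byte order, a UUID written as two longs (most significant bits first), a string written as an int length plus UTF-8 bytes, and a task argument that has already been serialized by the binary marshaller; the class and parameter names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.UUID;

/** Hypothetical encoder for the body of an OP_COMPUTE_TASK_EXECUTE request. */
final class ComputeTaskRequestSketch {
    static final byte FLAG_NO_FAILOVER = 0x01;
    static final byte FLAG_NO_RESULT_CACHE = 0x02;

    /**
     * @param nodeIds Explicitly selected nodes; an empty list means "all server nodes".
     * @param serializedArg Task argument, assumed to be already serialized.
     */
    static ByteBuffer encode(List<UUID> nodeIds, byte flags, long timeout,
        String taskName, byte[] serializedArg) {
        byte[] name = taskName.getBytes(StandardCharsets.UTF_8);

        int size = 4 + 16 * nodeIds.size() + 1 + 8 + 4 + name.length + serializedArg.length;

        // Assumption: little-endian wire format.
        ByteBuffer buf = ByteBuffer.allocate(size).order(ByteOrder.LITTLE_ENDIAN);

        buf.putInt(nodeIds.size());

        for (UUID id : nodeIds) {
            // Assumption: UUID = most significant long, then least significant long.
            buf.putLong(id.getMostSignificantBits());
            buf.putLong(id.getLeastSignificantBits());
        }

        buf.put(flags);
        buf.putLong(timeout);

        // Assumption: string = int length + UTF-8 bytes.
        buf.putInt(name.length);
        buf.put(name);

        buf.put(serializedArg);

        buf.flip();

        return buf;
    }
}
```

Passing an empty node list produces a node count of 0, which per the table above selects all server nodes without listing any IDs.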
Response | |
---|---|
long | Unique started task ID (resource ID). |
Notification for successfully executed task | |
---|---|
long | Task ID (resource ID). |
short | Notification flag (0x0004) |
short | OP_COMPUTE_TASK_FINISHED (6001) |
object | Task result |
Notification for the failed task | |
---|---|
long | Task ID (resource ID). |
short | Notification flag combined with error flag (0x0004 OR 0x0001 = 0x0005) |
short | OP_COMPUTE_TASK_FINISHED (6001) |
int | Error code |
string | Error message |
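On the client side, the two notification shapes above could be routed to waiting callers roughly as follows. This is a minimal sketch, assuming the client keeps a map from task ID to a pending future; `TaskCompletionRegistrySketch` and its method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry that completes a future when OP_COMPUTE_TASK_FINISHED arrives. */
final class TaskCompletionRegistrySketch {
    static final short FLAG_ERROR = 0x0001;

    private final Map<Long, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    /** Called after the server has returned the task ID in the response. */
    CompletableFuture<Object> register(long taskId) {
        CompletableFuture<Object> fut = new CompletableFuture<>();

        pending.put(taskId, fut);

        return fut;
    }

    /** Called by the message reader for each task-finished notification. */
    void onTaskFinished(long taskId, short flags, Object result, int errorCode, String errorMessage) {
        CompletableFuture<Object> fut = pending.remove(taskId);

        if (fut == null)
            return; // Unknown task ID or already handled; ignored in this sketch.

        if ((flags & FLAG_ERROR) != 0)
            fut.completeExceptionally(new RuntimeException("[" + errorCode + "] " + errorMessage));
        else
            fut.complete(result);
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Removing the entry on completion keeps the map bounded by the number of tasks still in flight.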
Proposed task execution workflow:
Currently, protocol versions are used to support backward compatibility. This is sometimes inconvenient: to support a given feature, a client must increase its protocol version and therefore support every feature introduced in all previous versions of the protocol. This problem can be solved by using feature masks. Clients and servers should inform each other during the handshake about the features they support.
Proposed changes to handshake request and response:
Request | | |
---|---|---|
byte | Handshake code, always 1 | |
short | Version major | |
short | Version minor | |
short | Version patch | |
byte | Client code, always 2 | |
int | Client features mask array length | Since version 2.0.0 (new field) |
byte[] | Client features mask array | Since version 2.0.0 (new field) |
Map | User attributes | Since version 1.7.0 |
String | Username (optional) | Since version 1.1.0 |
String | Password (optional) | Since version 1.1.0 |
Response (success) | | |
---|---|---|
byte | Success flag, 1 | |
int | Server features mask array length | Since version 2.0.0 (new field) |
byte[] | Server features mask array | Since version 2.0.0 (new field) |
UUID | Node id | Since version 1.4.0 |
The features mask is a byte array in which a bit is set if the feature with the corresponding id is supported.
For compute task execution, the following feature should be introduced:
Feature | id |
---|---|
EXECUTE_TASK_BY_NAME | 0 |
So, bit 0 of the features mask should be set if the client supports this feature.
All task-related operations should be started through a new ClientCompute interface. To get an instance of this interface, new methods should be added to the IgniteClient interface:
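A feature mask of this kind maps naturally onto `java.util.BitSet`. The sketch below assumes the bit ordering used by `BitSet.toByteArray()` (bit 0 is the lowest bit of the first byte); the class `FeatureMaskSketch` and its methods are hypothetical helpers, not part of the proposal.

```java
import java.util.BitSet;

/** Hypothetical helper for building and checking feature masks. */
final class FeatureMaskSketch {
    static final int EXECUTE_TASK_BY_NAME = 0;

    /** Builds a mask with one bit set per supported feature id. */
    static byte[] toMask(int... featureIds) {
        BitSet bits = new BitSet();

        for (int id : featureIds)
            bits.set(id);

        return bits.toByteArray();
    }

    /** Checks whether the peer's mask has the bit for the given feature id. */
    static boolean supports(byte[] mask, int featureId) {
        return BitSet.valueOf(mask).get(featureId);
    }

    /** Features usable on a connection: those supported by both sides. */
    static byte[] common(byte[] clientMask, byte[] serverMask) {
        BitSet bits = BitSet.valueOf(clientMask);

        bits.and(BitSet.valueOf(serverMask));

        return bits.toByteArray();
    }
}
```

With this representation a peer that sends a shorter mask simply reports the higher-numbered features as unsupported, so masks of different lengths can still be compared.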
    public ClientCompute compute(ClientClusterGroup grp);
    public ClientCluster cluster();
Proposed ClientCompute interface:
    public interface ClientCompute {
        public ClientClusterGroup clusterGroup();

        // Sync and async task execution methods.
        public <T, R> R execute(String taskName, @Nullable T arg) throws ClientException, InterruptedException;
        public <T, R> ClientFuture<R> executeAsync(String taskName, @Nullable T arg) throws ClientException;

        // ClientCompute modifiers.
        public ClientCompute withTimeout(long timeout);
        public ClientCompute withNoFailover();
        public ClientCompute withNoResultCache();
    }
Async execution methods should return a future. Using this future, users can get the task result or cancel the task. It is proposed to create a new future interface for the Java thin client, because the get() method should be declared as throwing ClientException, whereas the existing IgniteFuture throws IgniteException. The same approach is used for the binary-rest client. Proposed new ClientFuture interface:
    public interface ClientFuture<R> {
        public R get() throws ClientException, CancellationException, InterruptedException;
        public R get(long timeout, TimeUnit unit) throws ClientException, TimeoutException, InterruptedException;
        public boolean isDone();
        public boolean cancel() throws ClientException;
        public boolean isCancelled();
    }
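One possible way to back such a future is to wrap a `CompletableFuture` and translate its failures. The sketch below mirrors the proposed methods without depending on Ignite classes: `SketchClientException` is a stand-in for ClientException (assumed unchecked here), and the `cancelRequest` hook, which would send OP_RESOURCE_CLOSE for the task, is hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Stand-in for the thin client exception type (assumption: unchecked). */
final class SketchClientException extends RuntimeException {
    SketchClientException(String msg, Throwable cause) {
        super(msg, cause);
    }
}

/** Hypothetical adapter exposing the proposed ClientFuture methods. */
final class ClientFutureSketch<R> {
    private final CompletableFuture<R> delegate;
    private final Runnable cancelRequest; // Hypothetical: asks the server to cancel the task.

    ClientFutureSketch(CompletableFuture<R> delegate, Runnable cancelRequest) {
        this.delegate = delegate;
        this.cancelRequest = cancelRequest;
    }

    public R get() throws InterruptedException {
        try {
            return delegate.get(); // Throws CancellationException if cancelled.
        }
        catch (ExecutionException e) {
            throw new SketchClientException(e.getCause().getMessage(), e.getCause());
        }
    }

    public R get(long timeout, TimeUnit unit) throws TimeoutException, InterruptedException {
        try {
            return delegate.get(timeout, unit);
        }
        catch (ExecutionException e) {
            throw new SketchClientException(e.getCause().getMessage(), e.getCause());
        }
    }

    public boolean isDone() {
        return delegate.isDone();
    }

    public boolean cancel() {
        if (!delegate.cancel(false))
            return false; // Already completed; nothing to cancel.

        cancelRequest.run();

        return true;
    }

    public boolean isCancelled() {
        return delegate.isCancelled();
    }
}
```

The design choice shown here is that task failures surface as the client exception type, while cancellation, timeout, and interruption keep their standard `java.util.concurrent` meaning.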
If blocking I/O is used (as in the Java thin client, for example), async operations cannot be implemented without a dedicated thread per channel to process incoming messages. So the number of client-side threads will increase, and it can grow dramatically if partition awareness is used with many server connections.
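The per-channel thread mentioned above would typically run a blocking read loop like the one sketched here. It assumes each incoming message is prefixed with a little-endian int length; `ChannelReaderSketch` and the handler callback are hypothetical names.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

/** Hypothetical blocking reader; one instance (and one thread) per channel. */
final class ChannelReaderSketch implements Runnable {
    private final InputStream in;
    private final Consumer<byte[]> handler;

    ChannelReaderSketch(InputStream in, Consumer<byte[]> handler) {
        this.in = in;
        this.handler = handler;
    }

    @Override public void run() {
        DataInputStream data = new DataInputStream(in);

        try {
            while (true) {
                // DataInputStream reads big-endian, so swap bytes for the
                // assumed little-endian length prefix.
                int len = Integer.reverseBytes(data.readInt());

                byte[] msg = new byte[len];

                data.readFully(msg); // Blocks until the whole message has arrived.

                handler.accept(msg); // Response or notification; dispatched by the caller.
            }
        }
        catch (EOFException e) {
            // Stream closed: the reader exits.
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Each connection would need its own `new Thread(new ChannelReaderSketch(...)).start()`, which is exactly why the thread count scales with the number of server connections.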
http://apache-ignite-developers.2346864.n4.nabble.com/Thin-client-compute-support-td44405.html