...

ID | IEP-42
Author |
Sponsor |
Created |
Status | COMPLETED


Motivation

Currently, the thin client protocol does not support compute grid functionality.

Description

As a first step, we can implement compute grid functionality with the ability to run compute tasks that are already deployed to the grid, addressed by task name. In this case, to run a custom task, users must manually deploy the class containing the task to the server. The same functionality is already implemented for binary-rest clients (GridClientCompute). This IEP describes the steps to achieve this goal and does not cover class deployment or other compute grid functionality (such as the ability to run Callable, Runnable, or other jobs).

Currently, limitations in the thin client protocol prevent the execution of long-running async tasks from being implemented. The protocol allows only one-way requests (from client to server), but to inform the client about task completion efficiently we need some kind of server-initiated message to the client.

Protocol changes

Notifications

A new ability to notify clients from the server side should be added. A notification is a one-way message from server to client. The notification format should be compatible with the existing response message format (from server to client), so that the client can distinguish notifications from responses.

...

Notification
long | Resource ID (task ID, continuous query ID, etc.)
short | Message flags. Bitwise OR of the following options: 0x0001 Error flag; 0x0004 Notification flag (should always be set for notifications)
short | Notification operation code
int | Error code (if error flag is set)
string | Error message (if error flag is set)
... | Notification payload (if error flag isn't set)
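
To illustrate how a client could tell a notification from a regular response and detect errors, here is a minimal sketch that reads the fixed-size fields of the layout above. The class name `NotificationHeader` is hypothetical, the little-endian byte order is an assumption, and any outer message framing is left out. Decoding of the error message string and the payload is also omitted, since it depends on the binary object format.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the Ignite API.
class NotificationHeader {
    static final short FLAG_ERROR = 0x0001;
    static final short FLAG_NOTIFICATION = 0x0004;

    final long resourceId;
    final short flags;
    final short opCode;
    final int errorCode; // meaningful only when isError() is true

    NotificationHeader(long resourceId, short flags, short opCode, int errorCode) {
        this.resourceId = resourceId;
        this.flags = flags;
        this.opCode = opCode;
        this.errorCode = errorCode;
    }

    boolean isNotification() { return (flags & FLAG_NOTIFICATION) != 0; }

    boolean isError() { return (flags & FLAG_ERROR) != 0; }

    // Reads the fixed-size fields and leaves the buffer positioned at the
    // error message (error flag set) or at the payload (error flag not set).
    static NotificationHeader read(ByteBuffer buf) {
        buf.order(ByteOrder.LITTLE_ENDIAN); // assumption: little-endian wire format
        long resourceId = buf.getLong();
        short flags = buf.getShort();
        short opCode = buf.getShort();
        int errorCode = (flags & FLAG_ERROR) != 0 ? buf.getInt() : 0;
        return new NotificationHeader(resourceId, flags, opCode, errorCode);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(12).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(42L).putShort(FLAG_NOTIFICATION).putShort((short) 6001);
        buf.flip();
        NotificationHeader h = read(buf);
        System.out.println("resource=" + h.resourceId + " op=" + h.opCode + " error=" + h.isError());
    }
}
```

Checking the notification flag before interpreting the first field matters because the same position holds a request ID in an ordinary response and a resource ID in a notification.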

Operation codes

A new request type should be added to start a new task:

...

Name | Code
OP_RESOURCE_CLOSE | 0

OP_COMPUTE_TASK_EXECUTE message format

Request
int | Number of nodes N selected to run the compute task. If this value is 0, no server nodes should be explicitly listed in the message, and all server nodes in the cluster should be selected for task execution.
UUID | Node ID #1
... | ...
UUID | Node ID #N
byte | Task flags. Combination of: 0x01 No-failover flag; 0x02 No result caching flag; 0x04 Keep binary flag
long | Task timeout
string | Task name
object | Task argument
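
As a rough illustration of the request layout, the sketch below encodes the fields that precede the task name and argument: node count, node IDs, task flags, and timeout. The class and method names are hypothetical. Writing each UUID as 16 raw bytes (most significant bits first) and using little-endian byte order are assumptions, not something this page specifies. The task name and argument are omitted because their encoding depends on the binary object format.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.UUID;

// Hypothetical helper, not part of the Ignite API.
class ComputeTaskRequestSketch {
    static final byte FLAG_NO_FAILOVER = 0x01;
    static final byte FLAG_NO_RESULT_CACHE = 0x02;
    static final byte FLAG_KEEP_BINARY = 0x04;

    // Encodes: int node count, N node IDs, byte task flags, long task timeout.
    // An empty node list produces a count of 0, which selects all server nodes.
    static byte[] encodePrefix(List<UUID> nodeIds, byte flags, long timeout) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 16 * nodeIds.size() + 1 + 8)
            .order(ByteOrder.LITTLE_ENDIAN); // assumption: little-endian wire format

        buf.putInt(nodeIds.size());
        for (UUID id : nodeIds) {
            // Assumption: UUID is written as two raw longs.
            buf.putLong(id.getMostSignificantBits());
            buf.putLong(id.getLeastSignificantBits());
        }
        buf.put(flags);
        buf.putLong(timeout);
        return buf.array();
    }

    public static void main(String[] args) {
        byte flags = (byte) (FLAG_NO_FAILOVER | FLAG_KEEP_BINARY);
        byte[] allNodes = encodePrefix(Collections.<UUID>emptyList(), flags, 5000L);
        byte[] oneNode = encodePrefix(Arrays.asList(UUID.randomUUID()), flags, 5000L);
        System.out.println(allNodes.length + " bytes, " + oneNode.length + " bytes");
    }
}
```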

...

Response
long | Unique started task ID (resource ID).

OP_COMPUTE_TASK_FINISHED message format

Notification for successfully executed task
long | Task ID (resource ID).
short | Notification flag (0x0004)
short | OP_COMPUTE_TASK_FINISHED (6001)
object | Task result

...

Notification for the failed task
long | Task ID (resource ID).
short | Notification flag | Error flag (0x0005)
short | OP_COMPUTE_TASK_FINISHED (6001)
int | Error code
string | Error message

Overall workflow

Proposed task execution workflow:

  1. The client sends an OP_COMPUTE_TASK_EXECUTE request and receives the task ID in the response if the task was successfully registered.
  2. The client can cancel the task by sending an OP_RESOURCE_CLOSE request with the task ID as the resource ID.
  3. The server notifies the client with an OP_COMPUTE_TASK_FINISHED message when the task is completed (successfully, unsuccessfully, or because it was canceled by the client). A notification should eventually be sent for each task that was successfully registered by an OP_COMPUTE_TASK_EXECUTE request.
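
The workflow above can be sketched on the client side as a registry of pending tasks keyed by task ID. This is a minimal sketch under stated assumptions: the class `PendingTasks` and its methods are hypothetical, network I/O is left out, and messages from one connection are assumed to be processed in order, so a task is registered before its notification arrives.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical client-side bookkeeping, not part of the Ignite API.
class PendingTasks {
    private final Map<Long, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    // Step 1: called with the task ID taken from the OP_COMPUTE_TASK_EXECUTE response.
    CompletableFuture<Object> register(long taskId) {
        CompletableFuture<Object> fut = new CompletableFuture<>();
        pending.put(taskId, fut);
        return fut;
    }

    // Step 3, success: OP_COMPUTE_TASK_FINISHED without the error flag.
    void onFinished(long taskId, Object result) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        if (fut != null)
            fut.complete(result);
    }

    // Step 3, failure or cancellation: OP_COMPUTE_TASK_FINISHED with the error flag.
    void onFailed(long taskId, int errorCode, String errorMessage) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        if (fut != null)
            fut.completeExceptionally(
                new RuntimeException("Task failed [code=" + errorCode + "]: " + errorMessage));
    }

    int pendingCount() { return pending.size(); }

    public static void main(String[] args) {
        PendingTasks tasks = new PendingTasks();
        CompletableFuture<Object> fut = tasks.register(1L);
        tasks.onFinished(1L, "result");
        System.out.println(fut.join());
    }
}
```

Step 2 (cancellation) would send OP_RESOURCE_CLOSE with the task ID. In this sketch the future is still completed only when the final notification arrives, since the server is expected to send one for every registered task.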

Feature masks

Currently, protocol versions are used to support backward compatibility. This is sometimes inconvenient: to support a single feature, a client must increase its protocol version and therefore support every feature introduced in all previous versions of the protocol. This problem can be solved by using feature masks in the new protocol version. On handshake, clients and servers should inform each other about the features they support.

...

So, bit 0 should be set in the feature mask if the client supports this feature.
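
A feature mask of this kind can be sketched with `java.util.BitSet`. The class `FeatureMaskSketch`, the constant name, and the byte-array representation are assumptions made for illustration. How the mask is actually serialized in the handshake is not shown here.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical helper, not part of the Ignite API.
class FeatureMaskSketch {
    // Bit index of the feature discussed above (bit 0).
    static final int FEATURE_BIT = 0;

    // Builds a mask with the given bit indexes set.
    static byte[] toMask(int... supportedBits) {
        BitSet bits = new BitSet();
        for (int bit : supportedBits)
            bits.set(bit);
        return bits.toByteArray();
    }

    // Checks whether a peer's mask has the given feature bit set.
    static boolean supports(byte[] mask, int bit) {
        return BitSet.valueOf(mask).get(bit);
    }

    // Features usable on a connection: those supported by both sides.
    static byte[] common(byte[] clientMask, byte[] serverMask) {
        BitSet result = BitSet.valueOf(clientMask);
        result.and(BitSet.valueOf(serverMask));
        return result.toByteArray();
    }

    public static void main(String[] args) {
        byte[] client = toMask(FEATURE_BIT, 9);
        byte[] server = toMask(FEATURE_BIT);
        System.out.println(Arrays.toString(common(client, server)));
    }
}
```

With a mask, a client that implements only one feature sets only that bit, instead of having to implement everything implied by a higher protocol version.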

Client-side API (java thin client)

All task-related operations should be started using a new ClientCompute interface. To get an instance of this interface, new methods should be added to the IgniteClient interface:

...

Async execution methods should return a Future. Using this Future, users can get the task result or cancel the task.

Risks and Assumptions

If blocking I/O is used (as in the Java thin client, for example), async operations cannot be implemented without a dedicated thread per channel to process incoming messages. As a result, the number of threads on the client side will increase, and can grow dramatically if partition awareness is used with many server connections.

Discussion Links

http://apache-ignite-developers.2346864.n4.nabble.com/Thin-client-compute-support-td44405.html

Tickets

ASF JIRA issues matching the query: project = Ignite AND labels IN (iep-42) ORDER BY status