IDIEP-95
Author
Sponsor
Created

  

Status
DRAFT


Motivation

Ignite distributes table data across cluster nodes. Every row belongs to a specific node. Currently, clients are not aware of this distribution, which may result in additional network calls. For example, client calls server A to read a row, server A has to call server B where the row is stored.

In an optimal scenario, the client knows that the row is stored on the server B and makes a direct call there.

Description

Currently, clients can already establish connections to multiple server nodes. Handhsake response includes node id and name.

Update client implementation:

  1. Retrieve and maintain up-to-date partition assignment - an array of node ids, where Nth element is the leader node ID for partition N.
  2. Use HashCalculator to compute colocation key hash (see Row#colocationHash).
  3. Calculate partition number as rowKeyHash % partitionCount.
  4. Get node id from partition assignment (p1).
  5. If a connection to the resulting node exists, perform a direct call. Otherwise, use default connection.

Exceptions:

  • Transaction belongs to a specific node and client connection. When a non-null transaction is provided by the user, partition awareness logic is skipped.

Protocol Changes

1. Add PARTITION_ASSIGNMENT_GET operation

Request
UUIDtable ID
Response
arrayArray of node ids, where array index is the partition number

.

2. Update standard response message to include flags

Response
intType = 0
intRequest id
intFlags
intError code (0 for success)
stringError message (when error code is not 0)
...Operation-specific data


3. Include colocation flag in SCHEMAS_GET response.

Tracking Assignment Changes

There are three potential ways to keep partition assignment up-to-date on the client:

  1. Response flag. All server responses include flags field, and server sets a flag when the assignment has changed since the last response. It is up to the client to retrieve updated assignment when needed. This mechanism is used in Ignite 2.x. 
    Pros: Low overhead, no extra network traffic.
    Cons: Unlike Ignite 2.x, there is no concept of TopologyVersion in Ignite 3. So when there are multiple server connections, a notification for the same assignment update will come from all of them, and it is not possible to tell whether it was the same update or a new one. This may cause unnecessary assignment update requests. A workaround is to use only one connection to track assignment changes. Even if there is no activity on this connection, heartbeats (IEP-83) will trigger the update.

  2. Server → client notification. As soon as assignment changes, server sends a message to all clients.
    Pros: Immediate update for all clients.
    Cons: Increased network traffic and server load. Some clients may not need the update at all (not all APIs require this).

  3. PrimaryReplicaMissException (suggested in Unable to render Jira issues macro, execution error. comments).
    Pros: No protocol changes.
    Cons: Retry is required on replica miss (complicated & inefficient). Using exceptions for control flow.

The first approach (response flag) is battle tested and seems to be the most optimal. 

Discussion Links

Tickets

key summary type created updated due assignee reporter priority status resolution

JQL and issue key arguments for this macro require at least one Jira application link to be configured

  • No labels