...
Ignite distributes table data across cluster nodes, and every row belongs to a specific node. Currently, clients are not aware of this distribution, which may result in additional network calls: for example, a client calls server A to read a row, and server A has to forward the call to server B, where the row is actually stored.
In the optimal scenario, the client knows that the row is stored on server B and makes a direct call there.
Description
Risks and Assumptions
Discussion Links
Reference Links
...
Clients can already establish connections to multiple server nodes. The handshake response includes the node id and name.
Update client implementation:
- Retrieve and maintain an up-to-date partition assignment: an array of node ids, where the Nth element is the leader node id for partition N.
- Use HashCalculator to compute the colocation key hash (see Row#colocationHash).
- Calculate the partition number as rowKeyHash % partitionCount.
- Get the node id for that partition from the partition assignment array.
- If a connection to the resulting node exists, perform a direct call. Otherwise, use the default connection.
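The lookup steps above can be sketched as follows. This is an illustrative sketch, not actual Ignite API: the PartitionAwareRouter class and the String-typed channel handles are hypothetical, and Math.floorMod is used instead of a plain % so that a negative hash still maps to a valid partition.

```java
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of partition-aware routing; names are illustrative.
public class PartitionAwareRouter {
    private final UUID[] assignment;          // index = partition number, value = leader node id
    private final Map<UUID, String> channels; // open connections keyed by node id
    private final String defaultChannel;      // fallback connection

    public PartitionAwareRouter(UUID[] assignment, Map<UUID, String> channels, String defaultChannel) {
        this.assignment = assignment;
        this.channels = channels;
        this.defaultChannel = defaultChannel;
    }

    /** Maps a colocation key hash to a partition; floorMod keeps the result non-negative. */
    public int partition(int colocationHash) {
        return Math.floorMod(colocationHash, assignment.length);
    }

    /** Returns a direct connection to the partition leader if one exists, else the default one. */
    public String channelFor(int colocationHash) {
        UUID nodeId = assignment[partition(colocationHash)];
        return channels.getOrDefault(nodeId, defaultChannel);
    }
}
```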
Exceptions:
- A transaction belongs to a specific node and client connection. When the user provides a non-null transaction, partition awareness logic is skipped.
Protocol Changes
1. Add PARTITION_ASSIGNMENT_GET operation
Response:

| Type | Description |
|---|---|
| array | Array of node ids, where the array index is the partition number |
2. Update standard response message to include flags
Response:

| Type | Description |
|---|---|
| int | Type = 0 |
| int | Request id |
| int | Flags |
| int | Error code (0 for success) |
| string | Error message (when error code is not 0) |
| ... | Operation-specific data |
3. Include colocation flag in SCHEMAS_GET response.
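The response header from change 2 could be decoded roughly as shown below. This is a simplified sketch: the actual Ignite 3 client protocol uses MsgPack framing rather than raw fixed-width ints, and the flag bit constant is a hypothetical name.

```java
import java.nio.ByteBuffer;

// Illustrative decoder for the response header fields listed in the table above.
public class ResponseHeader {
    // Hypothetical flag bit; the actual bit layout is defined by the protocol.
    public static final int FLAG_PARTITION_ASSIGNMENT_CHANGED = 1;

    public final int requestId;
    public final int flags;
    public final int errorCode;

    public ResponseHeader(int requestId, int flags, int errorCode) {
        this.requestId = requestId;
        this.flags = flags;
        this.errorCode = errorCode;
    }

    /** Reads the fixed header fields in the order shown in the table. */
    public static ResponseHeader read(ByteBuffer buf) {
        int type = buf.getInt(); // Type = 0 for responses
        int requestId = buf.getInt();
        int flags = buf.getInt();
        int errorCode = buf.getInt();
        return new ResponseHeader(requestId, flags, errorCode);
    }

    /** True when the server signals that partition assignment changed. */
    public boolean assignmentChanged() {
        return (flags & FLAG_PARTITION_ASSIGNMENT_CHANGED) != 0;
    }
}
```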
Tracking Assignment Changes
There are three potential ways to keep partition assignment up-to-date on the client:
- Response flag. All server responses include flags field, and server sets a flag when the assignment has changed since the last response. It is up to the client to retrieve updated assignment when needed. This mechanism is used in Ignite 2.x.
Pros: Low overhead, no extra network traffic.
Cons: Unlike Ignite 2.x, there is no concept of a topology version in Ignite 3. So when there are multiple server connections, a notification for the same assignment update will arrive from all of them, and the client cannot tell whether it is the same update or a new one. This may cause unnecessary assignment update requests. A workaround is to use only one connection to track assignment changes. Even if there is no activity on this connection, heartbeats (IEP-83) will trigger the update.
- Server → client notification. As soon as assignment changes, server sends a message to all clients.
Pros: Immediate update for all clients.
Cons: Increased network traffic and server load. Some clients may not need the update at all (not all APIs require this).
- PrimaryReplicaMissException (suggested in the IGNITE-17394 comments).
Pros: No protocol changes.
Cons: A retry is required on replica miss (complicated and inefficient), and exceptions are used for control flow.
The first approach (response flag) is battle-tested and seems to be the optimal one.
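The chosen response-flag approach can be sketched as follows: any response carrying the flag marks the cached assignment as stale, and the client refreshes it lazily before the next partition-aware call. Class and method names here are illustrative, not actual Ignite API.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical sketch of flag-driven assignment tracking on the client.
public class AssignmentTracker {
    private final AtomicBoolean stale = new AtomicBoolean(true); // no assignment cached yet
    private volatile UUID[] assignment = new UUID[0];

    /** Called for every server response with the decoded flag value. */
    public void onResponse(boolean assignmentChangedFlag) {
        if (assignmentChangedFlag) {
            stale.set(true);
        }
    }

    /** Returns the current assignment, issuing PARTITION_ASSIGNMENT_GET only when stale. */
    public UUID[] assignment(Supplier<UUID[]> partitionAssignmentGet) {
        if (stale.compareAndSet(true, false)) {
            assignment = partitionAssignmentGet.get();
        }
        return assignment;
    }
}
```

Refreshing lazily (rather than on flag receipt) keeps the extra PARTITION_ASSIGNMENT_GET off the hot path when the client is not making partition-aware calls.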
Discussion Links
Tickets
ASF JIRA issues labeled iep-95.