ID | IEP-95 |
Author | |
Sponsor | |
Created | |
Status | |
Motivation
Ignite distributes table data across cluster nodes, and every row belongs to a specific node. Currently, clients are not aware of this distribution, which may result in additional network calls. For example, when a client calls server A to read a row that is stored on server B, server A has to call server B on the client's behalf.
In the optimal scenario, the client knows that the row is stored on server B and makes a direct call there.
Description
Clients can already establish connections to multiple server nodes. The handshake response includes the node id and name.
Update client implementation:
- Retrieve and maintain up-to-date partition assignment - an array of node ids, where Nth element is the leader node ID for partition N.
- Use HashCalculator to compute colocation key hash (see Row#colocationHash).
- Calculate partition number as rowKeyHash % partitionCount.
- Get the node id from the partition assignment (first item above), using the partition number as the array index.
- If a connection to the resulting node exists, perform a direct call. Otherwise, use the default connection.
Exceptions:
- A transaction belongs to a specific node and client connection. When the user provides a non-null transaction, partition awareness logic is skipped.
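The routing steps and the transaction exception above can be sketched roughly as follows. This is a minimal illustration, not the actual client implementation: `PartitionAwareRouter`, its fields, and the use of `String` as a stand-in for a connection are all hypothetical. The proposal specifies `rowKeyHash % partitionCount`; the sketch uses `Math.floorMod` on the assumption that a negative hash must still map to a valid array index, and the real client has to use whatever normalization the server uses.

```java
import java.util.List;
import java.util.Map;

/** Sketch of client-side partition-aware routing. All names are hypothetical. */
final class PartitionAwareRouter {
    /** Index is the partition number, value is the leader node id for that partition. */
    private final List<String> assignment;

    /** Open connections keyed by node id. String stands in for a real channel type. */
    private final Map<String, String> connectionsByNodeId;

    /** Connection used when no direct route is available. */
    private final String defaultConnection;

    PartitionAwareRouter(List<String> assignment,
                         Map<String, String> connectionsByNodeId,
                         String defaultConnection) {
        this.assignment = assignment;
        this.connectionsByNodeId = connectionsByNodeId;
        this.defaultConnection = defaultConnection;
    }

    /**
     * Maps a colocation hash to a partition number.
     * Assumption: floorMod is used so that negative hashes stay in [0, partitionCount).
     */
    static int partition(int colocationHash, int partitionCount) {
        return Math.floorMod(colocationHash, partitionCount);
    }

    /**
     * Picks the connection for a request.
     *
     * @param colocationHash hash of the colocation key (computed elsewhere)
     * @param txConnection connection that owns the user's transaction, or null if there is none
     */
    String route(int colocationHash, String txConnection) {
        // A transaction is bound to its own connection, so partition awareness is skipped.
        if (txConnection != null) {
            return txConnection;
        }

        // No assignment loaded yet: fall back to the default connection.
        if (assignment.isEmpty()) {
            return defaultConnection;
        }

        String nodeId = assignment.get(partition(colocationHash, assignment.size()));

        // Direct call if we are connected to the leader, default connection otherwise.
        return connectionsByNodeId.getOrDefault(nodeId, defaultConnection);
    }
}
```

Falling back to the default connection keeps the request correct in every case; partition awareness only removes the extra hop when a direct connection happens to exist.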
Protocol Changes
1. Add PARTITION_ASSIGNMENT_GET operation
Response |
array | Array of node ids, where array index is the partition number |
2. Update standard response message to include flags
Response |
int | Type = 0 |
int | Request id |
int | Flags |
int | Error code (0 for success) |
string | Error message (when error code is not 0) |
... | Operation-specific data |
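As an illustration of how a client might model the updated response header, here is a small sketch. It only mirrors the fields listed in the table above and does not show wire decoding. The class name `ResponseHeader` and the constant `FLAG_PARTITION_ASSIGNMENT_CHANGED` with its bit value are assumptions; the proposal does not define concrete flag values.

```java
/** Hypothetical in-memory view of the standard response header. */
final class ResponseHeader {
    /** Assumed bit value; the proposal does not specify flag values. */
    static final int FLAG_PARTITION_ASSIGNMENT_CHANGED = 1;

    final int type;            // always 0 for a standard response
    final int requestId;
    final int flags;
    final int errorCode;       // 0 means success
    final String errorMessage; // present only when errorCode != 0

    ResponseHeader(int type, int requestId, int flags, int errorCode, String errorMessage) {
        this.type = type;
        this.requestId = requestId;
        this.flags = flags;
        this.errorCode = errorCode;
        this.errorMessage = errorMessage;
    }

    boolean isSuccess() {
        return errorCode == 0;
    }

    /** True when the server signals that the partition assignment has changed. */
    boolean assignmentChanged() {
        return (flags & FLAG_PARTITION_ASSIGNMENT_CHANGED) != 0;
    }
}
```

Using a bit mask leaves the remaining bits of the flags field free for other signals.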
3. Include colocation flag in SCHEMAS_GET response.
Tracking Assignment Changes
There are three potential ways to keep partition assignment up-to-date on the client:
- Response flag. All server responses include a flags field, and the server sets a flag when the assignment has changed since the last response. It is up to the client to retrieve the updated assignment when needed. This mechanism is used in Ignite 2.x.
Pros: Low overhead, no extra network traffic.
Cons: Unlike Ignite 2.x, there is no concept of TopologyVersion in Ignite 3. So when there are multiple server connections, a notification for the same assignment update will come from all of them, and it is not possible to tell whether it was the same update or a new one. This may cause unnecessary assignment update requests. A workaround is to use only one connection to track assignment changes. Even if there is no activity on this connection, heartbeats (IEP-83) will trigger the update.
- Server → client notification. As soon as the assignment changes, the server sends a message to all clients.
Pros: Immediate update for all clients.
Cons: Increased network traffic and server load. Some clients may not need the update at all (not all APIs require this).
- PrimaryReplicaMissException (suggested in IGNITE-17394 comments).
Pros: No protocol changes.
Cons: Retry is required on replica miss (complicated & inefficient). Using exceptions for control flow.
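The first approach (response flag) is battle-tested in Ignite 2.x and appears to be the best fit: it has the lowest overhead, and its duplicate-notification drawback can be avoided by tracking changes on a single connection.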
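A minimal sketch of the response-flag approach on the client side, assuming the flag only marks the cached assignment as stale and the actual reload happens lazily on the next use. `AssignmentTracker` and its methods are hypothetical names, and the `Supplier` stands in for a PARTITION_ASSIGNMENT_GET request.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Sketch of lazy partition assignment refresh driven by a response flag. Names are hypothetical. */
final class AssignmentTracker {
    /** Stands in for sending PARTITION_ASSIGNMENT_GET and decoding the response. */
    private final Supplier<List<String>> loader;

    /** Starts as true so that the first use triggers the initial load. */
    private final AtomicBoolean stale = new AtomicBoolean(true);

    private volatile List<String> assignment = List.of();

    AssignmentTracker(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    /** Called for every response received on the single connection used for tracking. */
    void onResponseFlags(boolean assignmentChangedFlag) {
        if (assignmentChangedFlag) {
            stale.set(true);
        }
    }

    /** Returns the cached assignment, reloading it first if it was marked stale. */
    List<String> current() {
        if (stale.compareAndSet(true, false)) {
            try {
                assignment = List.copyOf(loader.get());
            } catch (RuntimeException e) {
                // Keep the old assignment and retry on the next call.
                stale.set(true);
                throw e;
            }
        }
        // Concurrent callers may briefly see the previous assignment while a reload is in flight.
        return assignment;
    }
}
```

Several flagged responses between two uses collapse into one reload, and clients that never call partition-aware APIs never issue the request at all.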
Discussion Links
Tickets
ASF JIRA issues matching the query labels=iep-95