ID: IEP-44
Author:
Sponsor:
Created:
Status: COMPLETED


Motivation

Thin clients connect to one or more known servers. The set of endpoints is defined in ClientConfiguration and cannot change after the client starts. However, Ignite clusters can be dynamic: nodes start and stop, and IP addresses change. This is especially true in cloud environments, Kubernetes, and so on. In particular, Best Effort Affinity requires connections to all nodes to function efficiently.

Thin clients should be able to discover all server nodes automatically when connected to any of them, and maintain an up-to-date list of servers at all times. This behavior should be enabled when the ClientConfiguration.PartitionAwarenessEnabled property is true.

Description

Clients can already track topology changes: IEP-23 Best Effort Affinity introduced a response header change that sends the topology version whenever it changes.

All we need is a new client operation to retrieve server node endpoints, that is, the IP:port combinations that every server listens on.

OP_CLUSTER_GROUP_GET_NODE_ENDPOINTS = 5102

Request
long    startTopologyVersion
long    endTopologyVersion
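
As an illustration of the request layout, here is a minimal Python sketch that packs the two version fields. The little-endian byte order and the message framing (int length, short op code, long request id) are assumptions taken from the general thin client protocol, not from this section, and `encode_endpoints_request` is a hypothetical helper name.

```python
import struct

OP_CLUSTER_GROUP_GET_NODE_ENDPOINTS = 5102


def encode_endpoints_request(request_id: int, start_ver: int = -1, end_ver: int = -1) -> bytes:
    """Build one request message; -1 for both versions asks for the current topology."""
    # Payload defined by this proposal: two 64-bit topology versions.
    payload = struct.pack("<qq", start_ver, end_ver)
    # Assumed framing: op code (short) and request id (long) precede the payload.
    body = struct.pack("<hq", OP_CLUSTER_GROUP_GET_NODE_ENDPOINTS, request_id) + payload
    # Assumed framing: the length prefix counts everything after itself.
    return struct.pack("<i", len(body)) + body
```

For example, `encode_endpoints_request(1)` produces a 30-byte message: a 4-byte length prefix followed by the 26-byte body.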

...

Response
long                topologyVersion (actual topology version in this response)
int                 Joined node count
JoinedNode * count  Joined nodes, each encoded as:
    UUID            Node id
    int             Client listener port
    int             Address count
    String * count  IP or host name
int                 Disconnected node count
UUID * count        Disconnected node ids
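
The response layout can be read field by field, as in the following Python sketch. This section does not specify how UUID and String values are encoded on the wire, so the sketch makes simplifying assumptions: integers are little-endian, a UUID is 16 raw bytes, and a String is an int byte length followed by UTF-8 data. The real binary protocol may add type codes or use a different byte order. `Reader`, `JoinedNode`, `EndpointsResponse`, and `parse_endpoints_response` are hypothetical names.

```python
import struct
import uuid
from dataclasses import dataclass
from typing import List


@dataclass
class JoinedNode:
    node_id: uuid.UUID
    port: int             # client listener port
    addresses: List[str]  # IPs or host names


@dataclass
class EndpointsResponse:
    topology_version: int
    joined: List[JoinedNode]
    disconnected: List[uuid.UUID]


class Reader:
    """Sequential reader over a response payload (assumed encoding, see lead-in)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def _take(self, size: int) -> bytes:
        chunk = self.data[self.pos:self.pos + size]
        if len(chunk) != size:
            raise ValueError("truncated response")
        self.pos += size
        return chunk

    def read_int(self) -> int:
        return struct.unpack("<i", self._take(4))[0]

    def read_long(self) -> int:
        return struct.unpack("<q", self._take(8))[0]

    def read_uuid(self) -> uuid.UUID:
        return uuid.UUID(bytes=self._take(16))  # assumption: 16 raw bytes

    def read_string(self) -> str:
        return self._take(self.read_int()).decode("utf-8")  # assumption: length + UTF-8


def parse_endpoints_response(payload: bytes) -> EndpointsResponse:
    r = Reader(payload)
    version = r.read_long()
    joined = []
    for _ in range(r.read_int()):  # joined node count
        node_id = r.read_uuid()
        port = r.read_int()
        addresses = [r.read_string() for _ in range(r.read_int())]
        joined.append(JoinedNode(node_id, port, addresses))
    disconnected = [r.read_uuid() for _ in range(r.read_int())]
    return EndpointsResponse(version, joined, disconnected)
```

The parser only covers the operation payload; any response header (length, request id, status, topology version flag) would be consumed before this point.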

The server must return endpoints according to the IgniteConfiguration.Localhost setting, which defines the network interfaces that Ignite binds to.

Client Public API Changes

None.

When ClientConfiguration.PartitionAwarenessEnabled is true, discovery is enabled as well.

Client Logic

  1. Connect to one or more endpoints from ClientConfiguration
  2. Perform OP_CLUSTER_GROUP_GET_NODE_ENDPOINTS(-1, -1) to retrieve current topology.
  3. Connect to every server from (2) that is not yet connected.
    1. Use node UUID to make sure we don't connect to the same node twice
    2. Verify that node UUID from handshake is equal to node UUID from OP_CLUSTER_GROUP_GET_NODE_ENDPOINTS response: when client and server are in different subnets, there can be a mismatch. The node can even belong to a different cluster. Drop the connection in case of a mismatch. 
  4. For every response to any operation, compare topology version from the header to the topology version from (2). When it changes, request updated topology with OP_CLUSTER_GROUP_GET_NODE_ENDPOINTS(oldVer, newVer).
  5. Connect to newly discovered servers and close connections to removed servers.
  6. GOTO (4)
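
The steps above can be sketched as a small piece of client-side state. This is an illustration under stated assumptions, not the actual client implementation: `DiscoveryState` is a hypothetical class, and `connect` is a hypothetical callable that opens a connection and returns the node UUID reported in the handshake, or None if the endpoint is unreachable.

```python
import uuid
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# (node id, client listener port, addresses), as returned by the operation.
Node = Tuple[uuid.UUID, int, List[str]]
# Hypothetical: returns the node id from the handshake, or None on failure.
Connect = Callable[[str, int], Optional[uuid.UUID]]


class DiscoveryState:
    def __init__(self, connect: Connect):
        self.connect = connect
        self.topology_version = -1  # nothing retrieved yet
        self.connected: Dict[uuid.UUID, Tuple[str, int]] = {}

    def refresh_range(self, header_version: int) -> Optional[Tuple[int, int]]:
        """Step 4: return (oldVer, newVer) to request when the header version differs."""
        if header_version == self.topology_version:
            return None
        return (self.topology_version, header_version)

    def apply(self, version: int, joined: Iterable[Node],
              disconnected: Iterable[uuid.UUID]) -> None:
        """Steps 3 and 5: reconcile connections with one operation response."""
        for node_id in disconnected:
            self.connected.pop(node_id, None)  # a real client would close the socket
        for node_id, port, addresses in joined:
            if node_id in self.connected:
                continue  # step 3.1: never connect to the same node twice
            for host in addresses:
                # Step 3.2: keep the connection only if the handshake UUID matches;
                # a real client would drop the mismatching connection here.
                if self.connect(host, port) == node_id:
                    self.connected[node_id] = (host, port)
                    break
        self.topology_version = version
```

A caller would invoke `refresh_range` with the topology version from each response header and, when it returns a range, issue OP_CLUSTER_GROUP_GET_NODE_ENDPOINTS with that range and pass the result to `apply`.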

Risks and Assumptions

We assume that clients and servers share a common subnet. Other use cases are not supported.

Discussion Links

http://apache-ignite-developers.2346864.n4.nabble.com/IEP-44-Thin-Client-Discovery-td47129.html

Tickets

ASF JIRA query: project = Ignite AND labels IN (iep-44)
ASF JIRA issue: IGNITE-12932