ID IEP-83
Author Pavel Tupitsyn 
Sponsor Pavel Tupitsyn 
Created

  

Status DRAFT


Motivation

TCP connections can enter a half-open state: the connection appears to be alive, but any attempt to send data fails. Long-lived and mostly idle connections are especially susceptible to this behavior.

The retry mechanism (IEP-82 Thin Client Retry Policy) in thin client implementations partially mitigates the issue. However, not all operations are safe to retry, and reconnecting affects performance.

To improve connection stability and detect failures early, we can add a keep-alive mechanism.

Description

Why not TCP keepalive

TCP has a built-in keepalive mechanism, but it has some disadvantages:

  • Optional (may not be present in some TCP stacks)
  • May not be handled well by some routers (RFC 1122, section 4.2.3.6)
  • The default timeout is too long (2 hours) and is problematic to adjust on the SDK versions that are in use in Ignite (Java 8, .NET Standard 2.0), or hard to get right in some languages (Python, JS).

Because of that, some protocols implement keepalive logic at a higher level (SMB, TLS). More details: https://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html

Proposal

Add OP_HEARTBEAT to the protocol with an empty payload. Clients can send heartbeats at a configurable interval and receive responses to ensure that the connection is active.
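The client side of this proposal could look roughly like the sketch below. It is an illustration only, not the Ignite client implementation: the `Connection` interface, the `HeartbeatSketch` class, and the choice to close the connection on the first failed heartbeat are all assumptions made for the example. The sketch assumes `sendHeartbeat()` writes an OP_HEARTBEAT request with an empty payload and blocks until the response arrives or the call fails.

```java
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class HeartbeatSketch implements AutoCloseable {
    /** Hypothetical abstraction over a thin client connection (not a real Ignite API). */
    public interface Connection {
        /** Assumed to send OP_HEARTBEAT with an empty payload and wait for the response. */
        void sendHeartbeat() throws IOException;

        void close();
    }

    private final Connection conn;
    private final long intervalMs;
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "client-heartbeat");
            t.setDaemon(true); // do not keep the JVM alive just for heartbeats
            return t;
        });
    private volatile boolean failed;

    public HeartbeatSketch(Connection conn, long intervalMs) {
        this.conn = conn;
        this.intervalMs = intervalMs;
    }

    /** Starts sending heartbeats at the configured interval. */
    public void start() {
        timer.scheduleWithFixedDelay(this::tick, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
    }

    /** Sends a single heartbeat; on failure marks the connection as failed and closes it. */
    public boolean tick() {
        if (failed)
            return false;

        try {
            conn.sendHeartbeat();
            return true;
        } catch (IOException e) {
            // The connection is likely half-open: fail early instead of on the next user operation.
            failed = true;
            conn.close();
            timer.shutdown();
            return false;
        }
    }

    public boolean isFailed() {
        return failed;
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

A real client would likely skip the heartbeat when other traffic was sent recently and would hand the failure to its reconnect logic; both are left out here to keep the sketch short.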

Risks and Assumptions

// Describe project risks, such as API or binary compatibility issues, major protocol changes, etc.

Discussion Links

// Links to discussions on the devlist, if applicable.

Reference Links

// Links to various reference documents, if applicable.

Tickets

// Links or report with relevant JIRA tickets.