To be Reviewed By: 1 March 2022
Authors: Mario Ivanac
Status: Draft Draft | Discussion | Active | Dropped | Superseded
Superseded by: N/A
Related: N/A
Problem
For example, we have configured replicated region with 3 or more servers, and client is configured with read timeout set to value same or smaller than member timeout.
...
After analysis, it is observed, as client puts data in connected server, that server will replicate data to other servers. If first any server in chain of replicated servers is restarted, replication will be halted until server is declared dead.
Since clients read-timeout is lower then member-timeout, client will continue with insertion of data. And again, replication will be halted to replicated servers. At some moment, restarted member will be declared dead, and now we will have halted events to put in replicated servers, and in parallel new events from client. If new event from client is replicated to other servers, before halted events, then in moment that halted events are notified to replicated servers, they will assume that these are duplicate events (compare sequence Id of received event to last stored sequence ID), and will not store this data.
In linked PR is created you can find distributed test for reproduction of problem.
Anti-Goals
Solution
When replicating to other servers, current logic would first create connections toward all servers, and then replicate data over those connections. Creation of connections will last, until all connections are created, or destination server/s is/are declared dead.
New proposal is, when replicating data, first try to create connections to all destinations in one attempt. For all created connections we will replicate data. After this step is completed, we will try to create remaining connections to all unreachable destinations, until connections are established, or destination is declared dead. If connections are established we will replicate data.
Linked PR: https://github.com/apache/geode/pull/7381
Other solution explored that have been discarded is:
Add logic, for storing of last N (configurable value) key - sequence Id pairs, instead of only last sequence Id.
User will configure size of last stored "key - sequence Id" Map.
Default value is 0, feature disabled
Value greater than 0, feature activated (calculate according to formula = 3 x member-timeout / read-timeout)
PR with proposed solution: https://github.com/apache/geode/pull/7214
Changes and Additions to Public Interfaces
None
Performance Impact
None
Backwards Compatibility and Upgrade Path
No upgrade or backwards compatibility issues.
Prior Art
What would be the alternatives to the proposed solution? What would happen if we don’t solve the problem? Why should this proposal be preferred?
FAQ
Answers to questions you’ve commonly been asked after requesting comments for this proposal.
Errata
What are minor adjustments that had to be made to the proposal since it was approved?
...