...

Analysis shows that when a client puts data into a region, only the sequence ID of the last event is stored. The client puts data on its connected server, and that server replicates the data to the other servers. If the first server in the chain of replicated servers is restarted, replication is halted until that server is declared dead.

Since the client's read-timeout is lower than the member-timeout, the client continues inserting data, and replication of these events to the replicated servers is halted as well. At some point the restarted member is declared dead. The halted events are then waiting to be applied on the replicated servers while, in parallel, new events keep arriving from the client. If a new event from the client is replicated to the other servers before the halted events, then when the halted events are finally delivered, the replicated servers treat them as duplicates (they compare the sequence ID of the received event with the last stored sequence ID) and do not store the data.
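The failure mode described above can be illustrated with a small model. This is only a sketch: the class `LastSequenceTracker` and its `apply` method are hypothetical names invented for illustration and do not correspond to actual server code. It assumes a replica that remembers a single "last stored" sequence ID and rejects any event whose ID is not newer.

```python
class LastSequenceTracker:
    """Hypothetical replica model that remembers only the last sequence ID."""

    def __init__(self):
        self.last_sequence_id = -1  # no event stored yet
        self.store = {}             # region data held by this replica

    def apply(self, sequence_id, key, value):
        # An event whose sequence ID is not newer than the last stored one
        # is assumed to be a duplicate and is dropped.
        if sequence_id <= self.last_sequence_id:
            return False
        self.store[key] = value
        self.last_sequence_id = sequence_id
        return True


replica = LastSequenceTracker()
# Events 1 and 2 were halted behind the restarted server,
# so the newer event 3 reaches the replica first.
results = [
    replica.apply(3, "k3", "v3"),
    replica.apply(1, "k1", "v1"),
    replica.apply(2, "k2", "v2"),
]
```

In this model the two halted events are rejected even though their keys were never written on the replica, which mirrors the data loss described above.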


As a result, if a server is restarted, data replicated while that server was down is ignored, because its sequence ID is smaller than the last one written. The linked PR adds a distributed test that reproduces the problem.

Anti-Goals

Solution

Add logic to store the last N key/sequence ID pairs (where N is configurable) instead of only the last sequence ID.
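One possible reading of this proposal is sketched below. The proposal does not spell out how the stored pairs are used, so the details here are assumptions: the class `RecentSequenceTracker`, the parameter `max_entries` (standing in for N), and the rule that an event is a duplicate only if its key is still tracked with an equal or higher sequence ID are all hypothetical.

```python
from collections import OrderedDict


class RecentSequenceTracker:
    """Hypothetical replica model that remembers the last N key/sequence ID pairs."""

    def __init__(self, max_entries):
        self.max_entries = max_entries  # assumed to be the configurable N
        self.recent = OrderedDict()     # key -> highest sequence ID applied for it
        self.store = {}                 # region data held by this replica

    def apply(self, sequence_id, key, value):
        known = self.recent.get(key)
        # Assumed rule: duplicate only if this key was already written
        # with an equal or newer sequence ID.
        if known is not None and sequence_id <= known:
            return False
        self.store[key] = value
        self.recent[key] = sequence_id
        self.recent.move_to_end(key)
        # Evict the oldest pairs so that at most N are kept.
        while len(self.recent) > self.max_entries:
            self.recent.popitem(last=False)
        return True


replica = RecentSequenceTracker(max_entries=100)
results = [
    replica.apply(3, "k3", "v3"),  # new event arrives first
    replica.apply(1, "k1", "v1"),  # halted event, different key: accepted
    replica.apply(2, "k2", "v2"),  # halted event, different key: accepted
    replica.apply(2, "k2", "v2"),  # true duplicate: rejected
]
```

Under these assumptions the halted events are stored because their keys have no newer entry, while a genuine resend of the same key and sequence ID is still rejected. Once a pair is evicted, a late duplicate for that key would be accepted again, so N would need to be sized for the longest expected replication stall.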

...