Recovery Protocol Changes

Since all changes are written right after the lock is acquired, every participant is ready to commit after each successful data change.

The following rules can be used for tx recovery on TX coordinator failure:

  1. On TX coordinator failure, the oldest node among the tx participants is elected as the new TX coordinator and checks the local states of the other tx participants.
  2. If at least one participant has the tx in ACTIVE or ABORTED state, the whole transaction is marked as ABORTED and a tx rolled back message is sent to the MVCC coordinator.
  3. If all participants have the tx in LOCALLY_COMMITTED state, the whole transaction is marked as COMMITTED and a tx committed message is sent to the MVCC coordinator.
  4. If at least one partition from the cacheId to partitions mapping is lost, the whole transaction is marked as ABORTED and a tx rolled back message is sent to the MVCC coordinator.
  5. A record whose cacheId to partitions mapping contains lost partitions cannot be deleted from TxLog.
  6. A rejoining node checks its local TxLog. If it has a tx in ACTIVE or LOCALLY_COMMITTED state, it compares this transaction with the MVCC coordinator. If there are no matching records, it forcibly rebalances the involved partitions to prevent inconsistency.
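
The decision logic in rules 2 to 4 and the rejoin check in rule 6 can be sketched as follows. This is only an illustration: the class, enum, and method names are hypothetical and are not part of Ignite's API. It also assumes that a lost partition (rule 4) takes precedence over rule 3, and that the participant list is non-empty.

```java
import java.util.List;

/** Hypothetical sketch of the recovery rules; names are illustrative, not Ignite's API. */
final class TxRecoverySketch {
    /** Local tx states named in the rules above. */
    enum TxState { ACTIVE, ABORTED, LOCALLY_COMMITTED }

    /** Final outcome reported to the MVCC coordinator. */
    enum Outcome { COMMITTED, ABORTED }

    /**
     * Decides the outcome of a tx after TX coordinator failure.
     *
     * @param participantStates local state of the tx on every participant (assumed non-empty)
     * @param hasLostPartitions true if any partition in the cacheId to partitions mapping is lost
     */
    static Outcome decide(List<TxState> participantStates, boolean hasLostPartitions) {
        // Rule 4: a lost partition aborts the transaction (assumed to override rule 3).
        if (hasLostPartitions)
            return Outcome.ABORTED;

        // Rule 2: any participant in ACTIVE or ABORTED state aborts the transaction.
        for (TxState state : participantStates) {
            if (state != TxState.LOCALLY_COMMITTED)
                return Outcome.ABORTED;
        }

        // Rule 3: every participant is LOCALLY_COMMITTED.
        return Outcome.COMMITTED;
    }

    /**
     * Rule 6: a rejoining node forces a rebalance of the involved partitions when its
     * local TxLog holds an unresolved tx that the MVCC coordinator has no matching record for.
     */
    static boolean needsForcedRebalance(TxState localState, boolean coordinatorHasMatchingRecord) {
        boolean unresolved = localState == TxState.ACTIVE || localState == TxState.LOCALLY_COMMITTED;
        return unresolved && !coordinatorHasMatchingRecord;
    }
}
```

For example, `TxRecoverySketch.decide(List.of(TxState.LOCALLY_COMMITTED, TxState.ACTIVE), false)` yields `ABORTED`, because one participant has not reached LOCALLY_COMMITTED.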

Read (getAll and SQL changes)