...

We will deprecate system.paxos TTLs, and instead expunge records that are older than the most recent Paxos repair for any given range/table.
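The expunge rule above can be sketched roughly as follows. This is an illustrative model only, not the proposed implementation: the names `PaxosRecord`, `RepairedRange`, `last_repair_for` and `expunge`, the use of integer tokens, and the microsecond timestamps are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class PaxosRecord:
    """Hypothetical stand-in for a row in system.paxos."""
    table: str
    token: int          # partition token, used to locate the owning range
    ballot_micros: int  # timestamp of the record's most recent ballot


@dataclass(frozen=True)
class RepairedRange:
    """Hypothetical record of a completed Paxos repair of (start, end] for a table."""
    table: str
    start: int  # exclusive
    end: int    # inclusive
    repaired_at_micros: int


def last_repair_for(record: PaxosRecord,
                    repairs: Sequence[RepairedRange]) -> Optional[int]:
    """Return the most recent repair time covering the record, or None if unrepaired."""
    covering = [r.repaired_at_micros for r in repairs
                if r.table == record.table and r.start < record.token <= r.end]
    return max(covering) if covering else None


def expunge(records: Sequence[PaxosRecord],
            repairs: Sequence[RepairedRange]) -> List[PaxosRecord]:
    """Keep only records that are not older than the latest repair of their range."""
    kept = []
    for rec in records:
        repaired_at = last_repair_for(rec, repairs)
        # Records with no covering repair are always retained in this sketch.
        if repaired_at is None or rec.ballot_micros >= repaired_at:
            kept.append(rec)
    return kept
```

Unlike a TTL, the cut-off here moves only when a repair completes, so a record is never dropped merely because time has passed.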

This mechanism alone will permit safely returning success prior to performing the COMMIT step of our Paxos implementation. Users will still have to opt in to this behaviour by providing a commit consistency of ANY, or ONE, or perhaps LOCAL_QUORUM, depending on their preference. However, this can be recommended as safe and preferred once this mechanism is in place, taking us from four to three round-trips (eight to six message delays).
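The latency arithmetic can be made explicit with a small model. It assumes, purely for illustration, that the baseline operation waits on four sequential phases (prepare, read, propose, commit), each costing one round-trip, and that one round-trip is two message delays; the function and constant names are hypothetical.

```python
# Assumed baseline: four sequential phases, one round-trip each.
BASELINE_PHASES = ("prepare", "read", "propose", "commit")


def round_trips(phases=BASELINE_PHASES, wait_for_commit=True):
    """Count the round-trips the client waits on before receiving success."""
    waited = [p for p in phases if wait_for_commit or p != "commit"]
    return len(waited)


def message_delays(trips):
    """One round-trip is a request plus a response, i.e. two message delays."""
    return 2 * trips
```

With these assumptions, `round_trips()` gives four (eight message delays), and `round_trips(wait_for_commit=False)` gives three (six message delays), matching the figures quoted above.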

Paxos Optimisations
Several optimisations to our Paxos implementation will be introduced, including:

  • Combine promise+read before a proposal: if the proposal is successful, the read will have been linearized along with the write, taking us to two round-trips (four message delays).
  • Optimistic reads: if a majority of promises witnessed consistent state when promising and performing their read, the majority read can be returned to the client without waiting to issue an empty proposal. This takes us to one round-trip (two message delays) on read.
  • Preventing read/read competition: promises will be issued separately for reads and writes, with read promises invalidating write promises, and write promises invalidating read promises, but read promises will not invalidate each other, or prevent the above optimistic read optimisation.
  • Bounding re-proposals: incomplete commands that are re-proposed will not continue to be re-proposed if the original command has been committed (specifically, we track separately the ballot of the original proposal and the re-proposal, so that if the original proposal reaches the commit state as part of the original proposal, or any re-proposal, all re-proposals can instead go straight to commit).
  • Coordinators will not self-compete for operations on the same partition.
  • Coordinators will cache PaxosState to limit dependence on performance of system.paxos.
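The separation of read and write promises described in the list above can be illustrated with a toy acceptor. This is a minimal sketch under stated assumptions, not the actual PaxosState logic: ballots are modelled as plain integers, and the `Acceptor` class with its `promise_read`, `promise_write` and `accept` methods is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Acceptor:
    """Toy acceptor tracking read and write promises separately (hypothetical)."""
    promised_read: int = 0   # highest ballot promised to a reader
    promised_write: int = 0  # highest ballot promised to a writer

    def promise_read(self, ballot: int) -> bool:
        # Reads compete only with writes: an older read ballot is still
        # honoured even if a newer read promise has already been issued.
        if ballot <= self.promised_write:
            return False
        self.promised_read = max(self.promised_read, ballot)
        return True

    def promise_write(self, ballot: int) -> bool:
        # Writes compete with both earlier writes and earlier reads.
        if ballot <= max(self.promised_read, self.promised_write):
            return False
        self.promised_write = ballot
        return True

    def accept(self, ballot: int) -> bool:
        # A proposal is rejected if any newer promise, read or write,
        # has invalidated the promise it was made under.
        if ballot < max(self.promised_read, self.promised_write):
            return False
        self.promised_write = ballot
        return True
```

In this model, two readers with ballots 5 and 3 both obtain promises, a writer at ballot 4 is refused because of the read promise at 5, and a write promised at ballot 6 can no longer be accepted once a read promise at ballot 7 has been issued.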

...