...
Audience: All Cassandra Users and Developers
User Impact: Improved LWT performance, particularly in WAN or contended scenarios
Motivation
- LWTs suffer from poor performance, particularly in WAN settings and under contention.
- LWTs have never guaranteed linearizability across range movements, which is a significant problem for a mechanism intended to offer strong consistency.
- CASSANDRA-12126 introduced significant performance regressions in order to resolve long-standing correctness issues. As a result, users may be unable to use LWTs where they previously could, or may have to accept a poorly documented correctness trade-off in order to keep the lights on.
This work aims to address these shortcomings with various improvements to the performance of our Paxos implementation, without fundamentally altering its behaviour.
Goals
- Durable writes in no more than two round-trips when uncontended
- Linearizable reads in no more than one round-trip when uncontended
- Reduced contention
- Linearizability across range movements
Description of Approach
Paxos Repair
We will introduce a new repair mechanism that can be run with or without regular repair. This mechanism will:
...
- Introduce a dedicated TimeUUID class to prevent ballots from being mistakenly mixed with sstable timestamps
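To illustrate the idea behind a dedicated TimeUUID class, here is a minimal sketch in Java. It is not the class this CEP proposes: the name Ballot, its methods, and the tieBreaker parameter are all hypothetical. It only shows how wrapping a version-1 UUID in its own type lets the compiler reject code that passes a raw long sstable timestamp where a ballot is expected.

```java
import java.util.UUID;

// Hypothetical sketch; the class and method names are assumptions, not Cassandra's API.
final class Ballot implements Comparable<Ballot> {
    // Number of 100ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    private final UUID uuid;

    private Ballot(UUID uuid) {
        this.uuid = uuid;
    }

    // Wrap an existing UUID, rejecting anything that is not time-based (version 1).
    static Ballot fromUuid(UUID uuid) {
        if (uuid.version() != 1)
            throw new IllegalArgumentException("not a time-based UUID: " + uuid);
        return new Ballot(uuid);
    }

    // Build a ballot from a Unix timestamp in microseconds plus an arbitrary tie-breaker.
    static Ballot atUnixMicros(long unixMicros, long tieBreaker) {
        long ts = unixMicros * 10 + UUID_EPOCH_OFFSET;       // 100ns units since the UUID epoch
        long msb = ((ts & 0xFFFFFFFFL) << 32)                // time_low
                 | (((ts >>> 32) & 0xFFFFL) << 16)           // time_mid
                 | 0x1000L                                   // version 1
                 | ((ts >>> 48) & 0x0FFFL);                  // time_hi
        long lsb = 0x8000000000000000L                       // RFC 4122 variant bits
                 | (tieBreaker & 0x3FFFFFFFFFFFFFFFL);
        return fromUuid(new UUID(msb, lsb));
    }

    // The ballot's time component, converted back to Unix microseconds.
    long unixMicros() {
        return (uuid.timestamp() - UUID_EPOCH_OFFSET) / 10;
    }

    // Accepts only another Ballot: passing a raw long timestamp here does not compile.
    boolean supersedes(Ballot other) {
        return compareTo(other) > 0;
    }

    // Order by time first, then by the remaining bits as an unsigned tie-breaker.
    @Override
    public int compareTo(Ballot other) {
        int byTime = Long.compare(uuid.timestamp(), other.uuid.timestamp());
        return byTime != 0
             ? byTime
             : Long.compareUnsigned(uuid.getLeastSignificantBits(),
                                    other.uuid.getLeastSignificantBits());
    }
}
```

With a type like this, a call such as ballot.supersedes(sstableTimestampMicros) fails at compile time. Any comparison against an sstable timestamp has to go through an explicit conversion such as unixMicros().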
Upgrade / Migration
- The upgrade will likely be optional initially; the mechanism is TBD. At minimum, there will be JMX endpoints to enable or disable the new mechanisms.
Test Plan
- The Cluster Simulation CEP introduces significant testing for the correctness of this system, which will be expanded to ensure coverage of new functionality
- Extensive real-world testing will be conducted on synthetic and live traffic
- Unit tests for the new subsystems will also accompany the patch