The overarching goal of the 4.0 release is that Cassandra 4.0 should be in a state where major users would run it in production as soon as it is cut. To gain this confidence, there are various ongoing testing efforts covering correctness, performance, and ease of use. On this page we coordinate those efforts and identify the blockers in each subsystem that must be resolved before we can release 4.0.
Tracking
We will track progress in Jira by tagging high-level components with the 4.0-QA Jira label. As a community, we can see our progress via a simple Jira query:
Jira Query for Tracking Progress
For each component we strive to have shepherds and contributors involved. Shepherds should be committers or knowledgeable component owners. They are responsible for driving their blocking tickets to completion, and ideally they also help set testing standards and ensure that their claimed area meets a high standard of quality. Contributors have signed up to help verify that subsystem by running tests or contributing fixes.
If you are interested in contributing to testing 4.0, please add your name as a contributor and get involved in the tracking ticket and the dev list/IRC discussions for that component.
Targeted Components / Subsystems
We've tried to collect the major components and subsystems that need to work properly for 4.0 to be a great release. If you think something is missing, please add it. Better yet, volunteer to contribute to testing it!
Internode Messaging
In 4.0 we're getting a new Netty-based internode communication system (CASSANDRA-8457). As internode messaging is vital to the correctness and performance of the database, we should make sure that all forms of internode messaging (TLS, compressed, low latency, high latency, etc.) function correctly.
Shepherd: Jason Brown
Tracking Ticket: CASSANDRA-14746
Contributors: Vinay Chella, Jordan West, Dinesh Joshi, Joey Lynch, Sumanth Pasupuleti, Benedict Elliott Smith, Aleksey Yeshchenko
Local Read/Write Path
Testing in this area refers to the local read/write path (StorageProxy, ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still finding numerous bugs and issues with the 3.0 storage engine rewrite (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the local read/write path with techniques such as property-based testing, fuzzing (example), and a source audit.
Shepherd: Aleksey Yeshchenko
Tracking Ticket: TBD
Contributors: Sam Tunnicliffe, Blake Eggleston
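As a rough illustration of the property-based testing mentioned above for the local read/write path, here is a minimal, self-contained sketch. It does not use Cassandra code or any testing library: ToyStore, Write, and the reconcile rule are hypothetical stand-ins, and random inputs come from java.util.Random. The property checked is that applying the same set of writes in a different order produces the same final state.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

final class OrderIndependenceProperty {

    /** One hypothetical write: key, value, and writer-supplied timestamp. */
    static final class Write {
        final int key;
        final String value;
        final long timestamp;

        Write(int key, String value, long timestamp) {
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Toy store (not Cassandra code): keeps one winning write per key. */
    static final class ToyStore {
        private final Map<Integer, Write> cells = new TreeMap<>();

        void apply(Write w) {
            cells.merge(w.key, w, ToyStore::reconcile);
        }

        // Higher timestamp wins; equal timestamps fall back to comparing values,
        // so the winner never depends on arrival order.
        private static Write reconcile(Write a, Write b) {
            if (a.timestamp != b.timestamp) {
                return a.timestamp > b.timestamp ? a : b;
            }
            return a.value.compareTo(b.value) >= 0 ? a : b;
        }

        Map<Integer, String> snapshot() {
            Map<Integer, String> out = new TreeMap<>();
            cells.forEach((k, w) -> out.put(k, w.value));
            return out;
        }
    }

    static List<Write> randomWrites(Random rnd, int count) {
        List<Write> writes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // Small key and timestamp ranges force frequent collisions and ties.
            writes.add(new Write(rnd.nextInt(8), "v" + rnd.nextInt(100), rnd.nextInt(5)));
        }
        return writes;
    }

    /** Returns true if a shuffled replay of the same writes yields the same state. */
    static boolean holdsFor(long seed) {
        Random rnd = new Random(seed);
        List<Write> writes = randomWrites(rnd, 50);
        List<Write> shuffled = new ArrayList<>(writes);
        Collections.shuffle(shuffled, rnd);

        ToyStore inOrder = new ToyStore();
        writes.forEach(inOrder::apply);
        ToyStore reordered = new ToyStore();
        shuffled.forEach(reordered::apply);
        return inOrder.snapshot().equals(reordered.snapshot());
    }

    public static void main(String[] args) {
        for (long seed = 0; seed < 1000; seed++) {
            if (!holdsFor(seed)) {
                // Reporting the seed makes the failing case reproducible.
                throw new AssertionError("Property violated for seed " + seed);
            }
        }
        System.out.println("Order-independence held for 1000 random cases");
    }
}
```

A real test would replace ToyStore with the component under test and would likely use a property-testing library for input generation and shrinking. The shape stays the same: generate inputs, state an invariant, and report the seed on failure.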
Distributed Read/Write Path: Coordination, Replication, and Read Repair
Testing in this area focuses on non-node-local aspects of the read-write path: coordination, replication, read repair, etc.
Shepherd: TBD
Tracking Ticket: TBD
Contributors: Blake Eggleston
Repair
We aim for 4.0 to have the first fully functioning incremental repair solution (CASSANDRA-9143)! Furthermore, we aim to verify that all types of repair (full range, sub range, incremental) function as expected, and that community tools such as Reaper work.
Shepherd: Blake Eggleston
Tracking Ticket: TBD
Contributors: TBD
Compaction
Alongside the local and distributed read/write paths, we'll also want to validate compaction. CASSANDRA-6696 introduced substantial changes/improvements that require testing (esp. JBOD).
Shepherd: Marcus Eriksson
Tracking Ticket: TBD
Contributors: Jordan West
Metrics
In past releases we've unknowingly broken metrics integrations and introduced performance regressions in metrics collection and reporting. We strive in 4.0 to not do that. Metrics should work well!
Shepherd: TBD
Tracking Ticket: ?
Contributors: Romain Hardouin
Tooling: Bundled / First-Party
Test plans should cover bundled first-party tooling and CLIs such as nodetool, cqlsh, and new tools supporting full query and audit logging (CASSANDRA-13983, CASSANDRA-12151).
Shepherd: Sam Tunnicliffe
Tracking Ticket: TBD
Contributors: Add your name!
Tooling: External Ecosystem
Many users of Apache Cassandra employ open source tooling to automate Cassandra configuration, runtime management, and repair scheduling. Prior to release, we need to confirm that popular third-party tools such as Reaper, Priam, etc. function properly.
Shepherd: Sam Tunnicliffe
Tracking Ticket: TBD
Contributors: Add your name!
Test Frameworks, Tooling, Infrastructure / Automation
This area refers to contributions to test frameworks/tooling (e.g., dtests, QuickTheories, CASSANDRA-14821), and automation enabling those tools to be applied at scale (e.g., replay testing via Spark-based replay of captured FQL logs).
Shepherd: Jordan West
Tracking Ticket: TBD
Contributors: Add your name!
Cluster Setup and Maintenance
We want 4.0 to be easy for users to set up out of the box and just work. This means low friction when users download the Cassandra package and start running it. For example, users should be able to easily configure and start new 4.0 clusters and have tokens distributed evenly. Another example is packaging: it should be easy to install Cassandra on all supported platforms and have Cassandra use standard platform integrations.
Shepherd: TBD
Tracking Ticket: ?
Contributors: Add your name!
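To make "tokens distributed evenly" concrete, here is a small sketch of one way a test could compute a reference layout and measure evenness. It assumes Murmur3Partitioner (signed 64-bit token range) and one token per node. The class and method names (EvenTokens, evenlySpaced, gapSpread) are hypothetical and are not part of Cassandra.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

final class EvenTokens {
    // Assumption: Murmur3Partitioner, whose tokens span the signed 64-bit range.
    private static final BigInteger RING_SIZE = BigInteger.ONE.shiftLeft(64);       // 2^64
    private static final BigInteger MIN_TOKEN = BigInteger.valueOf(Long.MIN_VALUE); // -2^63

    /** Tokens for nodeCount single-token nodes, spaced as evenly as integers allow. */
    static List<Long> evenlySpaced(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        List<Long> tokens = new ArrayList<>(nodeCount);
        for (int i = 0; i < nodeCount; i++) {
            // offset = floor(i * 2^64 / nodeCount), always below 2^64
            BigInteger offset = RING_SIZE.multiply(BigInteger.valueOf(i))
                                         .divide(BigInteger.valueOf(nodeCount));
            tokens.add(MIN_TOKEN.add(offset).longValueExact());
        }
        return tokens;
    }

    /**
     * Largest gap minus smallest gap between adjacent tokens around the ring.
     * Expects distinct tokens in ascending order; 0 or 1 means an even split.
     */
    static BigInteger gapSpread(List<Long> sortedTokens) {
        BigInteger min = null;
        BigInteger max = null;
        int n = sortedTokens.size();
        for (int i = 0; i < n; i++) {
            BigInteger current = BigInteger.valueOf(sortedTokens.get(i));
            BigInteger next = BigInteger.valueOf(sortedTokens.get((i + 1) % n));
            // mod() wraps the gap from the last token back to the first.
            BigInteger gap = next.subtract(current).mod(RING_SIZE);
            if (gap.signum() == 0) {
                gap = RING_SIZE; // a single token owns the whole ring
            }
            min = (min == null) ? gap : min.min(gap);
            max = (max == null) ? gap : max.max(gap);
        }
        return max.subtract(min);
    }

    public static void main(String[] args) {
        List<Long> tokens = evenlySpaced(4);
        System.out.println(tokens + " spread=" + gapSpread(tokens));
    }
}
```

A cluster-setup test could apply a check like gapSpread to the tokens a freshly started cluster actually reports, with a tolerance suited to the allocation strategy in use, since vnode-based allocation will not be perfectly even.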
Platforms / Runtimes
CASSANDRA-9608 introduces support for Java 11. We'll want to verify that Cassandra under Java 11 meets expectations of stability.
Shepherd: TBD
Tracking Ticket: TBD
Contributors: Add your name!
Cluster Upgrade
We've historically had numerous bugs concerning upgrading clusters from one version to another. Let's establish the supported upgrade paths and ensure that users can safely perform those upgrades in production.
Shepherd: Ariel Weisberg
Tracking Ticket: TBD
Contributors: Tommy Stendahl
Documentation
Many sections of our documentation are incomplete or wrong. Let's deliver a 4.0 release that is not only functional but also well documented.
Shepherd: Sign up!
Tracking Ticket: TBD
Contributors: Joey Lynch, Add your name!
Features / Substantial Changes
Transient Replication
One of the more exciting experimental features shipping with 4.0 is transient replication (CASSANDRA-14697). Because transient replication is experimental, the expectation is that it doesn't negatively impact non-transient use cases, that there are no known issues, and that it is tested to the extent that the feature works and is testable.
Shepherd: Ariel Weisberg
Tracking Ticket: CASSANDRA-14697
Contributors: TBD
Configurable Storage Port
C* 4.0 introduces support for configurable storage ports to enable simpler deployment on elastic infrastructure / schedulers that require applications to bind to dynamically-allocated ports.
Shepherd: TBD
Tracking Ticket: CASSANDRA-14697
Contributors: TBD
Full-SSTable Streaming
CASSANDRA-14556 improved the performance of streaming by enabling SSTables to be streamed directly from disk, bypassing the full read path and eliminating CPU cost / heap pressure.
Shepherd: Dinesh Joshi
Tracking Ticket: TBD
Contributors: Add your name!