The overarching goal of the 4.0 release is that Cassandra 4.0 should be in a state where major users would run it in production as soon as it is cut. To gain this confidence, there are various ongoing testing efforts covering correctness, performance, and ease of use. On this page we coordinate those efforts and identify the blockers each subsystem must clear before we can release 4.0.

Tracking

We will track progress in Jira by tagging high level components with the 4.0-QA Jira label. We as a community can see our progress via a simple Jira query:

Jira Query for Tracking Progress
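As a rough illustration of what such a label-based query could look like, the sketch below builds a Jira search URL from a JQL string. The base URL, the CASSANDRA project key, and the ordering clause are assumptions made for the example; only the 4.0-QA label comes from this page, and the query actually linked above may differ.

```python
from urllib.parse import urlencode

# Assumption: the ASF Jira instance and its issue search endpoint.
JIRA_SEARCH_URL = "https://issues.apache.org/jira/issues/"


def tracking_query_url(label: str = "4.0-QA", project: str = "CASSANDRA") -> str:
    """Build a search URL for all issues in a project carrying a given label."""
    # JQL: filter by project key and label, most recently updated first.
    jql = f'project = {project} AND labels = "{label}" ORDER BY updated DESC'
    return JIRA_SEARCH_URL + "?" + urlencode({"jql": jql})


if __name__ == "__main__":
    print(tracking_query_url())
```

Pasting the printed URL into a browser should list the tagged tickets, provided the assumed project key and label match what is used in Jira.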

For each component we strive to have shepherds and contributors involved. Shepherds should be committers or knowledgeable component owners and are responsible for driving their blocking tickets to completion and ensuring quality in their claimed area, while contributors have signed up to help verify that subsystem by running tests or contributing fixes. Shepherds also ideally help set testing standards and ensure that we meet a high standard of quality in their claimed area.

If you are interested in contributing to testing 4.0, please add your name as a contributor and get involved in the tracking ticket and the dev list/IRC discussions for that component.

Targeted Components / Subsystems

We've tried to collect the major components or subsystems that must work properly for us to have a great 4.0 release. If you think something is missing, please add it. Better yet, volunteer to contribute to testing it!

✅ Internode Messaging

In 4.0 we're getting a new Netty-based internode communication system (CASSANDRA-8457). As internode messaging is vital to the correctness and performance of the database, we should make sure that all forms of internode messaging (TLS, compressed, low latency, high latency, etc.) function correctly.

Shepherds: Benedict Elliott Smith, Aleksey Yeshchenko, Jason Brown

Tracking Tickets: (Jira issue links did not render)

Contributors: Vinay Chella, Jordan West, Dinesh Joshi, Joey Lynch, Sumanth Pasupuleti, Benedict Elliott Smith, Aleksey Yeshchenko

Current Status: Planned work toward validating the stability and performance of internode messaging in Apache Cassandra 4.0 is nominally complete. Minor improvements and bug fixes may follow if identified during the alpha/beta/RC cycle. A few remaining perf tests are expected, tracked as subtasks of a Jira issue (link did not render).

The test plan for internode messaging changes is located at 4.0 Internode Messaging Test Plan. Note especially the "Randomised Testing" section, with tests implemented under test/burn/oac/net. Contributors have exercised these changes via the burn suite with over 16,000 cumulative core-hours dedicated to validation as implemented in Verifier.

Local Read/Write Path

Testing in this area refers to the local read/write path (StorageProxy, ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still finding numerous bugs and issues with the 3.0 storage engine rewrite (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the local read/write path with techniques such as property-based testing, fuzzing (example), and a source audit.
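To make the property-based approach mentioned above concrete, here is a minimal sketch that drives a toy store with random operations and compares every read against a plain dictionary acting as the reference model. `ToyStore` is a hypothetical stand-in invented for this example (a memtable plus flushed, immutable tables with tombstones); it is not Cassandra code, and a real test would exercise the actual read/write path instead.

```python
import random

TOMBSTONE = object()  # sentinel marking a deleted key


class ToyStore:
    """Hypothetical stand-in for a local write path: memtable + flushed tables."""

    def __init__(self):
        self.memtable = {}
        self.sstables = []  # immutable snapshots, oldest first

    def write(self, key, value):
        self.memtable[key] = value

    def delete(self, key):
        self.memtable[key] = TOMBSTONE

    def flush(self):
        # Freeze the current memtable into a new "sstable".
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}

    def read(self, key):
        # Newest data wins: memtable first, then tables newest to oldest.
        for table in [self.memtable, *reversed(self.sstables)]:
            if key in table:
                value = table[key]
                return None if value is TOMBSTONE else value
        return None


def check_against_model(seed, steps=500, key_space=8):
    """Property: every read from the store matches the reference dict."""
    rng = random.Random(seed)  # seeded so any failure can be replayed
    store, model = ToyStore(), {}
    for step in range(steps):
        key = rng.randrange(key_space)
        op = rng.choice(["write", "delete", "flush", "read"])
        if op == "write":
            value = rng.randrange(1000)
            store.write(key, value)
            model[key] = value
        elif op == "delete":
            store.delete(key)
            model.pop(key, None)
        elif op == "flush":
            store.flush()
        else:
            assert store.read(key) == model.get(key), (seed, step, key)
    # Final sweep over the whole key space.
    for key in range(key_space):
        assert store.read(key) == model.get(key), (seed, key)
    return True


if __name__ == "__main__":
    for seed in range(20):
        check_against_model(seed)
    print("all seeds passed")
```

The seed is included in each assertion message so that a failing sequence of operations can be reproduced exactly, which is the main practical benefit of this style of randomised testing.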

Shepherd: Aleksey Yeshchenko

Tracking Ticket: TBD

Contributors: Sam Tunnicliffe, Blake Eggleston

Distributed Read/Write Path: Coordination, Replication, and Read Repair

Testing in this area focuses on non-node-local aspects of the read-write path: coordination, replication, read repair, etc.

Shepherd: TBD

Tracking Ticket: TBD

Contributors: Blake Eggleston

Repair

We aim for 4.0 to have the first fully functioning incremental repair solution (CASSANDRA-9143)! Furthermore, we aim to verify that all types of repair (full range, sub range, incremental) function as expected, and that community tools such as Reaper work. CASSANDRA-3200 adds an experimental option to reduce the amount of data streamed during repair; we should write more tests and see how it works with big nodes.

Shepherd: Blake Eggleston

Tracking Ticket: TBD

Contributors: Marcus Eriksson, Vinay Chella


Compaction

Alongside the local and distributed read/write paths, we'll also want to validate compaction. CASSANDRA-6696 introduced substantial changes/improvements that require testing (esp. JBOD).

Shepherd: Marcus Eriksson

Tracking Ticket: TBD

Contributors: Jordan West


Metrics

In past releases we've unknowingly broken metrics integrations and introduced performance regressions in metrics collection and reporting. In 4.0 we aim to avoid both. Metrics should work well!

Shepherd: TBD

Tracking Ticket: TBD

Contributors: Romain Hardouin

Tooling: Bundled / First-Party

Test plans should cover bundled first-party tooling and CLIs such as nodetool, cqlsh, and new tools supporting full query and audit logging (CASSANDRA-13983, CASSANDRA-12151).

Shepherd: Sam Tunnicliffe

Tracking Ticket: TBD

Contributors: Vinay Chella

Tooling: External Ecosystem

Many users of Apache Cassandra employ open source tooling to automate Cassandra configuration, runtime management, and repair scheduling. Prior to release, we need to confirm that popular third-party tools such as Reaper, Priam, etc. function properly.

Shepherd: Sam Tunnicliffe

Tracking Ticket: TBD

Contributors: Add your name!

Test Frameworks, Tooling, Infrastructure / Automation

This area refers to contributions to test frameworks/tooling (e.g., dtests, QuickTheories, CASSANDRA-14821), and automation enabling those tools to be applied at scale (e.g., replay testing via Spark-based replay of captured FQL logs).

Shepherd: Jordan West

Tracking Ticket: TBD

Contributors: Add your name!

Cluster Setup and Maintenance

We want 4.0 to be easy for users to set up out of the box and just work. This means low friction when users download the Cassandra package and start running it. For example, users should be able to easily configure and start new 4.0 clusters and have tokens distributed evenly. Packaging is another example: it should be easy to install Cassandra on all supported platforms and have Cassandra use standard platform integrations.
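As a small worked example of what "tokens distributed evenly" means, the sketch below computes evenly spaced tokens and the fraction of the ring each one owns. It assumes a single token per node and a signed 64-bit token range of [-2^63, 2^63 - 1], as used by Murmur3Partitioner; it does not model vnodes or Cassandra's own token allocation, and the function names are made up for illustration.

```python
MIN_TOKEN = -2**63  # assumed lower bound of the token range
RING_SIZE = 2**64   # number of positions on the assumed ring


def even_tokens(node_count):
    """Evenly spaced tokens, one per node, starting at the minimum token."""
    return [MIN_TOKEN + i * RING_SIZE // node_count for i in range(node_count)]


def ownership(tokens):
    """Fraction of the ring between each token and the next one (wrapping)."""
    ordered = sorted(tokens)
    n = len(ordered)
    fractions = []
    for i, token in enumerate(ordered):
        nxt = ordered[(i + 1) % n]
        # A single node wraps onto itself and owns the whole ring.
        span = (nxt - token) % RING_SIZE or RING_SIZE
        fractions.append(span / RING_SIZE)
    return fractions


if __name__ == "__main__":
    tokens = even_tokens(4)
    for token, share in zip(tokens, ownership(tokens)):
        print(f"{token:>21d}  {share:.2%}")
```

A setup test could apply the same ownership calculation to the tokens reported by a freshly started cluster and flag any node whose share deviates noticeably from 1/N.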

Shepherd: TBD

Tracking Ticket: TBD

Contributors: Add your name!


Platforms / Runtimes

CASSANDRA-9608 introduces support for Java 11. We'll want to verify that Cassandra under Java 11 meets expectations of stability.

Shepherd: TBD

Tracking Ticket: TBD

Contributors: Add your name!

Cluster Upgrade

We've historically had numerous bugs concerning upgrading clusters from one version to another. Let's establish the supported upgrade path and ensure that users can safely perform the upgrades in production.

Shepherd: Ariel Weisberg

Tracking Ticket: TBD

Contributors: Tommy Stendahl

Documentation

Many sections of our documentation are incomplete or wrong. Let's deliver a functional but also well documented 4.0 release.

Shepherd: Sign up!

Tracking Ticket: TBD

Contributors: Joey Lynch, Add your name!

Features / Substantial Changes

Transient Replication

One of the more exciting experimental features shipping with 4.0 is transient replication (CASSANDRA-14697). Because transient replication is experimental, the expectation is that it doesn't negatively impact non-transient use cases, that there are no known issues, and that it's tested to the extent that the feature works and is testable.

Shepherd: Ariel Weisberg

Tracking Ticket: CASSANDRA-14697

Contributors: TBD

Configurable Storage Port

C* 4.0 introduces support for configurable storage ports to enable simpler deployment on elastic infrastructure / schedulers that require applications to bind to dynamically-allocated ports.

Shepherd: TBD

Tracking Ticket: CASSANDRA-7544

Contributors: TBD

Full-SSTable Streaming

CASSANDRA-14556 improved the performance of streaming by enabling SSTables to be streamed directly from disk, bypassing the full read path and eliminating CPU cost / heap pressure.

Shepherd: Dinesh Joshi

Tracking Ticket: (Jira issue link did not render)

Contributors: Sumanth Pasupuleti, Add your name!


