...

cassandra-diff and associated documentation can be found at: https://github.com/apache/cassandra-diff. Contributors are encouraged to run diff tests against clusters they manage and report issues to ensure workload diversity across the project.

✅ System Tables and Internal Schema

Shepherd: Aleksey Yeshchenko

This task covers a review of and minor bug fixes to local and distributed system keyspaces. Planned work in this area is now complete.

Issues identified and resolved included:

  • CASSANDRA-15454
  • CASSANDRA-15441
  • CASSANDRA-15385
  • CASSANDRA-15398

⏳ Source Audit and Performance Testing: Streaming

Shepherds: Aleksey Yeshchenko, Dinesh Joshi

ETA: Dec 31, 2019

This task covers an audit of the Streaming implementation in Apache Cassandra 4.0. In this release, contributors have implemented full-SSTable streaming to improve performance and reduce memory pressure. Because the internode messaging changes implemented in CASSANDRA-15066 touched code adjacent to streaming, a review of the streaming implementation itself was considered desirable. Prior work also covered performance testing of full-SSTable streaming.

Two remaining issues are being addressed with partial streaming of compressed SSTables, with a patch pending: CASSANDRA-13938. One additional item is in flight to address a minor issue with backpressure and the threading model; this change will be localized and very small in scope.

Further work is not essential to unblock streaming for Beta/GA, though small improvements may follow if bugs or performance issues are identified during later-stage testing.

⏳WIP: Test Infrastructure / Automation: "Harry"

Shepherds: Alex Petrov, Benedict Elliott Smith

CASSANDRA-15348

Harry is a component for fuzz testing and verification of Apache Cassandra clusters at scale. Harry makes it possible to run tests that efficiently validate the state of both dense nodes (to test the local read-write path) and large clusters (to test the distributed read-write path). Harry defines a model that holds the state of the database; generators that produce reproducible, pseudo-random schemas, mutations, and queries; and a validator that asserts the correctness of the model following execution of generated traffic. See CASSANDRA-15348 for additional details.
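The model / generator / validator split described above can be illustrated with a small, self-contained sketch. This is not Harry's API or implementation: `InMemoryStore`, `generate_mutations`, and `run_and_validate` are hypothetical names, and a plain dictionary stands in for the database under test. The sketch only shows the general idea that a seeded generator yields a reproducible stream of mutations, and that a model updated alongside the system can be compared against it afterwards.

```python
import random


class InMemoryStore:
    """Hypothetical stand-in for the database under test (not Cassandra)."""

    def __init__(self):
        self._rows = {}

    def write(self, key, value):
        self._rows[key] = value

    def delete(self, key):
        self._rows.pop(key, None)

    def read(self, key):
        return self._rows.get(key)


def generate_mutations(seed, count, key_space=8):
    """Produce a reproducible, pseudo-random list of (op, key, value) tuples."""
    rng = random.Random(seed)  # same seed -> same sequence of mutations
    mutations = []
    for _ in range(count):
        key = rng.randrange(key_space)
        if rng.random() < 0.25:
            mutations.append(("delete", key, None))
        else:
            mutations.append(("write", key, rng.randrange(1_000_000)))
    return mutations


def run_and_validate(store, mutations, key_space=8):
    """Apply mutations to the store and to a model, then list mismatched keys."""
    model = {}
    for op, key, value in mutations:
        if op == "write":
            store.write(key, value)
            model[key] = value
        else:
            store.delete(key)
            model.pop(key, None)
    # Validator: every key must read back exactly as the model predicts.
    return [k for k in range(key_space) if store.read(k) != model.get(k)]
```

Because the generator is driven only by its seed, a failing run can be replayed by passing the same seed again, which is the property that makes long-running randomized tests debuggable.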

Development of Harry is currently in progress. Once complete, contributors envision its black-box model and verifier to act as a test to which compute power can be dedicated indefinitely. Harry's generators and model are also useful toward writing targeted property-based tests. Python-based dtests are good candidates for migration from Python/Byteman to in-JVM dtests paired with Harry's model and generators.

Local Read/Write Path: IndexInfo (CASSANDRA-11206)

Shepherd: Jordan West

Users upgrading from Cassandra 3.0.x to trunk will pick up CASSANDRA-11206 in the process. Contributors to 4.0 have allocated time to testing and validating these changes via a source audit and the implementation of property-based tests (currently underway). The majority of planned work here is complete, with a final set of perf tests in progress. No correctness issues were identified via the source audit and randomized testing. Minor cleanup and refactoring may follow, but any such changes are expected to be small in scope.

Issues identified and resolved included:

  • CASSANDRA-15469
  • CASSANDRA-15470

⏳WIP: Local Read/Write Path: Upgrade and Diff Test

Shepherd: Yifan Cai

Execution of upgrade and diff tests via cassandra-diff has proven to be one of the most effective approaches toward identifying issues with the local read/write path. These include instances of data loss, data corruption, data resurrection, incorrect responses to queries, incomplete responses, and others. Upgrade and diff tests can be executed concurrently with fault injection (such as host or network failure), as well as during mixed-version scenarios (such as upgrading half of the instances in a cluster, and running upgradesstables on only half of the upgraded instances).

Upgrade and diff tests are expected to continue through the release cycle, and are a great way for contributors to gain confidence in the correctness of the database under their own workloads.
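The core idea of a diff test can be sketched in a few lines. This is only an illustration and does not reflect cassandra-diff's actual implementation or interface: `diff_partitions` is a hypothetical helper, and two dictionaries keyed by partition key stand in for the source and target clusters. It classifies each partition as missing on one side or as present on both sides with different contents.

```python
def diff_partitions(source, target):
    """Compare two {partition_key: contents} mappings and classify differences.

    Both arguments are plain dicts standing in for snapshots read from two
    clusters (an assumption made for this sketch).
    """
    report = {"missing_in_target": [], "missing_in_source": [], "mismatched": []}
    for key in sorted(source.keys() | target.keys()):
        if key not in target:
            report["missing_in_target"].append(key)   # possible data loss
        elif key not in source:
            report["missing_in_source"].append(key)   # possible data resurrection
        elif source[key] != target[key]:
            report["mismatched"].append(key)          # possible corruption
    return report
```

For example, comparing a snapshot taken before an upgrade with one taken after should produce a report in which all three lists are empty; any non-empty list points at partitions worth investigating.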

Local Read/Write Path: Other Areas

Testing in this area refers to the local read/write path (StorageProxy, ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still finding numerous bugs and issues with the 3.0 storage engine rewrite (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the local read/write path with techniques such as property-based testing, fuzzing (example), and a source audit.

...

Shepherd: Marcus Eriksson

Tracking Ticket: CASSANDRA-15581

Contributors: Jordan West

...

In past releases we've unknowingly broken metrics integrations and introduced performance regressions in metrics collection and reporting. We strive in 4.0 to not do that. Metrics should work well!

Shepherd: Romain Hardouin

Tracking Ticket: CASSANDRA-15582

Contributors: TBD

Tooling: Bundled / First-Party

...

Tracking Ticket: TBD

Contributors: Sumanth Pasupuleti (Priam)

Test Frameworks, Tooling, Infrastructure / Automation

...

Many sections of our documentation are incomplete or wrong. Let's deliver a functional but also well documented 4.0 release.

Shepherds: Dinesh Joshi, Joey Lynch

Tracking Ticket: CASSANDRA-15353

Contributors: Joey Lynch, Jon Haddad, Deepak Vohra

Features / Substantial Changes

Transient Replication

Transient Replication (CASSANDRA-14697) is an experimental implementation of witness replicas included in Apache Cassandra 4.0. As this feature is experimental, the focus of testing and validation in this release will be on ensuring that its implementation doesn't negatively impact non-transient use cases, that there are no known issues, and that it is tested to the extent that the feature works and is testable.

Shepherd: Ariel Weisberg

Tracking Ticket: CASSANDRA-14697

Contributors: TBD

Configurable Storage Port

C* 4.0 introduces support for configurable storage ports to enable simpler deployment on elastic infrastructure / schedulers that require applications to bind to dynamically-allocated ports.

Shepherd: TBD

Tracking Ticket: CASSANDRA-14697

Contributors: TBD

Full-SSTable Streaming

CASSANDRA-14556 improved the performance of streaming by enabling SSTables to be streamed directly from disk, bypassing the full read path and eliminating CPU cost / heap pressure.

Shepherd: Dinesh Joshi

Tracking Ticket: CASSANDRA-14765

Contributors: Sumanth Pasupuleti, Add your name!