This is a place to gather big ideas and think differently about the future of Cassandra. The list was started at ApacheCon 2022 in New Orleans. In a Birds of a Feather session after a full day of Cassandra talks, a diverse group of users and committers held a fun jam session with one rule: throw out the wildest ideas and collect them. It was inspired by a talk given by Benedict Elliott Smith in 2015.


If you feel compelled to expound on any of the points below, please create a new sub-page and link to it from this document.


| Idea | Proposer | CEPs or Jiras |
|---|---|---|
| Endless Partitions that you can read (why people should care about bucketing). | Jeremy Hanna | |
| 0-allocation compaction and validation compaction (80% of allocations are user data). | David Capwell | |
| Repair: it should just work. Repair service should be internal to Cassandra. | David Capwell | |
| Maintenance scheduling in Cassandra (?) or an adjacent distributed workflow service that does scheduling. | Joey Lynch | |
| Eliminate Repair (real-time repair). | Alex Petrov | |
| Global Arbitrary Sort and Offset across partition keys like Mongo does it™ | Patrick McFadin | |
| Multi-consistency level (multi ACK): block till LOCAL_QUORUM but give a second ACK on replication. For people who want performance, not just consistency. The “wait for replication delay” MySQL thing. | Jordan West, Joey Lynch | |
| It should not matter how many tombstones you have. | Branimir Lambov | |
| Pagination that actually works (maybe snapshot isolation?). | Alex Petrov, Jordan West, Joey Lynch | |
| COUNT that doesn’t involve ALLOW FILTERING. | Patrick McFadin | |
| Public Analytics/Batch Contract. End goal: make it easier to use the data you already have for analytics. The format should make range queries easier. | Joey Lynch, Alex Petrov, Jeremy Hanna, Jordan West | |
| CQL operations as a means to get rid of everything. Invoke operations via CQL. Deprecate JMX. Add the ability to query the status of an operation. Give each running operation an ID. | Paulo Motta | CEP-38: CQL Management API |
| Diagnostic events in Chronicle bin logs + some kind of visual "replayer" of events, something like the "bootchart" the Linux kernel has. | Stefan Miklosovic | |
| Self-tuning. Either basic built-in features for some settings or a node/DC controller using machine learning to optimize an output, e.g. read latency P99. | Romain Hardouin | |
| Continuous query / notifications / subscription / notifier / listener for data and virtual tables (e.g. streaming CQL statements, not pagination isolation). | Mick Semb Wever | |
| OR operator | Jordan West, Jeremy Hanna | |
| Cost-based Query Planner | Patrick McFadin | |
| Improve performance of IN queries with multiple rows in the same partition (it is slow now). | Jordan West, Joey Lynch | |
| JOINs in CQL | Patrick McFadin | |
| Lifecycle on the CF level rather than row; retention policies. Move lifecycle into metadata. | Joey Lynch | |
| Dynamic TTL (could be related to the previous item, a way to implement it). | Paulo Motta, Joey Lynch | |
| Unified Compaction Strategy. | Joey Lynch, Branimir Lambov | |
| Out-of-process compaction. Or, taking it even further, Cloud-Aware Cassandra. | Jeremy Hanna | |
| Global 2i (basically Accord + infinite partitions). | Claude Warren | |
| MVs that work (basically, Accord). | Jordan West | |
| Modern build system (not Ant). Gradle. The people have spoken. | Benjamin Lerer | |
| Formalize backups API / contract | Paulo Motta | |
| Efficient bulk load client interface. Goal: replace SSTables streaming in jobs (e.g. Spark) with a storage-agnostic format. Handle replication? RBAC? | Romain Hardouin | |
| Backups, but with a consideration of deduplication | Jeremy Hanna | |
| Live migration of keyspaces. Dual writing for table migration. Could be related to bulk import API. Same-cluster and multi-cluster migration. | Paulo Motta, Patrick McFadin | |
| Actual Multi-Tenancy | Jordan West, Alex Petrov | |
| Resource management / resource isolation | Mick Semb Wever | |
| Infinite number of tables | Jeremy Hanna | |
| Elasticity. Scale down. | Mick Semb Wever | |
| µCassandra | Patrick McFadin | |
| Bad Partition Handling | Cheng Wang | |
| Zone Maps. Statistics over a block of data. Cardinality, number of nulls, visibility into the data. Can be used in the query planner. | David Caldwell | |
| Make it possible to replicate seamlessly and securely over (problematic?) geographic boundaries. | Jeremy Hanna | |
| Evaluate QUIC protocol for internode messages. | Joey Lynch | |
| Data placement. Routing keys / Static Column (?). Support GDPR relationships, restrict placement of partitions to a subset of the cluster. | Joey Lynch | |
| Snapshot Isolation. Restore data from a specific snapshot / time travel. | Cheng Wang, Jordan West | |
| Quality of Service labeling | Mick Semb Wever | |
| CONTAINS / NOT CONTAINS queries for anything, including collections. | Patrick McFadin | |
| T-Shirt sizing of numTokens | Aaron Ploetz | |
| Dynamic balancing of data | Mick Semb Wever | |
| Type-based cell resolving. General CRDTs, bitmaps, bitsets. | | |
| Transparent Data Encryption | Cheng Wang | |
| Fix live-migrate collections. Take a look at exposing complex and deleted columns metadata. | Benjamin Lerer, Joey Lynch, Alex Petrov | |
| Formal contract about JVM version support. Looks like it might already be discussed on the mailing list. | Jeremy Hanna, Mick Semb Wever | |
| Codebase that is approachable for new people | Everyone | |
| Bring the LHF (low-hanging fruit) tag back | Erick Ramirez | |
| Diverse and engaged community | Erick Ramirez | |
| Make it easier for new contributors to start on projects | Erick Ramirez | |


