
DRAFT

The CIP process itself has not been approved by the community yet. This is a strawman draft to start the conversation about such a process.

This page describes the Cassandra Enhancement Proposal (CEP) process for putting forward major changes to Cassandra. It has been adapted from similar processes for Kafka and Spark.

To create your own CIP, click the "Create CIP" button (Create from template). If you don't have permission, please send an email with your Wiki ID to dev@cassandra.apache.org and request permission. Also add an entry to the table CIPs under discussion.

Table of Contents

Purpose

The purpose of a CIP is to inform and involve the user community in major improvements to the Cassandra codebase throughout the development process, to increase the likelihood that user needs and compatibility are met. It is similar to a product requirement document commonly used in product management.

CIPs should be used for significant user-facing or cross-cutting changes, not small incremental improvements. When in doubt, if a committer thinks a change needs a CIP, it does.

Cassandra requires a high level of compatibility between releases, especially to ensure rolling upgrades are possible. Core architectural elements can't break compatibility or shift functionality from release to release. As a result, each new major feature or public API has to be done in a way that we can stick with going forward.

Areas of compatibility, or "public interfaces" to the project, are:

  • native protocol (and CQL)
  • gossip and the messaging service
  • pluggable components (SPIs) like authorisation, triggers, ..?
  • commitlog, hintlog, cache files
  • sstables components 
  • configuration
  • jmx mbeans (including metrics)
  • monitoring
  • client tool classes
  • command line tools and arguments
  • operational routines
  • ...??

When making changes or additions that impact these areas we need to think through what we are doing as best we can prior to release. And as we go forward we need to stick to our decisions as much as possible. All technical decisions have pros and cons, so it is important we capture the thought process that led to a decision or design to avoid flip-flopping needlessly.

Hopefully we can make these proportional in effort to their magnitude: small changes should just need a couple of brief paragraphs, whereas large changes need detailed design discussions.

This process also isn't meant to discourage incompatible changes; proposing an incompatible change is totally legitimate. Sometimes we will have made a mistake and the best path forward is a clean break that cleans things up and gives us a good foundation going forward. Rather, this is intended to avoid accidentally introducing half-thought-out interfaces and protocols that cause needless heartburn when changed. Likewise, the definition of "compatible" is itself squishy: small details like which errors are thrown when are clearly part of the contract but may need to change in some circumstances, and performance isn't part of the public contract but dramatic changes may break use cases. So we just need to use good judgement about how big the impact of an incompatibility will be and how big the payoff is.

What is considered a "major change" that needs a CIP?

Any of the following should be considered a major change:

  • Any major new feature, subsystem, or piece of functionality
  • Any change that impacts the public interfaces of the project

Not all compatibility commitments are the same. We need to spend significantly more time on the wire protocols as these break code in lots of clients, cause downtime releases, etc. Public API/SPIs are next as they cause people to rebuild code and/or maintain abstraction layers in third-party libraries/frameworks. Configuration, monitoring, and command line tools can be faster and looser — changes here will break monitoring dashboards and require a bit of care during upgrades but aren't as big a burden.

For the most part monitoring, command line tool changes, and configs are added with new features so these can be done with a single CIP.

What should be included in a CIP?

A CIP should contain the following sections:

  • Motivation: The problem to be solved.
  • Audience: The intended client audience. Examples include data scientists, data engineers, library devs, devops, etc. A single CIP can have multiple target personas. 
  • Goals: What must this allow users to do that they can't do currently?
  • Non-Goals: What problem(s) is this proposal not designed to solve?
  • Proposed Change: The new thing you want to do. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences, depending on the scope of the change.
  • New or Changed Public Interfaces: Impact to any of the "compatibility commitments" described above. We want to call these out in particular so everyone thinks about them.
  • Migration Plan and Compatibility: If this feature requires additional support for a no-downtime upgrade, describe how that will work.
  • Rejected Alternatives: What are the other alternatives you considered and why are they worse? The goal of this section is to help people understand why this is the best solution now, and also to prevent churn in the future when old alternatives are reconsidered.

Who should initiate the CIP?

for discussion and decision-making.



...

Purpose

Cassandra Enhancement Proposals provide a process for the proposal, discussion and endorsement of new feature development in Cassandra. CEPs confer advantages to patch authors by building legitimacy for changes within the community, and obtaining early consent for the direction of development. It is up to feature authors to determine if a CEP should be pursued for any piece of work, balancing the costs of a more burdensome process against the benefits of early community input and endorsement. As CEPs become more common, it is anticipated that project members will become less permissive of large changes that haven't attempted to seek consent beforehand, feeling freer to request major changes that they feel are suitable.

It is highly recommended to pursue a CEP for significant user-facing changes or changes that cut across multiple subsystems. Community feedback is able to provide information on:

  • Unexpected edge cases that may confound work or otherwise complicate it 
  • Major users and their expectations of existing or future behaviour
  • Project attitudes towards specific categories of feature
  • Project expectations around how a feature should be structured and delivered

The CEP process aims to be lightweight and flexible.  One may be initiated with nothing more than a title to begin, expanding as a working group materialises and ideas crystallise. See Scott Andreas' 2019 NGCC presentation for a community perspective, and how this ties into the lifecycle and evolution of the project.

What a CEP is not

CEPs are not intended to presage a return to waterfall development, and acceptance of a CEP provides no absolute guarantee that any final product will be accepted. Insights that invalidate an idea can arrive late in development, despite significant work having been done. The CEP merely mitigates this risk, and helps build legitimacy for a change so that technical difficulties may be discovered early, and non-technical objections handled before work begins.

Who should initiate a CEP?

Anyone can initiate a CEP, but you shouldn't do it unless you have the intention and capability to complete the proposed change.

A CEP needs a Shepherd, a Cassandra Committer committed to guiding the proposal through the process. Although a shepherd may delegate or work with other committers in the development process, they are ultimately responsible for the success or failure of a CEP. Responsibilities include, but are not limited to:

  • Advocating for the proposal
  • Ensuring the working group achieves consensus
  • Ensuring the working group seeks feedback from relevant stakeholders and users, and iterates on the design & implementation (see below for additional CEP documentation)
  • Ensuring project standards of development and quality are met
  • Ensuring changes match the CEP

What should be included in a CEP?

A CEP wiki page should aim to:

  • Promote collaboration during the discovery phase of a new feature
  • Serve as a permanent document describing the feature that evolves along with its development

A CEP should contain the following sections: 

  • Scope,

  • Goals (and non-goals),

  • Description of Approach,

  • Test Plan covering performance, correctness, failure, and boundary conditions (as applicable),

  • Timeline,

  • Mailing list / Slack channels,

  • Related JIRA tickets.

The Process

Here is the process for making a CEP:

(Optional): For work that is highly fluid and not yet ready for hardening in a wiki article or for a broad dev list announcement, create a gdoc with the CEP template and add a link under "CEPs in draft". This area is recommended for things that are nascent but have a high degree of certainty of an eventual dev list proposal of a CEP.

  1. To create your own CEP, click the "Create CEP" button (Create from template). If you don't have permission, please send an email with your Wiki ID to dev@cassandra.apache.org and request permission. Also add an entry to the table CEPs under discussion. Take the next available CEP number and give your proposal a descriptive heading, e.g. "CEP 1: Proposing an Apache Cassandra Management process".

  2. Fill in the sections as described above.

  3. Start a [DISCUSS] thread on the Apache mailing list. Please ensure that the subject of the thread is of the format [DISCUSS] CEP-{your CEP number} {your CEP heading}. The discussion should happen on the mailing list, not on the wiki, since the wiki comment system doesn't work well for larger discussions. In the process of the discussion you may update the proposal. You should let people know the changes you are making.

  4. As the CEP nears completion, consider adding any additional design documentation (see below) to the CEP, especially where it summarises working group discussions.

  5. Once the proposal is finalized, call a [VOTE] to have the proposal adopted. These proposals are more serious than code changes and more serious even than release votes. The criterion for acceptance is consensus (3 binding +1 votes and no binding vetoes). The vote should remain open for 72 hours.

  6. Please update the CEP wiki page, and the index below, to reflect the current stage of the CEP after a vote. This acts as the permanent record indicating the result of the CEP (e.g., Accepted or Rejected). Also report the result of the CEP vote to the voting thread on the mailing list so the conclusion is clear.
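As an illustration of steps 3 and 5, the subject format and the acceptance rule can be written down as a short Python sketch. This is not project tooling: the names `discuss_subject`, `Vote` and `vote_passes` are hypothetical, each voter is assumed to appear at most once, and treating the 72-hour window as a minimum before a result is called is an assumed reading of "should remain open for 72 hours".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Values taken from the process text above.
VOTE_WINDOW = timedelta(hours=72)
REQUIRED_BINDING_PLUS_ONES = 3


def discuss_subject(number: int, heading: str) -> str:
    """Build a subject of the form '[DISCUSS] CEP-{number} {heading}'."""
    return f"[DISCUSS] CEP-{number} {heading}"


@dataclass(frozen=True)
class Vote:
    """One vote on a [VOTE] thread (hypothetical record type)."""
    voter: str
    value: int      # +1, 0 or -1
    binding: bool   # True when the vote is binding


def vote_passes(votes: list[Vote], opened_at: datetime, now: datetime) -> bool:
    """Apply the rule: 3 binding +1 votes and no binding vetoes.

    Assumption: no result is called before the vote has been
    open for the full 72 hours.
    """
    if now - opened_at < VOTE_WINDOW:
        return False
    binding = [v for v in votes if v.binding]
    plus_ones = sum(1 for v in binding if v.value == 1)
    vetoes = sum(1 for v in binding if v.value == -1)
    return plus_ones >= REQUIRED_BINDING_PLUS_ONES and vetoes == 0
```

Non-binding votes are ignored by the tally in this sketch, so a non-binding -1 does not block acceptance while a single binding -1 does.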



Example CEP Design Documentation

After the CEP is opened and a working group is active, to help flesh out the implementation constraints, here are some suggestions for additional discussion and documentation that can go into the CEP:

  • Motivation: The problem to be solved.
  • Audience: The intended client audience. Examples include data scientists, data engineers, library devs, devops, etc. A single CEP can have multiple target personas.

...

  • Proposed Change: The new thing you want to do. This may be fairly extensive and have large subsections of its own. Or it may be a few sentences, depending on the scope of the change.
  • New or Changed Public Interfaces: Impact to any of the "compatibility commitments" described above. We want to call these out in particular so everyone thinks about them.
  • Migration Plan and Compatibility: If this feature requires additional support for a no-downtime upgrade, describe how that will work.
  • Rejected Alternatives: What are the other alternatives you considered and why are they worse? The goal of this section is to help people understand why this is the best solution now, and also to prevent churn in the future when old alternatives are reconsidered.



Compatibility Concerns

Cassandra requires a high level of compatibility between releases to ensure rolling upgrades are possible, as well as to support third-party libraries and tools.

These areas of compatibility are:

  • native protocol (and CQL)
  • gossip and the messaging service
  • pluggable components (SPIs) like authorisation, triggers, …
  • commitlog, hintlog, cache files
  • sstables components 
  • configuration
  • jmx mbeans (including metrics)
  • monitoring
  • client tool classes
  • command line tools and arguments
  • operational routines



...

List of CEPs

Adopted CEPs

CEPs under discussion

CEP | Comment
CEP-1: Apache Cassandra Management Process(es) | Sent emails to Dev discussion group.
CEP-2: Kubernetes Operator | Emails periodically sent to dev list. SIG meetings held periodically.
CEP-12: Diagnostic Events in virtual tables |

CIP round-up

Next CIP Number: 2

Use this number as the identifier for your CIP and increment this value.

Adopted CIPs

...

Release

CIPs under discussion

CIP | Comment
CIP-1: Proposing an Apache Cassandra Management process | Sent emails to Dev discussion group. Work tracked under CASSANDRA-14395.

Dormant/inactive CIPs

...

CEPs in draft

Dormant / Inactive CEPs

CEP | Comment


Discarded CEPs

CEP | Comment
CEP-18: Improving Modularity | CEP withdrawn; the discussion concluded that each ticket should be considered on its own rather than as a whole in a CEP. See email thread on dev@ discussion group.
CEP-23: Enhancement for Sparse Data Serialization | CEP withdrawn; discussion indicated that a CEP was not the proper form for the change. The change cannot be made within the confines of the outlined CEP.

Discarded CIPs

CIP | Comment