Status

Current state: Proposed

Discussion thread:

Vote thread: tbd

JIRA: CASSANDRA-18654

Released: tbd

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Today, users who need only the CQLSH client must download the full Cassandra binary distribution, including server components they do not need. That download is approximately 50 MB, whereas the CQLSH client itself is only 110 KB.

Additionally, although the CQLSH client is written in Python and should work easily on Windows, Windows users cannot install an easily executable CQLSH because they cannot use the provided Linux shell script ‘cqlsh’. Instead, they have to create their own start script or install CQLSH from an external project. Distribution on PyPI solves this problem: because CQLSH is pure Python, the client is as easy to support on Windows as on macOS or Linux.

Python 3.4 (March 2014) added PIP as the default Python package manager. PIP offers end users a lightweight, friendly command line interface to install, upgrade and uninstall Python packages from the Python Package Index (PyPI.org). PIP works on a variety of platforms, including Linux, macOS and Windows.

Support for installing CQLSH with PIP was added unofficially, outside of the Apache project, in October 2013. That package has been maintained continually as a separate project ever since. CQLSH is installable on any platform supporting Python with PIP. Recent refactoring of CQLSH under CASSANDRA-17531 and CASSANDRA-17684 (in progress) has improved its modularity as a reusable package.

This CEP proposes creating an official distribution of CQLSH on PyPI and incorporating this distribution into the Apache Cassandra release process.

Audience

Cassandra end users who wish to use only the CQLSH client

Goals

Publishing a CQLSH Python package on PyPI.org, based on the contribution of the existing third-party package, will provide the following benefits for the Apache Cassandra project:

  • An official Apache client distribution of CQLSH which does not include the server-side components. For users, this will be as simple as ‘pip install cqlsh’.
  • A Python console script declaration that creates a Windows executable (.exe), so that no path setup is required.
  • A CQLSH package that can be imported and used by Python developers, for example to build a Jupyter kernel.
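
The console script declaration mentioned above could look roughly like the following setuptools metadata. This is a minimal sketch, not the project's actual setup.py: the build_setup_kwargs helper, the main function name inside cqlshlib/__main__.py, and the version argument are assumptions made for illustration.

```python
# Hypothetical sketch of the metadata a setup.py could pass to setuptools.setup().
# The entry-point target "cqlshlib.__main__:main" assumes that a main() function
# exists in pylib/cqlshlib/__main__.py; that name is an assumption.


def build_setup_kwargs(version):
    """Return keyword arguments for setuptools.setup() (illustrative only)."""
    return {
        "name": "cqlsh",
        "version": version,
        # The package source lives under pylib/ in the source tree.
        "package_dir": {"": "pylib"},
        "packages": ["cqlshlib"],
        # pip turns each console_scripts entry into a launcher in the
        # environment's scripts directory; on Windows it is an .exe wrapper.
        "entry_points": {
            "console_scripts": ["cqlsh = cqlshlib.__main__:main"],
        },
    }


# In a real setup.py this would be followed by:
#     from setuptools import setup
#     setup(**build_setup_kwargs("<release version>"))
```

With a declaration of this kind, ‘pip install cqlsh’ would generate the ‘cqlsh’ command on every supported platform without a hand-written start script.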

Non-Goals

  • Changing existing functionality for RPM, tarball, or brew installations.
  • Changing the existing versioning scheme for CQLSH. This topic has been raised on the mailing list and merits further discussion, but it is outside the scope of this CEP, which is solely about transferring ownership of publishing the current Python package.

Proposed Changes

We propose the following changes to integrate the project into the Apache Cassandra source code base:

  1. Add the following files:
    1. README.md for the PyPI.org project page
    2. pyproject.toml
    3. setup.cfg
    4. setup.py
    5. pylib/cqlshlib/__init__.py
    6. pylib/cqlshlib/__main__.py
  2. Write a document detailing the procedures for releasing to PyPI.org. This document should cover:
    1. How releasing to PyPI can be integrated into the build process, and whether this can be automated.
    2. How credentials, permissions and ownership of packages on PyPI will be managed.

New or Changed Public Interfaces

<tbd>

Compatibility, Deprecation, and Migration Plan

  • What impact (if any) will there be on existing users?

Existing users of the Apache distribution would see no impact. Users may continue to install the entire Apache Cassandra project, including CQLSH, via tarball, RPM, apt-get or source.

For users who already install CQLSH with PIP, incorporation into the official project will align package releases with the Apache Cassandra release timeline and reduce delays.

  • If we are changing behavior, how will we phase out the older behavior?

The existing repository at github.com/jeffwidman/cqlsh will be retired. 

  • If we need special migration tools, describe them here.

Not applicable

  • When will we remove the existing behavior?

Not applicable

Test Plan

  • Successful integration tests
  • For any additional requirements from the Cassandra community, we will seek guidance on the discussion thread.

Rejected Alternatives


Existing Unofficial Repository

While the current repository has proven popular, with over 40,000 downloads per month, its existence outside of the Apache project has resulted in it being maintained by just one or two individuals. It also requires copying the code base into a separate repository. It is also less secure: there is no guarantee that the PyPI package matches the code distributed on the Apache GitHub repository, so the package could potentially include malware.
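
One way to check that a published package matches an artifact built from the Apache source is to compare cryptographic digests of the two files. The sketch below assumes that both artifacts are already available locally and that the build is reproducible byte for byte; the function names and file paths are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large artifacts are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def artifacts_match(published_path, reference_path):
    """Return True when both files have the same SHA-256 digest."""
    return sha256_of(published_path) == sha256_of(reference_path)
```

For example, artifacts_match("downloaded-from-pypi.tar.gz", "built-from-source.tar.gz") would return False if the two files differ in any byte. Both file names are placeholders.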

Anaconda

As a pure Python package, CQLSH is better supported with the default package installer for Python. Anaconda packages exist for CQLSH, but are not currently maintained. If there is enough interest in an Anaconda package, this should be discussed in a separate CEP.
