Status

Current state: Proposed

Discussion thread:

...

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Today, users who need only the CQLSH client are forced to download the full Cassandra binaries, including server components they don’t need. This download is approximately 50 MB, whereas the CQLSH client is only 110 KB.

Additionally, although the CQLSH client is written in Python and should work easily on Windows, Windows users can’t install a readily executable CQLSH because they can’t use the provided Linux shell script ‘cqlsh’. Instead, they have to create their own start script or install CQLSH from an external project. Distribution on PyPI solves this problem: because CQLSH is pure Python, it is as easy to support the client on Windows as on macOS or Linux.

Python 3.4 (March 2014) added PIP as the default Python package manager. PIP offers end users a lightweight, friendly command-line interface to install, upgrade, and uninstall Python packages from the Python Package Index (PyPI.org). PIP works on a variety of platforms, including Linux, macOS, and Windows.

Support for installing CQLSH with PIP was added unofficially, outside of the Apache project, in October 2013. That packaging has been maintained continually since then as a separate project. CQLSH is installable on any platform supporting Python with PIP. Recent refactoring of CQLSH in CASSANDRA-17531 and CASSANDRA-17534 17684 (in progress) has improved its modularity as a reusable package.

This CEP proposes publishing an official distribution of CQLSH on PyPI and incorporating this distribution into the Apache Cassandra release process.

Audience

Cassandra end users who wish to use only the CQLSH client

Goals

Publishing a CQLSH Python package on PyPI.org, based on the contribution of the existing third-party package, will provide the following benefits for the Apache Cassandra project:

  • An official Apache client distribution of CQLSH which does not include the server-side components. For users, this will be as simple as ‘pip install cqlsh’.
  • A Python console script declaration that creates a Windows executable (.EXE), which does not require path setup.
  • A CQLSH package that can be imported and used by Python developers, for example in a Jupyter kernel.
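
As a rough illustration of the console script declaration mentioned above, the sketch below shows the kind of metadata a setup.py could hand to setuptools. It is not the proposed implementation: the entry-point target `cqlshlib.__main__:main`, the `pylib` package directory mapping, and the helper `parse_console_script` are all assumptions made for this example.

```python
# Hypothetical metadata that a setup.py could pass to setuptools.setup(**METADATA).
# The entry-point target "cqlshlib.__main__:main" is an assumption for illustration.
METADATA = {
    "name": "cqlsh",
    "package_dir": {"": "pylib"},  # assumes the package lives under pylib/
    "packages": ["cqlshlib"],
    "entry_points": {
        # Installers such as pip turn each console_scripts entry into a launcher
        # (an .exe wrapper on Windows) in the environment's scripts directory.
        "console_scripts": ["cqlsh = cqlshlib.__main__:main"],
    },
}


def parse_console_script(spec):
    """Split 'command = package.module:function' into its three parts."""
    command, _, target = spec.partition("=")
    module, _, function = target.partition(":")
    return command.strip(), module.strip(), function.strip()


# Show which callable each generated command would invoke.
for spec in METADATA["entry_points"]["console_scripts"]:
    command, module, function = parse_console_script(spec)
    print(f"{command} -> from {module} import {function}")
```

With a declaration of this kind the installer generates the launcher itself, so Windows users would not need to write their own start script.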

Non-Goals

  • No change to existing functionality for RPM, tarball, or brew installations.
  • Changing the existing versioning scheme for CQLSH. This topic has been raised on the mailing list and merits further discussion, but it is outside the scope of this CEP, which is solely about transferring ownership of publishing the current Python package.

Proposed Changes

We propose the following changes to integrate the project into the Apache Cassandra source code base:

  1. Incorporate and add the following files:
    1. README.md for the pypi.org project page
    2. pyproject.toml
    3. setup.cfg
    4. setup.py
    5. pylib/cqlshlib/__init__.py
    6. pylib/cqlshlib/__main__.py
  2. Write a document detailing the procedures for releasing to PyPI.org. This document should cover:
    1. How the release to PyPI can be integrated into the build process, and whether this can be automated.
    2. How credentials, permissions, and ownership of packages on PyPI will be managed.
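
To make the automation question more concrete, here is a minimal sketch of a release step built on two existing PyPA tools, `build` and `twine`. It is one possible shape, not a decided procedure: the function names, the `dist` output directory, and the default `testpypi` repository are assumptions for this example. Twine can read credentials from the `TWINE_USERNAME` and `TWINE_PASSWORD` environment variables, so a build system could inject an API token without storing it in the repository.

```python
import glob
import os
import subprocess
import sys


def build_command(dist_dir="dist"):
    """Command that builds an sdist and a wheel with the PyPA 'build' tool."""
    return [sys.executable, "-m", "build", "--sdist", "--wheel", "--outdir", dist_dir]


def check_command(artifacts):
    """Command that validates the built artifacts' metadata with twine."""
    return [sys.executable, "-m", "twine", "check", *artifacts]


def upload_command(artifacts, repository="testpypi"):
    """Command that uploads artifacts; credentials come from the environment."""
    return [sys.executable, "-m", "twine", "upload", "--repository", repository, *artifacts]


def release(dist_dir="dist", repository="testpypi", dry_run=True):
    """Build, check, and upload. With dry_run=True, only print the planned steps."""
    if dry_run:
        placeholder = [os.path.join(dist_dir, "*")]
        steps = [
            build_command(dist_dir),
            check_command(placeholder),
            upload_command(placeholder, repository),
        ]
        for step in steps:
            print(" ".join(step))
        return steps

    subprocess.run(build_command(dist_dir), check=True)
    # Resolve the artifact list only after the build has produced files.
    artifacts = sorted(glob.glob(os.path.join(dist_dir, "*")))
    steps = [check_command(artifacts), upload_command(artifacts, repository)]
    for step in steps:
        subprocess.run(step, check=True)
    return [build_command(dist_dir), *steps]
```

Calling `release()` with the default `dry_run=True` only prints the planned commands, which makes the procedure easy to review before any real upload is attempted.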

New or Changed Public Interfaces

<tbd>

Compatibility, Deprecation, and Migration Plan

  • What impact (if any) will there be on existing users?

...

  • When will we remove the existing behavior?

Not applicable

Test Plan

  • Successful integration tests
  • For any additional requirements from the Cassandra community, we will seek guidance on the discussion thread

Rejected Alternatives


Existing Unofficial Repository

While the current repository has proven popular, with over 40,000 downloads per month, its existence outside of the Apache project has resulted in it being maintained by just one or two individuals. It also requires duplicating the code base in a separate repo. It is also less secure: there is no guarantee that the PyPI package matches the code distributed on Apache GitHub, so the package could potentially include malware.
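
The mismatch risk described above can be checked mechanically. The following sketch compares SHA-256 digests of the files in a source checkout against those in an unpacked package. The function names and the idea of comparing two local directories are assumptions for illustration; it is not an existing project tool.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_trees(source_root, package_root):
    """Report files that differ and files present on only one side."""
    source = file_digests(source_root)
    package = file_digests(package_root)
    shared = source.keys() & package.keys()
    return {
        "changed": sorted(name for name in shared if source[name] != package[name]),
        "only_in_source": sorted(source.keys() - package.keys()),
        "only_in_package": sorted(package.keys() - source.keys()),
    }
```

An empty `changed` list and an empty `only_in_package` list would indicate that the unpacked package contains nothing beyond what is in the source tree. Publishing from the official release process would make a check like this unnecessary for end users.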

Anaconda

As a pure Python package, CQLSH is better supported with the default package installer for Python. Anaconda packages exist for CQLSH, but are not currently maintained. If there is enough interest in an Anaconda package, this should be discussed in a separate CEP.