...

  • Asynchronous execution, preferably with a UUID for each execution to track the execution result;
  • A single command registry, to provide generic command metadata to the API consumers such as CLI tools, or other ecosystem products. This concept is borrowed from the Dropwizard metrics registry and shares the same idea;
  • A dedicated admin port with the native protocol behind it, allowing only admin commands, to address concerns about cases where the native protocol is disabled, e.g. after the disablebinary command is executed;
  • Management via CQL must be developed alongside the existing C* management APIs, such as statically registered JMX MBeans, to ensure backward compatibility for ecosystem products that rely on them. However, this doesn't mean that new commands have to be implemented twice when they are required;
  • Reduce the cost of implementing a new command: once a new command is registered in the command registry, it is automatically available through all of the public interfaces we support, e.g. JMX and CQL. There is no need to implement a new command separately for CLI tools such as nodetool, or for the corresponding MBeans; a new command must be available there out of the box.
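The first two goals can be illustrated with a minimal sketch, assuming a registry that hands out a UUID per execution. The class name `CommandRegistry` comes from this proposal, but every method name below (`register`, `submit`, `result`, `names`) is hypothetical and chosen only for illustration; the sketch is in Python for brevity and is not the proposed implementation.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, List


class CommandRegistry:
    """Hypothetical registry: a command is registered once and looked up by name."""

    def __init__(self) -> None:
        self._commands: Dict[str, Callable[..., Any]] = {}
        self._executions: Dict[uuid.UUID, Future] = {}
        self._pool = ThreadPoolExecutor(max_workers=2)

    def register(self, name: str, handler: Callable[..., Any]) -> None:
        if name in self._commands:
            raise ValueError(f"command already registered: {name}")
        self._commands[name] = handler

    def names(self) -> List[str]:
        """Generic metadata that any front end (CLI, JMX, CQL) could list."""
        return sorted(self._commands)

    def submit(self, name: str, **args: Any) -> uuid.UUID:
        """Start the command asynchronously and return an id to track it."""
        handler = self._commands[name]  # raises KeyError for unknown commands
        execution_id = uuid.uuid4()
        self._executions[execution_id] = self._pool.submit(handler, **args)
        return execution_id

    def result(self, execution_id: uuid.UUID, timeout: float = 5.0) -> Any:
        """Block until the tracked execution finishes and return its result."""
        return self._executions[execution_id].result(timeout=timeout)


registry = CommandRegistry()
# The handler is a stand-in; a real command would change node state.
registry.register(
    "setconcurrentcompactors",
    lambda concurrent_compactors: f"compactors={concurrent_compactors}",
)
execution_id = registry.submit("setconcurrentcompactors", concurrent_compactors=5)
outcome = registry.result(execution_id)
```

The caller only ever holds the execution id, so the same tracking scheme could serve any transport that exposes the registry.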

Current Solutions

Currently, there are a few ways to perform management operations:

  • JMX MBeans - a standard way of performing C* management operations that expose an interface consisting of a set of readable or writable attributes, or both.
  • nodetool - a command-line utility that ships with C*. It is an essential tool for managing and monitoring a Cassandra cluster.
  • Cassandra Sidecar - a JVM process, distinct and separate in its lifecycle from the C* server process. This sidecar is designed to provide additional functionalities such as health checks, the execution of bulk commands etc. These services are accessible through a REST API, which internally leverages a jmxClient as its transport layer [1].
  • k8ssandra management API - provides a REST API for Cassandra nodes, allowing communication via Unix socket or HTTP(S) with optional TLS client authentication on the local machine. It integrates a Cassandra java-driver client that runs over the configured Unix socket referenced by the C* pid file, and an implementation of the java agent that intercepts and processes CQL queries, translating them into corresponding JMX MBean method calls. This API simplifies cluster management and monitoring by providing RESTful access to management commands and detailed metrics [2].

Other Vendor Solutions

How other vendors have solved the same problem:

  • Apache Ignite offers a variety of management interfaces, including a REST API delivered as a database plugin and dynamically generated JMX MBeans. However, the preferred tool for many is the CLI tool, which acts as a wrapper over a standard thin client. This CLI connects to the cluster through a special admin port, typically set to 11211, to perform its management tasks, better known as compute tasks, across the nodes in the cluster. These compute tasks are part of the standard Ignite API for users [3].
  • Apache HBase uses Google’s protobuf to describe the RPC interfaces it exposes to clients, for example, the Admin and Connection interfaces. The Admin and Connection interfaces in HBase don’t require a dedicated admin port to initiate a connection for cluster management. These classes use the client API to communicate with the HBase cluster over the same ports that are used for regular client access to HBase services [4].

Command Specifications

At present, Apache Cassandra doesn't advocate a unified approach to the specification of management commands that are provided directly by the C* node itself. The heart of these management commands lies in the JMX MBeans. While the MBeans provide operations on the same keyspaces and tables, they differ subtly in aspects such as command parameter names and their order. Additionally, MBeans do not provide fine-grained command metadata to a user. This limitation forces command-line interfaces and other tools that depend on the JMX API to build their own layer for organizing and grouping MBean operations.

For example, the same operations on a keyspace are represented differently for the user's public API:

  • The StorageServiceMBean exposes
    • forceKeyspaceCompactionForTokenRange - run compaction by token range;
    • forceKeyspaceCompactionForPartitionKey - run compaction by given partition key;
    • forceKeyspaceCompaction - run compaction on given table names;
  • The nodetool CLI combines these operations into a single `compact` command and chooses the operation based on the input arguments provided by the user;
  • The k8ssandra management API exposes the command in much the same way as the nodetool CLI, combining the listed MBean methods under a single `compact` command, with small variations in the input arguments.

To solve this problem we can use the same design approach that has been adopted by the Dropwizard Metrics library for metrics [5], choosing the right granularity when creating command metadata. This means that all of the commands that are executed across the same keyspaces must share a common denominator: the same command arguments, in the same order.
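As a rough illustration of the "common denominator" idea, the sketch below describes a single `compact` command whose metadata starts with the same leading arguments that every keyspace-level command would share, and maps a call onto one of the three `StorageServiceMBean` operations listed above. The parameter names (`keyspace`, `tables`, `start_token`, `end_token`, `partition_key`) and the `Parameter` type are assumptions made for this example, not part of the proposal.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Parameter:
    name: str
    required: bool = False


# Hypothetical metadata: every keyspace-level command begins with the same
# arguments in the same order; command-specific options follow.
COMMON_PREFIX: Tuple[Parameter, ...] = (
    Parameter("keyspace", required=True),
    Parameter("tables"),
)

COMPACT_PARAMETERS: Tuple[Parameter, ...] = COMMON_PREFIX + (
    Parameter("start_token"),
    Parameter("end_token"),
    Parameter("partition_key"),
)


def select_mbean_operation(args: Dict[str, object]) -> str:
    """Return the StorageServiceMBean operation a unified `compact` call maps to."""
    known = {p.name for p in COMPACT_PARAMETERS}
    unknown = sorted(set(args) - known)
    if unknown:
        raise ValueError(f"unknown arguments: {unknown}")
    missing = [p.name for p in COMPACT_PARAMETERS if p.required and p.name not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")

    has_start = "start_token" in args
    has_end = "end_token" in args
    if has_start != has_end:
        raise ValueError("start_token and end_token must be given together")
    if "partition_key" in args and has_start:
        raise ValueError("partition_key cannot be combined with a token range")

    if "partition_key" in args:
        return "forceKeyspaceCompactionForPartitionKey"
    if has_start:
        return "forceKeyspaceCompactionForTokenRange"
    return "forceKeyspaceCompaction"
```

With metadata of this shape, a client only needs to learn one argument order, and the choice of the underlying operation stays inside the node.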

...

Once the CommandRegistry is available, some out-of-the-box adapters must be developed to achieve the goals of the proposal and make the commands available through JMX and CQL:

  1. CQL Command Adapter - The CQL invoker validates the given arguments based on the command metadata from CommandRegistry and invokes the corresponding command;
  2. Dynamic Command MBean Adapter - New dynamic JMX MBeans for management operations are generated from the available command metadata and exposed through the public API for use by nodetool. The JMX MBeans that are currently statically registered are still supported, but they are deprecated in favour of the new ones;
  3. Open API Adapter - an adapter that provides dynamically generated RESTful API endpoints based on the CommandRegistry metadata, as well as an `openapi.json` specification.
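The adapters above all derive their behaviour from the same metadata. The following sketch shows that idea for two of them, under stated assumptions: the `COMMANDS` dictionary stands in for CommandRegistry metadata, every parameter is treated as required, and the `/commands/<name>` path layout is invented for the example rather than taken from the proposal.

```python
from typing import Any, Dict

# Hypothetical registry content: command name -> {parameter name: Python type}.
COMMANDS: Dict[str, Dict[str, type]] = {
    "setconcurrentcompactors": {"concurrent_compactors": int},
    "forcecompact": {"keyspace": str, "table": str, "keys": list},
}

# Mapping from Python types to OpenAPI/JSON Schema type names.
JSON_TYPES = {int: "integer", str: "string", list: "array", bool: "boolean"}


def validate_cql_arguments(command: str, args: Dict[str, Any]) -> None:
    """CQL-adapter side: check the given arguments against command metadata."""
    if command not in COMMANDS:
        raise KeyError(f"unknown command: {command}")
    spec = COMMANDS[command]
    unknown = sorted(set(args) - set(spec))
    if unknown:
        raise ValueError(f"unknown arguments: {unknown}")
    missing = sorted(set(spec) - set(args))
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    for name, expected in spec.items():
        if not isinstance(args[name], expected):
            raise TypeError(f"{name} must be of type {expected.__name__}")


def openapi_paths() -> Dict[str, Any]:
    """Open API adapter side: build the `paths` object from the same metadata."""
    paths: Dict[str, Any] = {}
    for command, spec in COMMANDS.items():
        schema = {
            "type": "object",
            "properties": {n: {"type": JSON_TYPES[t]} for n, t in spec.items()},
            "required": sorted(spec),
        }
        paths[f"/commands/{command}"] = {
            "post": {"requestBody": {"content": {"application/json": {"schema": schema}}}}
        }
    return paths
```

Because both functions read the same dictionary, adding a command to it makes the command visible to both front ends without further code.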

Commands Virtual Table

All the commands and command definitions are available by querying a corresponding new virtual table based on metadata provided by the CommandRegistry.

...

[Diagram: CQL Management]

CQL Command Syntax

The commands below become valid. While many commands in Apache Cassandra are typically run on keyspaces and tables, and adopting a CQL syntax like `EXECUTE COMMAND rebuild ON keyspace.table` seems logical, it's more practical for the initial implementation to concentrate on accepting command arguments as straightforward key-value pairs or as a JSON string. This approach simplifies the early stages of development. Subsequently, the more intuitive CQL syntax can be introduced as an alias for these commands, enhancing usability and aligning with familiar patterns.

Execute Command

...

Basic Syntax

Code Block
languagesql
EXECUTE COMMAND forcecompact WITH JSON '{
    "keyspace" : "distributed_test_keyspace",
    "table" : "tbl",
    "keys" : ["k4", "k2", "k7"] }';


Code Block
languagesql
EXECUTE COMMAND rebuild
    WITH JSON '{
    "keyspace" : "distributed_test_keyspace",
    "sourceDataCenterName" : "datacenter1",
    "tokens" : null,
    "specificSources" : null,
    "excludeLocalDatacenterNodes" : true }';

Standard Syntax

Code Block
languagesql
EXECUTE COMMAND forcecompact
    WITH keyspace=distributed_test_keyspace
    AND table=tbl
    AND keys=["k4", "k2", "k7"];

EXECUTE COMMAND rebuild
    WITH keyspace=distributed_test_keyspace
    AND sourceDataCenterName=datacenter1
    AND tokens=null
    AND specificSources=null
    AND excludeLocalDatacenterNodes=true;


Code Block
languagesql
EXECUTE COMMAND setconcurrentcompactors 
    WITH concurrent_compactors=5;
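
To make the key-value form concrete, here is a minimal parsing sketch, assuming values are either JSON literals (numbers, lists, `null`, `true`) or bare identifiers that are kept as strings. The function names and the regular expression are hypothetical; a real implementation would extend the CQL grammar rather than use regular expressions, and this naive split would not handle values that themselves contain the word AND.

```python
import json
import re
from typing import Any, Dict, Tuple

# Matches: EXECUTE COMMAND <name> WITH <k=v AND k=v ...> [;]
STATEMENT = re.compile(
    r"^\s*EXECUTE\s+COMMAND\s+(?P<name>\w+)\s+WITH\s+(?P<args>.+?)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def parse_value(raw: str) -> Any:
    """Interpret JSON literals; fall back to the raw text for bare identifiers."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


def parse_execute_command(statement: str) -> Tuple[str, Dict[str, Any]]:
    """Split a key-value EXECUTE COMMAND statement into (name, arguments)."""
    match = STATEMENT.match(statement)
    if match is None:
        raise ValueError("not an EXECUTE COMMAND ... WITH statement")
    args: Dict[str, Any] = {}
    for pair in re.split(r"\s+AND\s+", match.group("args"), flags=re.IGNORECASE):
        key, separator, raw = pair.partition("=")
        if not separator or not key.strip():
            raise ValueError(f"expected key=value, got: {pair!r}")
        args[key.strip()] = parse_value(raw.strip())
    return match.group("name"), args


name, arguments = parse_execute_command(
    """EXECUTE COMMAND forcecompact
        WITH keyspace=distributed_test_keyspace
        AND table=tbl
        AND keys=["k4", "k2", "k7"];"""
)
```

The resulting name and argument map are exactly what a CQL adapter would then validate against the command metadata.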

...

  • The CommandRegistry is available;
  • Some commands from the nodetool are migrated to the new CommandRegistry;
  • New dynamic MBeans representing the registered commands are available;
  • Added support for a new CQL syntax that accepts JSON or a simple set of K-V parameters required for command execution;
  • Added a new command invoker to nodetool, which is aware of new commands via the CommandRegistry;

...