...

Current state: Under Discussion

Discussion thread: https://lists.apache.org/thread/pow83q92m666nqtwyw4m3b18nnkgj2y8

Slack: https://the-asf.slack.com/archives/CK23JSY2K/p1688662169018449

...

Cassandra uses JMX to expose management commands, such as taking a snapshot of specified keyspaces. Local JMX access is enabled by default, and remote access can be enabled through configuration when needed. Almost all management tools use JMX as the transport layer for command execution, including the built-in `nodetool` command line tool that ships with Cassandra and the Cassandra Sidecar, a standalone JVM process that runs alongside the Cassandra server daemon and is used for configuration management and/or metrics exposure.

...

  • Asynchronous execution, preferably with a UUID for each execution to track the execution result;
  • A single command registry, to provide generic command metadata to the API consumers such as CLI tools, or other ecosystem products. This concept is borrowed from the Dropwizard metrics registry and shares the same idea;
  • A dedicated admin port with the native protocol behind it, allowing only admin commands, to address the cases in which the native protocol is disabled, e.g. after the `disablebinary` command has been executed;
  • Management via CQL must be developed alongside existing C* management API such as statically registered JMX MBeans, to ensure backward compatibility for ecosystem products that rely on it. However, this doesn't mean that new commands have to be implemented twice when they are required;
  • Reduce the cost of implementing a new command: once a new command is registered in the command registry, it is automatically available through all of the public interfaces we support, e.g. JMX and CQL. There is no need to implement the command separately for CLI tools such as the nodetool, or for the corresponding MBeans; it must be available there out of the box.
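
To make the first goal more concrete, the sketch below shows one possible shape of asynchronous execution tracked by a UUID. It is only an illustration under stated assumptions: `AsyncCommandTracker`, `submit`, `result` and `await` are hypothetical names that do not appear in the proposal, and a real implementation would also need result expiry and error reporting.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical tracker: every submitted command gets a UUID that can later be used to look up its result. */
final class AsyncCommandTracker
{
    private final Map<UUID, CompletableFuture<String>> executions = new ConcurrentHashMap<>();

    /** Starts the command on a background thread and immediately returns its execution id. */
    UUID submit(Supplier<String> command)
    {
        UUID id = UUID.randomUUID();
        executions.put(id, CompletableFuture.supplyAsync(command));
        return id;
    }

    /** Returns the result if the execution finished successfully; empty if it is unknown, running or failed. */
    Optional<String> result(UUID id)
    {
        CompletableFuture<String> future = executions.get(id);
        if (future == null || !future.isDone() || future.isCompletedExceptionally())
            return Optional.empty();
        return Optional.of(future.join());
    }

    /** Blocks until the execution finishes; used here only to keep the demo deterministic. */
    String await(UUID id)
    {
        return executions.get(id).join();
    }
}

final class AsyncCommandDemo
{
    public static void main(String[] args)
    {
        AsyncCommandTracker tracker = new AsyncCommandTracker();
        UUID id = tracker.submit(() -> "snapshot taken");
        System.out.println(id + " -> " + tracker.await(id));
    }
}
```

A caller keeps only the returned UUID and polls `result` later, which is the property the goal asks for.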

...

Cassandra Solutions

Currently, there are a few ways to perform management operations:

  • JMX MBeans - the standard way of performing C* management operations. Each MBean exposes an interface consisting of a set of attributes that are readable, writable, or both;
  • nodetool - a command line utility that ships with C*. It is an essential tool for managing and monitoring a Cassandra cluster;
  • Cassandra Sidecar - a JVM process, distinct and separate in its lifecycle from the C* server process. This sidecar is designed to provide additional functionality such as health checks, the execution of bulk commands, etc. These services are accessible through a REST API, which internally uses a jmxClient as its transport layer [1];
  • k8ssandra management API - provides a REST API for Cassandra nodes, allowing communication via Unix socket or HTTP(S) with optional TLS client authentication on the local machine. It integrates a Cassandra java-driver client that runs over the configured Unix socket referenced by the C* pid file, and a Java agent that intercepts and processes CQL queries, translating them into the corresponding JMX MBean method calls. This API simplifies cluster management and monitoring by providing RESTful access to management commands and detailed metrics [2].

Other Vendor Solutions

How other vendors have solved the same problem:

  • Apache Ignite offers a variety of management interfaces, including a REST API delivered as a database plugin and dynamically generated JMX MBeans. However, the preferred tool for many is the CLI tool, which acts as a wrapper over a standard thin client. This CLI connects to the cluster through a special admin port, typically set to 11211, to perform its management tasks, better known as compute tasks, across the nodes in the cluster. These compute tasks are part of the standard Ignite API for users [3].
  • Apache HBase uses Google’s protobuf to describe the RPC interfaces it exposes to clients, for example, the Admin and Connection interfaces. These interfaces don’t require a dedicated admin port to initiate a connection for cluster management: they use the client API to communicate with the HBase cluster over the same ports that are used for regular client access to HBase services [4].

Command Specifications

At present, Apache Cassandra doesn't have a unified approach to the specification of the management commands that are exposed directly by the C* node itself. The heart of these management commands lies in the JMX MBeans. Although the MBeans provide operations on the same keyspaces and tables, they differ subtly in aspects such as command parameter names and their order. Additionally, MBeans do not provide fine-grained command metadata to a user. This limitation forces command-line interfaces and other tools that depend on the JMX API to build their own layer for organizing and grouping MBean operations.

For example, the same operations on a keyspace are represented differently across the public APIs:

  • StorageServiceMBean exposes
    • forceKeyspaceCompactionForTokenRange - run compaction by token range;
    • forceKeyspaceCompactionForPartitionKey - run compaction by given partition key;
    • forceKeyspaceCompaction - run compaction on given table names;
  • NodeTool CLI tool combines these operations into a single `compact` command and chooses the operation based on the input arguments provided by a user;
  • k8ssandra management API exposes the command in much the same way as the nodetool CLI, combining the listed MBean methods under a single `compact` command, with small variations in the input arguments.

To solve this problem, we can use the same design approach that has been adopted by the Dropwizard Metrics library for metrics [5], choosing the right granularity when creating command metadata. This means that all of the commands that are executed across the same keyspaces must share a common denominator for the command arguments and their order.
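
The idea of a common denominator can be sketched as follows: a single argument object covers all three compaction variants, and one dispatcher decides which of the StorageServiceMBean operations listed above applies. This is a minimal sketch, not the proposed design: `CompactArgs`, `CompactDispatch` and the field names are hypothetical, and only the three operation names come from the text above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical unified argument set for compaction; the field names are illustrative only. */
final class CompactArgs
{
    final String keyspace;
    final List<String> tables;
    final String startToken;   // set only for token-range compaction
    final String endToken;     // set only for token-range compaction
    final String partitionKey; // set only for partition-key compaction

    CompactArgs(String keyspace, List<String> tables, String startToken, String endToken, String partitionKey)
    {
        this.keyspace = keyspace;
        this.tables = tables;
        this.startToken = startToken;
        this.endToken = endToken;
        this.partitionKey = partitionKey;
    }
}

final class CompactDispatch
{
    /** Picks the StorageServiceMBean operation name that matches the supplied arguments. */
    static String legacyOperation(CompactArgs args)
    {
        if (args.partitionKey != null)
            return "forceKeyspaceCompactionForPartitionKey";
        if (args.startToken != null || args.endToken != null)
            return "forceKeyspaceCompactionForTokenRange";
        return "forceKeyspaceCompaction";
    }
}

final class CompactDispatchDemo
{
    public static void main(String[] args)
    {
        CompactArgs byTable = new CompactArgs("ks", Arrays.asList("tbl"), null, null, null);
        CompactArgs byRange = new CompactArgs("ks", Arrays.asList("tbl"), "0", "100", null);
        System.out.println(CompactDispatch.legacyOperation(byTable));
        System.out.println(CompactDispatch.legacyOperation(byRange));
    }
}
```

With one argument shape, every interface (CLI, JMX, CQL) can accept the same names in the same order, and the choice of the underlying operation stays in one place.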

Command Registry

The starting point for management operations is the CommandRegistry (or OperationManager), which is a collection of all C* management commands and their subcommands.

Code Block
languagejava
titleCommandRegistry
linenumberstrue
collapsetrue
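import java.util.Iterator;
import java.util.Map;

public interface CommandRegistry<A, R> extends Command<A, R>
{
    /** Looks up a command by its name. */
    public Command<?, ?> command(String name);
    /** Iterates over all registered (name, command) pairs. */
    public Iterator<Map.Entry<String, Command<?, ?>>> commands();
}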
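
As an illustration of how such a registry might be backed, here is a minimal map-based sketch. The two interfaces are trimmed copies of the ones in this proposal so that the example compiles on its own. `MapCommandRegistry`, its `register` method and `VersionCommand` are hypothetical names, and returning the list of command names from `execute` is an assumption made only for this example.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

interface Command<A, R>
{
    String description();
    Class<? extends A> argClass();
    R execute(A arg);
}

interface CommandRegistry<A, R> extends Command<A, R>
{
    Command<?, ?> command(String name);
    Iterator<Map.Entry<String, Command<?, ?>>> commands();
}

/** Hypothetical registry that keeps commands sorted by name. */
final class MapCommandRegistry implements CommandRegistry<Void, String>
{
    private final Map<String, Command<?, ?>> commands = new ConcurrentSkipListMap<>();

    /** Registers a command and rejects duplicate names. */
    MapCommandRegistry register(String name, Command<?, ?> command)
    {
        if (commands.putIfAbsent(name, command) != null)
            throw new IllegalArgumentException("Command already registered: " + name);
        return this;
    }

    public Command<?, ?> command(String name) { return commands.get(name); }

    public Iterator<Map.Entry<String, Command<?, ?>>> commands() { return commands.entrySet().iterator(); }

    public String description() { return "Root registry of management commands"; }

    public Class<? extends Void> argClass() { return Void.class; }

    /** Assumption for this sketch: executing the registry itself lists the registered command names. */
    public String execute(Void arg) { return String.join(", ", commands.keySet()); }
}

/** Trivial sample command used only to populate the registry. */
final class VersionCommand implements Command<Void, String>
{
    public String description() { return "Prints a placeholder version string"; }
    public Class<? extends Void> argClass() { return Void.class; }
    public String execute(Void arg) { return "sketch-1.0"; }
}

final class RegistryDemo
{
    public static void main(String[] args)
    {
        MapCommandRegistry registry = new MapCommandRegistry().register("version", new VersionCommand());
        Iterator<Map.Entry<String, Command<?, ?>>> it = registry.commands();
        while (it.hasNext())
        {
            Map.Entry<String, Command<?, ?>> entry = it.next();
            System.out.println(entry.getKey() + " - " + entry.getValue().description());
        }
    }
}
```

Because the registry is itself a `Command`, a registry could in principle be registered inside another registry to model subcommands.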

Command API

The nodetool CLI uses the Airline annotation-based framework [6] to parse input arguments and execute management commands from the command line. These annotations already contain all the necessary command metadata, including input arguments, their descriptions, and general command details. A significant limitation, however, is that the metadata is embedded in the CLI tool itself, making it inaccessible on the C* server node. As a result, the metadata can't be shared with other API consumers involved in management operations. Direct migration of all CLI commands to the CommandRegistry is not feasible either. This is not only because the Airline library is obsolete, but also because it lacks the abstractions needed to reflect the available commands transparently and consistently in the CLI, JMX, and REST API (represented by the k8ssandra management API project). Such a reflection therefore requires a narrower approach to ensure consistency and compatibility across the different management interfaces.

Therefore, the Command API might look like this:

Code Block
languagejava
titleCommand<A, R>
linenumberstrue
collapsetrue
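import java.util.function.Consumer;

public interface Command<A, R>
{
    /** Human-readable description of the command. */
    public String description();

    /** Class that describes the command's input arguments. */
    public Class<? extends A> argClass();

    /** Executes the command with the given arguments and returns its result. */
    public R execute(A arg);

    /** Custom output required to preserve backwards compatibility with the nodetool output. */
    public default void printResult(A arg, R res, Consumer<String> printer) {}
}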


Code Block
languagejava
titleCompactCommand
linenumberstrue
collapsetrue
public class CompactCommand implements Command<CompactCommandArg, Response>
{
    @Override
    public String description()
    {
        return "Force a (major) compaction on one or more tables or user-defined compaction on given SSTables";
    }

    @Override
    public Class<? extends CompactCommandArg> argClass()
    {
        return CompactCommandArg.class;
    }

    // The rest of the class (including execute) is omitted.
}


Code Block
languagejava
titleCompactCommandArg
linenumberstrue
collapsetrue
import java.io.Serializable;
import java.util.List;

@ArgumentGroup(value = {"userDefined", "startToken", "partitionKey"}, optional = true, oneOf = true)
public class CompactCommandArg implements Serializable
{
  @Argument(aliases = {"s", "split-output"}, description = "Use -s to not create a single big file", optional = true)
  public final boolean splitOutput;

  @Argument(aliases = {"user-defined"}, description = "Use --user-defined to submit listed files for user-defined compaction")
  public final boolean userDefined;

  @Argument(aliases = {"st", "start-token"}, description = "Use -st to specify a token at which the compaction range starts (inclusive)", optional = true)
  public final String startToken;

  @Argument(aliases = {"et", "end-token"}, description = "Use -et to specify a token at which compaction range ends (inclusive)", optional = true)
  public final String endToken;

  @Argument(aliases = {"partition", "partition_key"}, description = "String representation of the partition key", optional = true)
  public final String partitionKey;

  @Argument(aliases = {"keyspace"}, description = "The keyspace followed by one or many tables or list of SSTable data files when using --user-defined")
  public final String keyspaceName;

  @Argument(description = "The table names to compact")
  public final List<String> tables;

  // The rest of the class is omitted.
}
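
The value of keeping the metadata in annotations on the argument class is that any adapter on the server can read it by reflection. The sketch below shows this under stated assumptions: the `@Argument` annotation is declared locally with the attributes used above (its real definition is not part of this page), and `SampleArg` and `ArgumentMetadata` are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

/** Assumed shape of the @Argument annotation; it must be retained at runtime to be visible to reflection. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Argument
{
    String[] aliases() default {};
    String description();
    boolean optional() default false;
}

/** Small stand-in for an argument class such as CompactCommandArg. */
final class SampleArg
{
    @Argument(aliases = {"keyspace"}, description = "The keyspace to compact")
    String keyspaceName;

    @Argument(aliases = {"s", "split-output"}, description = "Do not create a single big file", optional = true)
    boolean splitOutput;
}

final class ArgumentMetadata
{
    /** Returns field name -> description for every annotated field, sorted by field name. */
    static Map<String, String> describe(Class<?> argClass)
    {
        Map<String, String> result = new TreeMap<>();
        for (Field field : argClass.getDeclaredFields())
        {
            Argument argument = field.getAnnotation(Argument.class);
            if (argument == null)
                continue; // fields without the annotation are not command arguments
            String prefix = argument.optional() ? "[optional] " : "";
            result.put(field.getName(), prefix + argument.description());
        }
        return result;
    }
}

final class ArgumentMetadataDemo
{
    public static void main(String[] args)
    {
        ArgumentMetadata.describe(SampleArg.class)
                        .forEach((name, text) -> System.out.println(name + ": " + text));
    }
}
```

The same lookup could feed the CLI help output, the JMX operation descriptions and a virtual table, so the description is written once.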

Command Registry Adapters

Once the CommandRegistry is available, some out-of-the-box adapters must be developed to achieve the goals of the proposal and make the commands available through JMX, CQL and, indirectly, a REST API:

  1. CQL Command Adapter - The CQL invoker validates the given arguments based on the command metadata from CommandRegistry and invokes the corresponding command;
  2. Dynamic Command MBean Adapter - New dynamic JMX MBeans for management operations are generated from the available command metadata and exposed through the public API for use by the nodetool. The JMX MBeans that are now statically registered are still supported. However, they are deprecated in favour of the new ones;
  3. Open API Adapter - an adapter that provides dynamically generated RESTful API endpoints based on the CommandRegistry metadata, as well as an `openapi.json` specification.
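
The Dynamic Command MBean Adapter can be sketched with the standard `javax.management.DynamicMBean` interface, which lets an MBean describe its operations at runtime instead of through a static interface. The example below is only an illustration: `CommandMBean`, the single string argument, the `echo` command and the object name `org.example.sketch:type=Commands` are assumptions, and a real adapter would build the operation signatures from the command metadata.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import javax.management.Attribute;
import javax.management.AttributeList;
import javax.management.AttributeNotFoundException;
import javax.management.DynamicMBean;
import javax.management.MBeanInfo;
import javax.management.MBeanOperationInfo;
import javax.management.MBeanParameterInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical adapter: every registered command becomes one JMX operation taking a single String. */
final class CommandMBean implements DynamicMBean
{
    private final Map<String, Function<String, String>> commands = new TreeMap<>();

    CommandMBean register(String name, Function<String, String> command)
    {
        commands.put(name, command);
        return this;
    }

    /** Builds the MBean description from whatever commands are registered at the moment. */
    @Override
    public MBeanInfo getMBeanInfo()
    {
        MBeanParameterInfo[] signature = { new MBeanParameterInfo("argument", String.class.getName(), "Command argument") };
        MBeanOperationInfo[] operations = commands.keySet().stream()
            .map(name -> new MBeanOperationInfo(name, "Generated from the registry", signature,
                                                String.class.getName(), MBeanOperationInfo.ACTION))
            .toArray(MBeanOperationInfo[]::new);
        return new MBeanInfo(getClass().getName(), "Dynamically generated management commands",
                             null, null, operations, null);
    }

    /** Routes a JMX invocation to the command with the same name. */
    @Override
    public Object invoke(String actionName, Object[] params, String[] signature)
    {
        Function<String, String> command = commands.get(actionName);
        if (command == null)
            throw new IllegalArgumentException("Unknown command: " + actionName);
        return command.apply((String) params[0]);
    }

    // This sketch exposes operations only, so attribute access is not supported.
    @Override public Object getAttribute(String attribute) throws AttributeNotFoundException { throw new AttributeNotFoundException(attribute); }
    @Override public void setAttribute(Attribute attribute) throws AttributeNotFoundException { throw new AttributeNotFoundException(attribute.getName()); }
    @Override public AttributeList getAttributes(String[] attributes) { return new AttributeList(); }
    @Override public AttributeList setAttributes(AttributeList attributes) { return new AttributeList(); }
}

final class CommandMBeanDemo
{
    public static void main(String[] args) throws Exception
    {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("org.example.sketch:type=Commands");
        server.registerMBean(new CommandMBean().register("echo", arg -> "echo: " + arg), name);
        Object result = server.invoke(name, "echo", new Object[]{ "hello" }, new String[]{ String.class.getName() });
        System.out.println(result);
        server.unregisterMBean(name);
    }
}
```

A JMX client such as the nodetool could read `getMBeanInfo()` to discover the available operations without compiling against a static MBean interface.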

Commands Virtual Table

All the commands and command definitions are available by querying a corresponding new virtual table based on metadata provided by the CommandRegistry.

Management Diagram

[Diagram: CQL Management]

CQL Command Syntax

The commands below become valid. While many commands in Apache Cassandra are typically run on keyspaces and tables, and adopting a CQL syntax like `EXECUTE COMMAND rebuild ON keyspace.table` seems logical, it's more practical for the initial implementation to concentrate on accepting command arguments as straightforward key-value pairs or as a JSON string. This approach simplifies the early stages of development. Subsequently, the more intuitive CQL syntax can be introduced as an alias for these commands, enhancing usability and aligning with familiar patterns.

Execute Command

Basic Syntax

Code Block
languagesql
EXECUTE COMMAND forcecompact
    WITH keyspace=distributed_test_keyspace
    AND table=tbl
    AND keys=["k4", "k2", "k7"];


Code Block
languagesql
EXECUTE COMMAND setconcurrentcompactors 
    WITH concurrent_compactors=5;
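
To illustrate how key-value arguments could be checked before a command is invoked, the sketch below splits the text that follows `WITH` into pairs and compares the names against the required and optional argument names of a command. It is a simplified illustration, not the proposed parser: `KeyValueArguments`, `parse` and `problems` are hypothetical names, values are kept as raw text, and a real implementation would rely on the CQL grammar rather than string splitting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class KeyValueArguments
{
    /** Parses "k1=v1 AND k2=v2" into an ordered map; values are kept as raw, unvalidated text. */
    static Map<String, String> parse(String clause)
    {
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : clause.trim().split("(?i)\\s+AND\\s+"))
        {
            int eq = pair.indexOf('=');
            if (eq <= 0)
                throw new IllegalArgumentException("Expected key=value but got: " + pair);
            result.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return result;
    }

    /** Lists missing required arguments and arguments the command does not declare. */
    static List<String> problems(Map<String, String> args, Set<String> required, Set<String> optional)
    {
        List<String> problems = new ArrayList<>();
        for (String name : required)
            if (!args.containsKey(name))
                problems.add("missing required argument: " + name);
        for (String name : args.keySet())
            if (!required.contains(name) && !optional.contains(name))
                problems.add("unknown argument: " + name);
        return problems;
    }
}

final class KeyValueArgumentsDemo
{
    public static void main(String[] args)
    {
        Map<String, String> parsed = KeyValueArguments.parse(
            "keyspace=distributed_test_keyspace AND table=tbl AND keys=[\"k4\", \"k2\", \"k7\"]");
        // Which arguments are required is an assumption made for this demo.
        Set<String> required = Collections.singleton("keyspace");
        Set<String> optional = new HashSet<>(Arrays.asList("table", "keys"));
        System.out.println(parsed);
        System.out.println(KeyValueArguments.problems(parsed, required, optional));
    }
}
```

In the proposal the required and optional names would come from the command metadata in the CommandRegistry rather than from hard-coded sets.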

...

  • New dynamic JMX MBeans must be created to expose the available commands to the public API in a way that matches the corresponding CQL queries. The APIs currently provided by the static MBeans and by the CLI are too far apart;
  • The nodetool uses a newly created dynamic MBean to achieve both API alignment and API backward compatibility goals;
  • The nodetool parses the input arguments based on the command metadata it receives from the CommandRegistry;

Minimum Viable Product (MVP)

Although the scope has been described quite broadly, the minimum viable product includes the following changes:

  • The CommandRegistry is available;
  • Some commands from the nodetool are migrated to the new CommandRegistry;
  • New dynamic MBeans representing the registered commands are available;
  • Added support for a new CQL syntax that accepts a simple set of K-V parameters required for command execution;
  • Added a new command invoker to the nodetool, which is aware of new commands via the CommandRegistry.

...