Status

Current state: Draft

Discussion thread: https://lists.apache.org/thread/z83zzp83lxfx11b7mf45vtdzs7dtryzb 

JIRA: -

Released: -

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).



Scope

TODO

Goals

  1. A reconfigurable authenticator, authorizer, and role manager
  2. Configuration of multiple authentication mechanisms. Let the administrator define through CQL which are enabled globally and per user account.
  3. Select the authentication mechanism by negotiating it with the driver, using a negotiation protocol compatible with the existing drivers.
  4. Configuration of multiple role managers. Let the administrator define through CQL how they are selected.
  5. Proxy authorization, which lets one role execute commands on behalf of another role
  6. LDAP authenticator and role manager
  7. Kerberos authentication
  8. Token-based authentication

The described changes do not aim to improve in-flight data encryption.

A shared security configuration across all the nodes is an interesting idea that may leverage improvements brought by Transactional Cluster Metadata (CEP-21). However, it is out of scope of this CEP and can be considered later if such a distributed configuration mechanism is introduced into Cassandra.

Approach

TODO

Timeline

TODO

Mailing list / Slack channels

Mailing list: 

Slack channel: 

Discussion threads: 

Related JIRA tickets

JIRA(s): 


Motivation

The Apache Cassandra database has a mechanism to authenticate and authorize database users.  It is often described as internal auth because credentials, roles, and permissions are all managed within the database.  For authentication, it can use database usernames/roles and passwords.  For authorization, administrators can grant or revoke permissions.  For reference, see the project documentation on CQL security and the operational documentation on configuring authentication and authorization.

In an enterprise environment, it is preferable, if not required, to connect isolated resources such as a database to a centralized provider for authentication and authorization.  Two widely adopted industry standards for authentication and authorization are:

  • Kerberos for authentication - proving who you are with a variety of options.  This allows enterprise security teams and operators to have a central location to validate identity within their organization.
  • LDAP (Lightweight Directory Access Protocol) - validating identity and authority.  LDAP can classify users as part of groups or roles to give authorization to specific resources.  Enterprises often employ LDAP services to manage users and roles centrally.

The goal of this CEP is to contribute mature integrations with both Kerberos and LDAP providers to the Apache Cassandra project.  It includes facilities for transitioning to these external providers as well as general extension points for additional integrations.  The code has been in use in production by dozens of enterprises for several years, getting incremental improvements and refinements along the way.

Audience

Enterprises, security teams, and operators in the Apache Cassandra community.

This CEP gives enterprises, enterprise security teams, and operators a simple, well-tested facility to extend authentication and authorization beyond internal auth and includes integrations with Kerberos and LDAP providers. The Apache Cassandra community will benefit from mature implementations of these integrations.

Proposed Changes

We want to provide a set of authentication, authorization, and role management improvements. In particular, we propose an authenticator that selects one of its sub-authenticators according to the negotiation result. The authenticator can also reconfigure its sub-authenticators by creating a new instance and calling its setup method with the new configuration.

The negotiation is already implemented in the DataStax driver. The driver sends an additional preamble along with the initial SASL authentication message. The preamble includes an identifier that the server uses to select the authenticator. The server reads and strips the preamble and passes the cleaned message to the selected SASL authenticator. The authenticator then proceeds with the usual authentication message exchange.
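The strip-and-dispatch step described above can be sketched as follows. The wire layout used here (a marker, a one-byte length, then the mechanism name) is purely hypothetical and chosen for illustration; this document does not specify the actual preamble format used by the drivers, and all names in the sketch are assumptions.

```python
from typing import Optional, Tuple

# Hypothetical marker introducing the preamble; NOT the real wire format.
PREAMBLE_MAGIC = b"\xffNEG"


def split_preamble(message: bytes) -> Tuple[Optional[str], bytes]:
    """Return (mechanism id, remaining SASL payload).

    The mechanism id is None when the message carries no preamble,
    in which case the payload is returned untouched.
    """
    if not message.startswith(PREAMBLE_MAGIC):
        return None, message
    offset = len(PREAMBLE_MAGIC)
    if len(message) < offset + 1:
        raise ValueError("truncated preamble: missing length byte")
    name_len = message[offset]
    start, end = offset + 1, offset + 1 + name_len
    if len(message) < end:
        raise ValueError("truncated preamble: mechanism name cut short")
    mechanism = message[start:end].decode("ascii")
    return mechanism, message[end:]


def with_preamble(mechanism: str, payload: bytes) -> bytes:
    """Client-side counterpart: prepend the preamble to the first SASL message."""
    name = mechanism.encode("ascii")
    if len(name) > 255:
        raise ValueError("mechanism name too long for a one-byte length")
    return PREAMBLE_MAGIC + bytes([len(name)]) + name + payload
```

A message without the marker passes through unchanged, which is what allows a server to keep serving clients that do not negotiate.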

A role manager is used after a successful authentication to get the user's roles. We would also like to provide a role manager that delegates to one of several sub-managers depending on the selected authenticator. For example, it could be configured such that users authenticated with internal authentication are authorized using the internal role manager, and LDAP is used otherwise.
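A minimal sketch of that selection, assuming the role manager is told which authenticator admitted the user. The class and method names are hypothetical and do not correspond to Cassandra's actual role manager interface.

```python
from typing import Dict, Protocol, Set


class RoleManager(Protocol):
    def roles_for(self, user: str) -> Set[str]: ...


class StaticRoleManager:
    """Stand-in for a real role manager (internal or LDAP); hypothetical."""

    def __init__(self, roles_by_user: Dict[str, Set[str]]) -> None:
        self._roles_by_user = roles_by_user

    def roles_for(self, user: str) -> Set[str]:
        return set(self._roles_by_user.get(user, set()))


class DelegatingRoleManager:
    """Picks a sub-manager based on the authenticator that admitted the user."""

    def __init__(self, by_authenticator: Dict[str, RoleManager],
                 fallback: RoleManager) -> None:
        self._by_authenticator = by_authenticator
        self._fallback = fallback

    def roles_for(self, user: str, authenticator: str) -> Set[str]:
        # Use the manager mapped to this authenticator, else the fallback.
        manager = self._by_authenticator.get(authenticator, self._fallback)
        return manager.roles_for(user)
```

Mapping only `"internal"` to the internal manager and passing an LDAP-backed manager as `fallback` reproduces the example configuration: internally authenticated users get internal roles, everyone else gets LDAP roles.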

The negotiating authenticator instantiates and configures the managed sub-authenticators. We want to make it possible to add, remove, or change the configuration of those sub-authenticators while keeping the server online. Reconfiguration replaces the existing instance with a new one, on the assumption that the new authenticator instance loads its settings from the (possibly modified) configuration. A similar approach is applied to the role manager. Authenticators and role managers can be added or removed only by enabling or disabling components that are already included in the static configuration; there is no support for dynamically adding a new authenticator or role manager. The administrator may, however, enable or disable authenticators per user account by granting or revoking dedicated permissions.
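The replace-rather-than-mutate approach can be sketched as below. This is an illustration under stated assumptions, not the proposed implementation: `SubAuthenticator`, `setup`, and `ReconfigurableAuthenticator` are hypothetical names, and the set of factories stands in for the components listed in the static configuration.

```python
import threading
from typing import Callable, Dict, Mapping


class SubAuthenticator:
    """Hypothetical sub-authenticator that receives its settings via setup()."""

    def __init__(self) -> None:
        self.settings: Dict[str, str] = {}

    def setup(self, settings: Mapping[str, str]) -> None:
        self.settings = dict(settings)


class ReconfigurableAuthenticator:
    def __init__(self, factories: Mapping[str, Callable[[], SubAuthenticator]]) -> None:
        # Only statically known components can ever be enabled.
        self._factories = dict(factories)
        self._active: Dict[str, SubAuthenticator] = {}
        self._lock = threading.Lock()

    def reconfigure(self, config: Mapping[str, Mapping[str, str]]) -> None:
        """Build fresh instances from config, then swap them in as one step."""
        unknown = set(config) - set(self._factories)
        if unknown:
            raise ValueError(f"not in the static configuration: {sorted(unknown)}")
        fresh: Dict[str, SubAuthenticator] = {}
        for name, settings in config.items():
            instance = self._factories[name]()
            instance.setup(settings)
            fresh[name] = instance
        with self._lock:
            self._active = fresh  # old instances are dropped, never mutated

    def get(self, name: str) -> SubAuthenticator:
        with self._lock:
            return self._active[name]
```

Because the old instances are never modified, an authentication exchange that already holds a reference to one keeps seeing a consistent configuration until it finishes.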

Proxy authorization is a mechanism whereby role A can grant a special permission to another role B, so that B, after successful authentication, is authorized as A and has all of A's permissions. We implement a new permission for this purpose.
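The check implied by this mechanism can be sketched as follows. The in-memory model and all names are hypothetical; the name and CQL syntax of the new permission are not defined in this document.

```python
from typing import Dict, Optional, Set


class ProxyGrants:
    """Tracks which roles may act on behalf of which other roles (hypothetical model)."""

    def __init__(self) -> None:
        # target role -> roles allowed to be authorized as that target
        self._proxies_for: Dict[str, Set[str]] = {}

    def grant(self, target: str, proxy: str) -> None:
        self._proxies_for.setdefault(target, set()).add(proxy)

    def revoke(self, target: str, proxy: str) -> None:
        self._proxies_for.get(target, set()).discard(proxy)

    def effective_role(self, authenticated: str,
                       requested: Optional[str] = None) -> str:
        """Return the role the session is authorized as."""
        if requested is None or requested == authenticated:
            return authenticated
        if authenticated in self._proxies_for.get(requested, set()):
            return requested
        raise PermissionError(
            f"{authenticated!r} may not act on behalf of {requested!r}")
```

Note that the grant is directional: allowing B to act as A does not allow A to act as B.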

The LDAP role manager and authenticator are based on the implementation offered by DataStax in DSE and use the Apache Directory API under the hood. The implementation supports:

  • hierarchical discovery of roles
  • multiple search bases
  • directory and member-of searches
  • authentication
  • secure TLS connection
  • HA through configured fallback LDAP servers
  • automatic HA configuration using DNS
  • hostname verification
  • caching
  • connection pooling for performance

For LDAP-based authentication, the negotiation resembles that for internal authentication. 

Kerberos authentication uses the GSS API and performs client authentication using SASL negotiation. Similar to LDAP, it is based on the implementation offered by DataStax in DSE. 

Token-based access is used to authenticate third-party processes, such as Spark executors, where the authentication information needs to be distributed across those processes. Users usually try to avoid sharing their primary credentials, such as usernames and passwords. In such scenarios, the user authenticates with their credentials in the client application, which then generates, or asks the server to generate, a time-limited authentication token. That token is then shared with the remote processes so that they can connect to Cassandra on behalf of the user.

A token is intended to be used for a single application task. It should be removed when that task is completed.

For security reasons, a token has two time attributes: the renewal time and the expiration time. The renewal time is the latest moment by which a token has to be renewed to avoid permanent invalidation. Renewals can go on until the token's expiration time elapses. Beyond that time, renewals are no longer possible; a new token must be created.

The renewal is performed by the application. This ensures that the token gets removed if the application stops abruptly without invalidating the token on exit, because the token is then no longer renewed. Similarly, the application needs to track the token's expiration time and create a new token before the current one expires. Recreation is a security requirement: a system whose access credentials change frequently is more challenging to compromise.
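The interplay between the renewal deadline and the expiration time can be illustrated with a small model. The field names and the fixed `renew_interval` are assumptions made for this sketch; the document does not define how far each renewal extends the deadline.

```python
from dataclasses import dataclass


@dataclass
class TokenLifetime:
    """Times are seconds on any consistent clock; field names are illustrative."""
    renew_by: float        # token is invalidated unless renewed before this time
    expires_at: float      # hard limit; no renewal is possible afterwards
    renew_interval: float  # assumed extension granted by each renewal

    def is_valid(self, now: float) -> bool:
        return now < self.renew_by and now < self.expires_at

    def renew(self, now: float) -> None:
        if not self.is_valid(now):
            raise ValueError("token is invalid; a new token must be created")
        # Push the renewal deadline forward, but never past the expiration time.
        self.renew_by = min(now + self.renew_interval, self.expires_at)
```

An application that crashes simply stops calling `renew`, so the token lapses at `renew_by` without any explicit cleanup.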

Token-based authentication requires storing additional information in a dedicated table. A token entry stores information about the owner, renewer, authorized user, issue date, expiration date, and a securely generated random sequence. A user with the required permissions can create, invalidate, renew, and get information about a token using CQL.
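As an illustration of the fields listed above, a token entry could be modelled as follows. The field names, the 32-byte secret length, and the `issue_token` helper are assumptions for this sketch; the actual table schema and CQL statements are not defined in this document.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class TokenEntry:
    owner: str
    renewer: str
    authorized_user: str
    issue_date: datetime
    expiration_date: datetime
    secret: bytes  # securely generated random sequence


def issue_token(owner: str, renewer: str, authorized_user: str,
                lifetime: timedelta,
                now: Optional[datetime] = None) -> TokenEntry:
    """Create an entry with a fresh secret from the OS's secure random source."""
    issued = now if now is not None else datetime.now(timezone.utc)
    return TokenEntry(
        owner=owner,
        renewer=renewer,
        authorized_user=authorized_user,
        issue_date=issued,
        expiration_date=issued + lifetime,
        secret=secrets.token_bytes(32),  # assumed length, for illustration
    )
```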

The implementation of token authentication uses the SASL DIGEST-MD5 mechanism, as used by Spark and Hadoop. However, token authentication is more like a framework, and other authentication schemes can be based on it as well.

All authentication mechanisms have client-side implementations in the DataStax Cassandra Drivers being donated as part of CEP-22.


New or Changed Public Interfaces

TODO

Briefly list any new interfaces that will be introduced as part of this proposal or any existing interfaces that will be removed or changed. The purpose of this section is to concisely call out the public contract that will come along with this feature.

A public interface is any change to the following:

  • native protocol (and CQL)
  • gossip and the messaging service
  • pluggable components (SPIs) like authorisation, triggers, ..?
  • commitlog, hintlog, cache files
  • sstables components 
  • configuration
  • jmx mbeans (including metrics)
  • monitoring
  • client tool classes
  • command line tools and arguments
  • operational routines
  • Anything else that will likely break existing users in some way when they upgrade

Compatibility, Deprecation, and Migration Plan

The legacy authentication and role management must be supported so that existing applications can continue to work. To this end, for both authentication and role management, the administrator can define the default mechanism, which will be used if no negotiation is performed.  If the default mechanism is the previous authentication scheme, the upgrade will be seamless.  New mechanisms should not be used until the updated configuration has been enabled on all nodes in the cluster.
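The fallback rule can be sketched as below, assuming the server represents "no negotiation performed" as `None`. The function and parameter names are hypothetical.

```python
from typing import Iterable, Optional


def select_mechanism(negotiated: Optional[str],
                     enabled: Iterable[str],
                     default: str) -> str:
    """Pick the mechanism for a connection.

    Legacy clients negotiate nothing, so they get the configured default.
    """
    if negotiated is None:
        return default
    if negotiated not in set(enabled):
        raise ValueError(f"mechanism {negotiated!r} is not enabled")
    return negotiated
```

With `default` set to the previous authentication scheme, a client that sends no preamble behaves exactly as it did before the upgrade.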

Test Plan

TODO

Describe in a few sentences how the CEP will be tested. We are mostly interested in system tests (since unit tests are specific to implementation details). How will we know that the implementation works as expected? How will we know nothing broke?

Rejected Alternatives

TODO

If there are alternative ways of accomplishing the same thing, what were they? The purpose of this section is to motivate why the design is the way it is and not some other way.