...

Status

Current state: Done

Discussion thread: here

JIRA: CASSANDRA-17146

Released: 4.1-alpha1

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).


...

Scope

The scope of this CEP is to design a framework for easily adding system-wide soft and hard limits.

Goals

  • Provide an easy way to enforce system-wide soft and hard limits that prevent usage anti-patterns and, in the long run, make it impossible to severely degrade the performance of a node or cluster through user actions (too many MVs/secondary indexes per table, ...), thus increasing stability and availability.
  • As a C* developer it should be easy to add new Guardrails.
  • Guardrails are disabled by default and there should be no overhead when Guardrails are disabled.
  • Guardrails are configured in cassandra.yaml and can also be dynamically modified through JMX and/or virtual tables.
  • Guardrails issue warnings/failures to the server log file, and also to the client connection when applicable.
  • Guardrails keep working during rolling upgrades, when some nodes do not yet have the latest guardrails.

Non-Goals

  • Enforcing limits on a per-user basis.

Timeline

todo

Mailing list / Slack channels

...

  • Guardrail: Interface defining a guardrail that guards against a particular usage/condition.
  • DefaultGuardrail: Abstract class implementing Guardrail. It implements the default behaviour when the guardrail is triggered, which consists of raising warnings or errors.
  • GuardrailsFactory: Interface defining a factory for building instances of Guardrail.
  • DefaultGuardrailsFactory: Class implementing GuardrailsFactory. It builds instances of DefaultGuardrail.
  • CustomGuardrailsFactory: Abstract class instantiating a custom GuardrailsFactory, so users can provide their own implementations of guardrails through a system property named cassandra.custom_guardrails_factory_class.
  • GuardrailsConfig: Configuration settings for Guardrails, which are populated from cassandra.yaml. This contains a main setting, enabled, controlling whether Guardrails are globally active, and individual settings to control each Guardrail.
  • cassandra.yaml: Allows configuring individual guardrails at startup. Guardrails are globally disabled by default. These guardrails will also be dynamically configurable through JMX and/or virtual tables.
  • Guardrails: Entry point for guardrails, storing all the defined guardrail instances and additional helper methods. These Guardrail instances are built at startup with the provided GuardrailsFactory and GuardrailsConfig.
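
The way these components fit together can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the Threshold class, the method names (guard, warn, fail, threshold), the use of LongSupplier, and the example values 100 and 150 are all hypothetical; only the type names Guardrail, DefaultGuardrail, GuardrailsFactory, DefaultGuardrailsFactory, GuardrailsConfig and Guardrails come from the list above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Guards against a particular usage or condition. */
interface Guardrail {
    boolean enabled();
}

/** Settings that would be populated from cassandra.yaml; -1 disables a threshold. */
class GuardrailsConfig {
    boolean enabled = false;          // guardrails are globally disabled by default
    long tablesWarnThreshold = -1;
    long tablesFailureThreshold = -1;
}

/** Default behaviour when a guardrail triggers: record a warning or throw an error. */
abstract class DefaultGuardrail implements Guardrail {
    final String name;
    final GuardrailsConfig config;
    final List<String> warnings = new ArrayList<>(); // stand-in for log/client warnings

    DefaultGuardrail(String name, GuardrailsConfig config) {
        this.name = name;
        this.config = config;
    }

    @Override
    public boolean enabled() {
        return config.enabled;
    }

    void warn(String message) {
        warnings.add("Guardrail " + name + " warned: " + message);
    }

    void fail(String message) {
        throw new IllegalStateException("Guardrail " + name + " violated: " + message);
    }
}

/** Hypothetical guardrail with a soft (warn) and a hard (failure) limit. */
class Threshold extends DefaultGuardrail {
    private final LongSupplier warnAt;
    private final LongSupplier failAt;

    Threshold(String name, GuardrailsConfig config, LongSupplier warnAt, LongSupplier failAt) {
        super(name, config);
        this.warnAt = warnAt;
        this.failAt = failAt;
    }

    void guard(long value) {
        if (!enabled()) return;               // nothing to do when globally disabled
        long fail = failAt.getAsLong();
        long warn = warnAt.getAsLong();
        if (fail >= 0 && value > fail) fail(value + " exceeds failure threshold " + fail);
        else if (warn >= 0 && value > warn) warn(value + " exceeds warn threshold " + warn);
    }
}

/** Factory for building guardrail instances. */
interface GuardrailsFactory {
    Threshold threshold(String name, GuardrailsConfig config, LongSupplier warnAt, LongSupplier failAt);
}

class DefaultGuardrailsFactory implements GuardrailsFactory {
    @Override
    public Threshold threshold(String name, GuardrailsConfig config, LongSupplier warnAt, LongSupplier failAt) {
        return new Threshold(name, config, warnAt, failAt);
    }
}

/** Entry point holding the guardrail instances built at startup. */
class Guardrails {
    final Threshold tables;

    Guardrails(GuardrailsFactory factory, GuardrailsConfig config) {
        // Suppliers read the config on every check, so later changes are picked up.
        tables = factory.threshold("tables", config,
                                   () -> config.tablesWarnThreshold,
                                   () -> config.tablesFailureThreshold);
    }

    public static void main(String[] args) {
        GuardrailsConfig config = new GuardrailsConfig();
        config.enabled = true;
        config.tablesWarnThreshold = 100;     // illustrative values only
        config.tablesFailureThreshold = 150;
        Guardrails guardrails = new Guardrails(new DefaultGuardrailsFactory(), config);
        guardrails.tables.guard(120);         // above the soft limit, below the hard one
        System.out.println(guardrails.tables.warnings);
    }
}
```

Reading the thresholds through suppliers rather than copying them at construction time is one way to keep the guardrail instances valid when the configuration is later changed at runtime.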

...

cassandra.yaml settings:
# guardrails:
#  # enabled: false
#  # column_value_size_failure_threshold_in_kb: -1
#  # columns_per_table_failure_threshold: -1
#  # secondary_index_per_table_failure_threshold: -1
#  # materialized_view_per_table_failure_threshold: -1
#  # tables_warn_threshold: -1
#  # tables_failure_threshold: -1
#  # table_properties_disallowed: 
#  # write_consistency_levels_disallowed:
#  # partition_size_warn_threshold_in_mb: -1
#  # partition_keys_in_select_failure_threshold: -1
#  # disk_usage_percentage_warn_threshold: -1
#  # disk_usage_percentage_failure_threshold: -1
#  # in_select_cartesian_product_failure_threshold: -1
#  # user_timestamps_enabled: true
#  # read_before_write_list_operations_enabled: true
#  # fields_per_udt_failure_threshold: -1
#  # collection_size_warn_threshold_in_kb: -1
#  # items_per_collection_warn_threshold: -1

It will also be possible to dynamically configure guardrails at runtime through JMX and/or virtual tables.
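
As a rough sketch of what runtime reconfiguration could involve, the holder below keeps a warn/fail threshold pair that can be replaced while the node is running. Everything here is an assumption made for illustration: the class name DynamicThresholds, its methods, and the validation rules (a value is either -1 or non-negative, and the warn threshold must not exceed the failure threshold) are hypothetical. A JMX MBean or a virtual table would presumably delegate to setters of this kind.

```java
/** Hypothetical holder for a warn/fail threshold pair that can change at runtime. */
class DynamicThresholds {
    // volatile so that request threads observe updates made by a management thread
    private volatile long warnThreshold = -1;   // -1 means disabled, as in cassandra.yaml
    private volatile long failThreshold = -1;

    /** Validates both values before applying either, so a bad update changes nothing. */
    synchronized void setThresholds(long warn, long fail) {
        validate(warn, "warn");
        validate(fail, "fail");
        if (warn != -1 && fail != -1 && warn > fail) {
            throw new IllegalArgumentException(
                "warn threshold " + warn + " must not exceed fail threshold " + fail);
        }
        this.warnThreshold = warn;
        this.failThreshold = fail;
    }

    private static void validate(long value, String name) {
        if (value < -1) {
            throw new IllegalArgumentException(
                name + " threshold must be -1 (disabled) or non-negative, got " + value);
        }
    }

    long getWarnThreshold() {
        return warnThreshold;
    }

    long getFailThreshold() {
        return failThreshold;
    }
}
```

Validating before assigning means a rejected update leaves the previous thresholds in place.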

Migrating existing cassandra.yaml warn/fail thresholds

...

  • tombstone_warn_threshold
  • tombstone_failure_threshold
  • batch_size_warn_threshold_in_kb
  • batch_size_fail_threshold_in_kb
  • unlogged_batch_across_partitions_warn_threshold
  • compaction_large_partition_warning_threshold_mb
  • coordinator_large_read
  • local_read_size
  • row_index_size

Distinction from Capability Restrictions

Guardrails allow C* operators to impose system-wide restrictions that are configured through yaml and JMX/virtual tables. Capability restrictions focus on imposing restrictions on particular users and offer a new CQL API to do so. The two concepts are complementary rather than mutually exclusive, and capability restrictions can be developed either independently or as an extension of the proposed system-wide guardrails.

Event logging

In their initial form, Guardrails would issue warnings/failures to the server log file, and also to the client connection. Some guardrails can be used in processes that are not linked to a client connection, such as compaction. In that case the warnings/failures would be issued only to the server log file. Propagating warnings and failures to clients might require some changes in drivers so that they can display detailed messages.
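
The routing described above could look roughly like the following sketch. The class GuardrailNotifier and its methods are hypothetical, the in-memory list stands in for the server log file, and the choice to signal a client-facing failure by throwing an exception is an assumption rather than something this CEP specifies.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical router for guardrail events: always log, notify the client if there is one. */
class GuardrailNotifier {
    final List<String> serverLog = new ArrayList<>(); // stand-in for the server log file

    /**
     * @param clientWarnings warnings to return with the client response, or null when
     *                       the guardrail runs in a background process such as compaction
     */
    void warn(String message, List<String> clientWarnings) {
        serverLog.add("WARN " + message);
        if (clientWarnings != null) {
            clientWarnings.add(message);
        }
    }

    /** Logs the failure; with a client attached, also aborts the request (assumption). */
    void fail(String message, List<String> clientWarnings) {
        serverLog.add("ERROR " + message);
        if (clientWarnings != null) {
            throw new IllegalStateException(message);
        }
    }
}
```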

Most guardrails will be triggered on the coordinator node, but some will be triggered on replicas. For those guardrails, internal messaging will be needed so that clients can still be notified, similar to what was done in CASSANDRA-16850/CASSANDRA-16896.

It would make sense to also emit such guardrail-triggering events as Diagnostic Events to help troubleshoot these issues. Emitting diagnostic events is an idea for the future and is not part of this CEP.

...