Status

Current state: "Under Construction"

Discussion thread: here [Change the link from the KIP proposal email archive to your own email thread]

JIRA: here [Change the link from KAFKA-1 to your own ticket]

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Apache Kafka is rapidly finding its place in data-heavy organizations as a fault-tolerant message bus. One of Kafka's goals is data integration, which makes it important to support many users in one Kafka system. As adoption and the user community grow, multi-tenancy support is increasingly in demand, and several discussions on the Apache Kafka mailing lists have raised it. Namespaces enable functionality that requires logical grouping of topics. If you think of a topic as a SQL table, then a namespace is a SQL database that lets you group tables together. Following are a few use cases.

  • Namespaces will allow users to create topics with the same name as long as they belong to different namespaces.
  • A new entity in a namespace can be bootstrapped with default configs set for that namespace, and each entity can then override parts of that config.
  • Similar to configs, ACLs can be set at the namespace level and are inherited by the underlying entities by default.
  • If Kafka later supports at-rest encryption, having namespaces at the log level will allow encrypting different namespaces with different keys.
  • Namespaces enable namespace-level management.

Public Interfaces

Briefly list any new interfaces that will be introduced as part of this proposal or any existing interfaces that will be removed or changed. The purpose of this section is to concisely call out the public contract that will come along with this feature.

A public interface is any change to the following:

  • Binary log format

    • No changes expected

  • The network protocol and api behavior

    • No changes expected

  • Configuration, especially client configuration

  • Any class in the public packages under clients

    • org/apache/kafka/common/serialization

      • No changes expected

    • org/apache/kafka/common

      • No changes expected

    • org/apache/kafka/common/errors

      • InvalidNamespaceException will be added.

    • org/apache/kafka/clients/producer

      • No changes expected

    • org/apache/kafka/clients/consumer (eventually, once stable)

      • No changes expected

  • Monitoring

  • Command line tools and arguments

    • Add create, list and delete for namespaces to kafka-topics and AdminUtils.

    • An optional "namespace" argument will be added to kafka-topics and kafka-configs. If the namespace argument is not provided, the default namespace "" will be used. This keeps the current behavior of Kafka and the CLI tools intact.

  • Anything else that will likely break existing users in some way when they upgrade
    • None

Proposed Changes

After considering a few approaches, listed in the Rejected Alternatives section, below is what we think is the least obtrusive approach to supporting namespaces in Kafka. We suggest representing namespaces at the storage layer, i.e., in the storage layout of Zookeeper entities and of logs on disk. Internal and public APIs can pass namespaces around as a prefix on topic names. However, we need some way to separate the namespace from the topic. This can be done with a delimiter character that is not allowed in Kafka topic names. Kafka allows a topic name to contain only characters from [a-zA-Z0-9\\._\\-]. We suggest ":" as the delimiting character, but it could be any of the following.

Possible Delimiters

  1. <namespace>:<topic>
  2. <namespace>%<topic>
  3. <namespace>@<topic>
  4. <namespace>|<topic>
  5. <namespace>#<topic>
  6. <namespace>*<topic>
  7. <namespace>~<topic>
  8. <namespace>$<topic>
  9. <namespace>^<topic>
  10. <namespace>&<topic>
  11. <namespace>><topic>
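To make the delimiter approach concrete, here is a minimal sketch of how a qualified name could be split into namespace and topic. It is not the proposal's implementation: the function name split_qualified_name is hypothetical, ":" is assumed as the delimiter (the suggested choice above), and ValueError stands in for the proposed InvalidNamespaceException.

```python
import re

# Assumption: ":" is the delimiter, as suggested in the proposal.
DELIMITER = ":"
DEFAULT_NAMESPACE = ""

# Character classes taken from the proposal text.
TOPIC_PATTERN = re.compile(r"[a-zA-Z0-9._\-]+")
NAMESPACE_PATTERN = re.compile(r"[a-zA-Z0-9.]+")


def split_qualified_name(name):
    """Split '<namespace>:<topic>' into (namespace, topic).

    A name without the delimiter belongs to the default (empty) namespace.
    Hypothetical helper for illustration only.
    """
    namespace, sep, topic = name.partition(DELIMITER)
    if not sep:
        # No delimiter present: the whole string is the topic name.
        namespace, topic = DEFAULT_NAMESPACE, name
    if namespace != DEFAULT_NAMESPACE and not NAMESPACE_PATTERN.fullmatch(namespace):
        raise ValueError(f"invalid namespace: {namespace!r}")
    if not TOPIC_PATTERN.fullmatch(topic):
        raise ValueError(f"invalid topic: {topic!r}")
    return namespace, topic


def is_safe_delimiter(candidate):
    """A delimiter is usable only if it can never appear in a topic name."""
    return TOPIC_PATTERN.fullmatch(candidate) is None
```

Because none of the candidate delimiters listed above falls inside the topic character class, existing topic names parse unchanged and land in the default namespace.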

Multi-tiered Namespaces

A namespace can contain any character in [a-zA-Z0-9\\.]. The "." in a namespace separates tiers. For instance, the namespace "org.apache.kafka" translates to "org/apache/kafka" in storage layouts.
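The tier translation can be sketched as follows. The helper name namespace_to_path is hypothetical, and rejecting empty tiers (leading, trailing, or doubled dots) is an assumption of this sketch; the proposal only states the allowed character class.

```python
import re

# Assumption: each tier is a non-empty run of [a-zA-Z0-9], tiers are joined by ".".
# The proposal itself only gives the character class [a-zA-Z0-9.].
NAMESPACE_PATTERN = re.compile(r"[a-zA-Z0-9]+(\.[a-zA-Z0-9]+)*")


def namespace_to_path(namespace):
    """Translate a dotted namespace into a relative storage path.

    The default (empty) namespace maps to an empty path.
    Hypothetical helper for illustration only.
    """
    if namespace == "":
        return ""
    if not NAMESPACE_PATTERN.fullmatch(namespace):
        raise ValueError(f"invalid namespace: {namespace!r}")
    return "/".join(namespace.split("."))
```

For example, namespace_to_path("org.apache.kafka") returns "org/apache/kafka", matching the translation described above.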

Create

kafka-topics will be modified to support creating namespaces.

kafka-topics.sh --zookeeper <ZK_CONNECTION_STRING> --create --namespace <NAMESPACE>

List

Delete

Changes in Storage Layouts

Following are the changes in layout structures on ZK and disks.

Type | Existing layout                    | Proposed layout
ZK   | /brokers/topics/<topic>            | /brokers/topics/<namespace>/<topic>
ZK   | /configs/<entity_name>/<entity>    | /configs/<namespace>/<entity_name>/<entity>
ZK   | /admin/delete_topics/<topic>       | /admin/delete_topics/<namespace>/<topic>
ZK   | /kafka_acls/<entity_name>/<entity> | /kafka_acls/<namespace>/<entity_name>/<entity>
Disk | /<log_dir>/<topic>_<partition>     | /<log_dir>/<namespace>/<topic>_<partition>

By default, the namespace will be the empty string. All existing entities will belong to the default namespace, and with an empty namespace the proposed storage layouts reduce to the current ones.
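The following sketch shows how the proposed paths could be derived and why an empty namespace leaves the existing layout unchanged. The function names are hypothetical, the path shapes are taken from the layout table above, and the example log directory is made up for illustration.

```python
def _tiers(namespace):
    """Split a dotted namespace into path segments; empty namespace has none."""
    return namespace.split(".") if namespace else []


def zk_topic_path(namespace, topic):
    """Proposed ZK path: /brokers/topics/<namespace>/<topic>."""
    return "/".join(["", "brokers", "topics", *_tiers(namespace), topic])


def log_dir_path(log_dir, namespace, topic, partition):
    """Proposed disk path: /<log_dir>/<namespace>/<topic>_<partition>."""
    leaf = f"{topic}_{partition}"  # naming follows the layout table
    return "/".join([log_dir.rstrip("/"), *_tiers(namespace), leaf])


# Default namespace: paths are identical to the existing layout.
print(zk_topic_path("", "clicks"))
print(log_dir_path("/var/kafka-logs", "", "clicks", 0))

# Multi-tiered namespace: each tier becomes one path segment.
print(zk_topic_path("org.apache", "clicks"))
print(log_dir_path("/var/kafka-logs", "org.apache", "clicks", 0))
```

Skipping the namespace segment entirely when it is empty is what keeps upgrades free of any data movement in this sketch.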

Pros
  1. Will not change existing behavior.
  2. No impact on upgrades.
Cons
  1. The namespace needs to be parsed out of topic names, which requires some demarcating string or character. We suggest making this string configurable at the cluster level, with "::" as the default.
  2. Longer topic names in requests and responses.

Changes in AdminUtils

TODO

Changes in AclCommand

TODO

Compatibility, Deprecation, and Migration Plan

  • What impact (if any) will there be on existing users?
    • If users set the demarcation string to a string that appears in existing topic names, this will lead to unexpected behavior.
  • If we are changing behavior how will we phase out the older behavior?
    • No need to phase out old behavior.
  • If we need special migration tools, describe them here.
    • NA
  • When will we remove the existing behavior?
    • NA
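The compatibility risk above could be detected before a demarcation string is adopted. This is a minimal sketch, assuming the operator can obtain the list of existing topic names; the function name and the sample topic names are hypothetical.

```python
def topics_conflicting_with_delimiter(existing_topics, delimiter):
    """Return existing topic names that contain the demarcation string.

    Any such topic would be misread as '<namespace><delimiter><topic>'.
    Hypothetical pre-flight check for illustration only.
    """
    if not delimiter:
        raise ValueError("delimiter must be a non-empty string")
    return sorted(t for t in existing_topics if delimiter in t)


# Sample topic names, made up for this example.
topics = ["clicks", "page-views", "billing.invoices"]

# "." is legal in topic names, so it is an unsafe demarcation string here.
print(topics_conflicting_with_delimiter(topics, "."))

# "::" cannot appear in a valid topic name, so nothing conflicts.
print(topics_conflicting_with_delimiter(topics, "::"))
```

An empty result means the chosen string is safe for the topics that currently exist.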

Rejected Alternatives

Following are a few alternatives that we considered.

  1. Just prepend the namespace to topic names. Inheritance would be tricky and unintuitive, and this would not enable encrypting namespaces with different keys or namespace-level management.
  2. Manage namespaces separately. Topic names would still collide even if the topics belong to separate namespaces.
  3. Modify request/response formats to carry the namespace explicitly. This removes the need for the demarcation string in the proposed approach, but it is a backwards-incompatible change.
  4. Add the namespace to the session object. This avoids sending the namespace with the topic name in each request and response, but it probably violates separation of concerns.
  5. Make the delimiter character configurable.