...

  • Namespaces will allow users to create topics with the same name, as long as the topics belong to different namespaces.
  • There is no need to set configs for each topic individually.
  • Any new entity in a namespace can be bootstrapped with default configs set for that namespace, and each entity can then override parts of that config.
  • Like configs, ACLs can be set at the namespace level and are inherited by default by the underlying entities.
  • If Kafka later supports at-rest encryption, having namespaces at the log level will allow encrypting different namespaces with different keys.
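
The config inheritance described above can be sketched as a simple merge. This is only an illustration, not the proposed implementation: the function `effective_config` and the sample values are hypothetical, and the sketch assumes a single level of inheritance (namespace defaults, then per-topic overrides).

```python
from typing import Dict, Mapping


def effective_config(namespace_defaults: Mapping[str, str],
                     topic_overrides: Mapping[str, str]) -> Dict[str, str]:
    """Return the config a topic would see.

    Namespace defaults are applied first; any key the topic sets
    itself replaces the inherited value.
    """
    merged = dict(namespace_defaults)  # copy so the defaults stay untouched
    merged.update(topic_overrides)     # per-topic values win
    return merged


# Hypothetical namespace-level defaults and one topic's overrides.
ns_defaults = {"retention.ms": "604800000", "cleanup.policy": "delete"}
topic_overrides = {"cleanup.policy": "compact"}

config = effective_config(ns_defaults, topic_overrides)
```

Here the topic inherits `retention.ms` from its namespace and overrides only `cleanup.policy`.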

...

  • Binary log format

    • No changes expected

  • The network protocol and api behavior

    • No changes expected

  • Any class in the public packages under clients

    • org/apache/kafka/common/serialization

      • No changes expected

    • org/apache/kafka/common

      • No changes expected

    • org/apache/kafka/common/errors

      • InvalidNamespaceException will be added to indicate that the namespace the user is trying to use does not exist.

    • org/apache/kafka/clients/producer

      • No changes expected

    • org/apache/kafka/clients/consumer (eventually, once stable)

      • No changes expected

  • Monitoring

  • Command line tools and arguments

    • Add create, list, move and delete for namespaces to kafka-topics and AdminUtils.

    • An optional "namespace" argument will be added to kafka-topics, kafka-configs and kafka-acls. If the namespace argument is not provided, a default namespace of "" will be used. This keeps the current behavior of Kafka and the CLI tools intact.

  • Anything else that will likely break existing users in some way when they upgrade
    • None

...

After considering a few approaches, listed in the Rejected Alternatives section, below is what we think is the least obtrusive approach to supporting namespaces in Kafka. We suggest representing namespaces at the storage layer, i.e., in the storage layout of Zookeeper entities and of logs on disk. Internal and public APIs can pass namespaces around as part of topic names, prepended to them. However, namespace and topic need to be separated when interacting with the storage layers. This can be done with a delimiter character that is not allowed in Kafka topic names. Currently, Kafka allows a topic name to contain only characters in [a-zA-Z0-9\\._\\-]. That gives us a few options for the delimiting character. We suggest ":" as the delimiter, but it can be any of the following.

...

A namespace can contain any character in [a-zA-Z0-9\\.\\-]. The "." in a namespace separates tiers. For instance, the namespace "org.apache.kafka" will translate to "org/apache/kafka" in storage layouts.
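
As a rough sketch of how a namespace-qualified name could be taken apart before it reaches the storage layer, the snippet below splits on the suggested ":" delimiter and turns the namespace tiers into a relative path. The helper names (`split_qualified_name`, `namespace_to_path`) are hypothetical, and rejecting empty tiers such as "a..b" is an assumption that the text above does not state.

```python
import re

DELIMITER = ":"          # suggested delimiter; not a legal topic-name character
DEFAULT_NAMESPACE = ""   # used when no namespace is given

NAMESPACE_RE = re.compile(r"[a-zA-Z0-9.\-]+")
TOPIC_RE = re.compile(r"[a-zA-Z0-9._\-]+")


def split_qualified_name(name):
    """Split 'namespace:topic' into (namespace, topic).

    A name without the delimiter belongs to the default namespace.
    """
    namespace, _, topic = name.rpartition(DELIMITER)
    if namespace != DEFAULT_NAMESPACE:
        if not NAMESPACE_RE.fullmatch(namespace):
            raise ValueError("illegal namespace: %r" % namespace)
        # Assumption: empty tiers (e.g. "a..b") are rejected.
        if "" in namespace.split("."):
            raise ValueError("empty tier in namespace: %r" % namespace)
    if not TOPIC_RE.fullmatch(topic):
        raise ValueError("illegal topic name: %r" % topic)
    return namespace, topic


def namespace_to_path(namespace):
    """Translate dotted tiers into a relative storage path."""
    return namespace.replace(".", "/")


ns, topic = split_qualified_name("org.apache.kafka:events")
path = namespace_to_path(ns)
```

With these definitions, a plain name such as "events" falls into the default namespace "", which matches the behavior the CLI tools are meant to preserve.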

...

The goal here is to make sure that all existing topics, producers and consumers that do not use a namespace continue to work as expected. All topics under /brokers/topics will be part of the default namespace, i.e., "". Any topic created without specifying a namespace will be part of the default namespace. As long as users do not specify namespaces in their requests or CLI commands, things should work just as before.

...

Regexes should work fine; however, users might have to modify their existing regexes based on the topics and namespaces that exist on the cluster. If someone, like mirror-maker, is subscribing to ".*", then this will just work and nothing has to be changed. However, if a user has a regex like "bla*" and later creates a namespace "bla", then they will get subscribed to topics in namespace "bla" that they were probably not expecting.
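
The regex concern can be illustrated with a small sketch. It assumes that qualified names take the form "namespace:topic", that subscription requires the whole name to match the pattern, and that the prefix pattern is written as "bla.*" (under whole-name matching, a literal "bla*" would only match "bl" followed by zero or more "a" characters). The helper `subscribed` and the topic names are hypothetical.

```python
import re


def subscribed(pattern, topic_names):
    """Return the names that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [t for t in topic_names if compiled.fullmatch(t)]


# Topics in the default namespace before any namespace exists.
before = ["blaze", "blame-events", "orders"]
# The same cluster after a namespace "bla" gains two topics.
after = before + ["bla:payments", "bla:audit"]

prefix_before = subscribed("bla.*", before)
prefix_after = subscribed("bla.*", after)
everything_after = subscribed(".*", after)
```

The ".*" subscription keeps covering every topic, while the prefix pattern silently starts picking up the topics in namespace "bla".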

...

  1. Just prepend namespace to topic names, even at storage layer, instead of creating a hierarchy of directories to represent namespaces. 
    1. Will not enable encrypting namespaces with different keys.
    2. Namespace-level configs or ACLs won't be possible without creating yet another znode in Kafka's chroot in ZK. I do not think having /kafka-namespaces/<namespace> will help alleviate the confusion caused by passing namespace and topic name together in APIs, as we will still have to represent multi-tiered namespaces under /kafka-namespaces as dirs.
  2. Modify request/response formats to take the namespace explicitly.
    1. Removes the need for the delimiter required by the proposed approach and solves the issue with existing regex consumers.
    2. This is definitely the cleanest approach. However, it will require many API and protocol changes.
  3. Manage namespaces separately. 
    1. This will still have the issue of topic name collisions even if they belong to separate namespaces.
  4. Add namespace to session object.
    1. Will avoid the need for each request and response to carry the namespace with the topic name; however, this probably violates separation of concerns.
  5. Make the delimiter char configurable.
    1. Will add yet another config, without any clear gain.