...

The ConfigResource class is made up of two fields: the name of the entity (“topic-1”, “broker-1”, etc.) and the type of the entity (BROKER, TOPIC, etc.). DescribeConfigsOptions allows users to specify whether to fetch config synonyms and their documentation. The API response contains all configurations for the specified entities.

...

This results in boilerplate code for all users of the AdminClient::describeConfigs API, in addition to being a wasteful use of resources. It becomes painful for large clusters: to fetch one configuration of all topics, we need to fetch all configurations of all topics, which can result in a huge response. Alternatively, the request can be batched, but then one API request gets broken down into tens if not hundreds of API requests depending on the batch size, which complicates error handling and retries.
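To make the batching cost concrete, here is a minimal sketch of the client-side splitting described above. It is plain Java with no Kafka dependency; the class name BatchingSketch, the helper partition, and the topic count and batch size are all hypothetical values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kafka: splits a list of topic names into
// fixed-size batches, one batch per describeConfigs request.
final class BatchingSketch {

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sub-list so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> topics = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            topics.add("topic-" + i); // illustrative topic names
        }
        // Each batch would become its own request, with its own error handling and retries.
        System.out.println("requests needed: " + partition(topics, 100).size());
    }
}
```

Every batch in this sketch still returns the full configuration set for its topics, so batching limits the size of a single response but does not reduce the total amount of data fetched.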

This is also a usability issue for the command line tools. The kafka-configs command returns all configurations, and the user then needs to either scan or filter for the property they are interested in. Similarly, when kafka-topics is used with the --describe option and no topic name is provided, we get all configurations of all topics, even though a user may be interested in only one or a few of them. For the kafka-topics command we also need a way to skip fetching configurations, so that partition information for all topics can be fetched without fetching their configurations.

We need a way to specify the ConfigurationKeys parameter that DescribeConfigsResource takes, to bring the AdminClient::describeConfigs API to parity with DescribeConfigsResource and allow AdminClient’s users to specify the configuration keys they are interested in.

In addition, we need to add the same option to the kafka-configs and kafka-topics command line utilities so that users of these tools don’t need to fetch all configurations when they are interested in only a few or none of them.

Public Interfaces

...

We will modify the DescribeConfigsOptions argument of the AdminClient::describeConfigs API to take an additional parameter, List<String> configurationKeys, which is the same as the one DescribeConfigsResource takes, so that AdminClient::describeConfigs users can specify which configurations to fetch. An empty or null value keeps today's behavior of fetching all configuration values. Here is what the new DescribeConfigsOptions structure will look like after this change:


Code Block
@InterfaceStability.Evolving
public class DescribeConfigsOptions extends AbstractOptions<DescribeConfigsOptions> {

    private boolean includeSynonyms = false;
    private boolean includeDocumentation = false;
    private List<String> configurationKeys = null;   // This is the newly introduced field
...
...
    public List<String> configurationKeys() {
      return configurationKeys;
    }
...
...
    public void configurationKeys(List<String> configurationKeys) {
      this.configurationKeys = configurationKeys;
    }
...
}
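The intended selection semantics can be sketched with a small in-memory model. This is plain Java with no Kafka dependency and is not the proposed implementation: the class ConfigKeyFilterSketch and the method select are hypothetical, and the map stands in for the configurations of a single resource. It assumes, as stated above and in the notes further down, that null or empty keys return everything and that keys unknown to the resource are simply absent from the result.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the proposed behavior; not Kafka code.
final class ConfigKeyFilterSketch {

    // Returns the entries that would come back for one resource.
    static Map<String, String> select(Map<String, String> allConfigs,
                                      List<String> configurationKeys) {
        // null or empty keys: keep today's behavior and return every configuration.
        if (configurationKeys == null || configurationKeys.isEmpty()) {
            return new LinkedHashMap<>(allConfigs);
        }
        Map<String, String> selected = new LinkedHashMap<>();
        for (String key : configurationKeys) {
            // Keys that do not exist for this resource are skipped, not reported as errors.
            if (allConfigs.containsKey(key)) {
                selected.put(key, allConfigs.get(key));
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, String> topicConfigs = new LinkedHashMap<>();
        topicConfigs.put("retention.ms", "604800000"); // illustrative values
        topicConfigs.put("cleanup.policy", "delete");

        System.out.println(select(topicConfigs, null));                      // all entries
        System.out.println(select(topicConfigs, List.of("retention.ms")));   // one entry
        System.out.println(select(topicConfigs, List.of("log.dirs")));       // no entries
    }
}
```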


In addition to this, we propose two changes to the kafka-topics command line tool:

  1. Allow the --config option to be specified together with the --describe option. When used this way, only the specified configuration(s) will be fetched for the topic.
  2. Add a --partition-only option that, when used with --describe, skips fetching the topics' configurations and returns only partition information.

Proposed Changes

The previous section lists most of the changes that are needed. On the implementation side, when we convert the DescribeConfigsOptions argument to the Kafka API's DescribeConfigsResource object, we will take the list of configuration keys from DescribeConfigsOptions and put it in the DescribeConfigsResource, so the new code will look like this:

Code Block
.map(config ->
    new DescribeConfigsRequestData.DescribeConfigsResource()
        .setResourceName(config.name())
        .setResourceType(config.type().id())
        .setConfigurationKeys(options.configurationKeys()))  // This is the change, instead of passing `null` we will pass the keys specified by user


A couple of things to note here:

  1. A user can specify a configuration key that isn't valid for the specified resource type. In this case the response will be empty and no configuration will be returned. This matches the current behavior of the DescribeConfigsRequest Kafka API. In that sense AdminClient just reflects the existing behavior of the underlying Kafka API and does not modify it.
  2. A user can specify a mix of resource types in the first argument to the AdminClient::describeConfigs API, i.e. the resources argument can contain both a resource of BROKER type and a resource of TOPIC type. When fetching all configurations this has no significance. However, when the user specifies configurations to fetch, then unless the configuration keys happen to exist for all specified resources, the resources of the other types will have empty configurations returned. In that sense this new feature is mostly useful when clients of AdminClient::describeConfigs fetch configurations for resources of only one type.
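The second note can be illustrated with a small in-memory model. This is a sketch in plain Java, not Kafka code: the class MixedResourceSketch, the method describe, and the sample values are hypothetical. It assumes that the same key list is applied to every requested resource, which is what a single configurationKeys field on DescribeConfigsOptions implies.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: one shared key list is applied to every requested resource.
final class MixedResourceSketch {

    static Map<String, Map<String, String>> describe(
            Map<String, Map<String, String>> resources, List<String> configurationKeys) {
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> resource : resources.entrySet()) {
            Map<String, String> selected = new LinkedHashMap<>(resource.getValue());
            if (configurationKeys != null && !configurationKeys.isEmpty()) {
                // Keep only the requested keys; a resource with none of them ends up empty.
                selected.keySet().retainAll(configurationKeys);
            }
            result.put(resource.getKey(), selected);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> resources = new LinkedHashMap<>();
        resources.put("topic-1", Map.of("retention.ms", "604800000"));  // TOPIC resource
        resources.put("broker-1", Map.of("log.dirs", "/var/lib/kafka")); // BROKER resource

        // Asking for a topic-level key leaves the broker entry with no configurations.
        System.out.println(describe(resources, List.of("retention.ms")));
    }
}
```

In this model a caller who wants different keys per resource type has to issue one call per type, which is the trade-off the note describes.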

Compatibility, Deprecation, and Migration Plan

There is no compatibility issue, as we are adding a new field to an existing data structure (DescribeConfigsOptions). If the field is not specified, the code will behave in the same manner as the existing code. The same applies to the changes to the kafka-topics tool: we are adding new options, and existing options will continue to behave as they do today.

Rejected Alternatives

We considered adding a configurationKeys collection directly to the ConfigResource object so that users can specify different types of resources and, for each of them, the configurations to fetch. This avoids issue #2 mentioned above in the Proposed Changes section. However, the ConfigResource class is widely used outside of this API (over 300 usages) and is meant to represent a resource that has configurations. The new configuration key field would be of no use in all these other cases, so we decided not to pursue this approach.