Status
Current state: Under Discussion
Discussion thread: here
JIRA: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
The AdminClient::describeConfigs API takes a Collection of ConfigResource objects as an argument and returns all configurations of the specified entities. Here is the API signature:
DescribeConfigsResult describeConfigs(Collection<ConfigResource> resources, DescribeConfigsOptions options);
The ConfigResource class is made up of two fields: the name of the entity (“topic-1”, “broker-1”, etc.) and the type of the entity (BROKER, TOPIC, etc.). DescribeConfigsOptions allows users to specify whether to fetch config synonyms and their documentation. The API response contains all configurations for the specified entities.
This admin API in turn calls the DescribeConfigsRequest Kafka API to get the configuration of the specified entities. The DescribeConfigsRequest API takes a collection of DescribeConfigsResource objects that specify the entities whose configuration needs to be fetched. To make this Kafka API call, AdminClient::describeConfigs converts the ConfigResource collection passed to it into a DescribeConfigsResource collection. In addition to the name and type of the entity whose configuration to get, the Kafka DescribeConfigsResource structure also accepts ConfigurationKeys, a list of String, which lets callers request only the configurations that they are interested in. As this information isn’t present in the ConfigResource class, it is set to null when the DescribeConfigsResource object is created from it. Here is the code doing this (KafkaAdminClient.java):
.map(config -> new DescribeConfigsRequestData.DescribeConfigsResource()
    .setResourceName(config.name())
    .setResourceType(config.type().id())
    .setConfigurationKeys(null))
This means that Kafka returns all configurations of all the specified entities. The caller of AdminClient::describeConfigs then iterates over the returned list and picks out the configuration keys that they are interested in.
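The client-side filtering described above can be sketched as follows. This is a minimal illustration, not code from Kafka: the response is modeled as plain maps (entity name to configuration key/value pairs) instead of DescribeConfigsResult so that it runs without a broker, and the class name ClientSideConfigFilter and the method filter are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the filtering every caller has to write today.
// Plain maps replace Kafka's DescribeConfigsResult so the sketch needs no broker.
class ClientSideConfigFilter {

    // Keeps only the wanted keys from a full per-entity configuration dump.
    static Map<String, Map<String, String>> filter(
            Map<String, Map<String, String>> allConfigs, Set<String> wantedKeys) {
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, String>> entity : allConfigs.entrySet()) {
            Map<String, String> kept = new LinkedHashMap<>();
            for (Map.Entry<String, String> config : entity.getValue().entrySet()) {
                if (wantedKeys.contains(config.getKey())) {
                    kept.put(config.getKey(), config.getValue());
                }
            }
            result.put(entity.getKey(), kept);
        }
        return result;
    }

    public static void main(String[] args) {
        // Every configuration of the topic is transferred ...
        Map<String, String> topic1 = new LinkedHashMap<>();
        topic1.put("retention.ms", "604800000");
        topic1.put("cleanup.policy", "delete");
        topic1.put("segment.bytes", "1073741824");
        Map<String, Map<String, String>> all = new LinkedHashMap<>();
        all.put("topic-1", topic1);

        // ... even though the caller keeps a single key.
        System.out.println(filter(all, Set.of("retention.ms")));
    }
}
```

Everything except the one retained key is fetched and then discarded, which is the waste this proposal aims to remove.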
This results in boilerplate code for all users of the AdminClient::describeConfigs API, in addition to being a wasteful use of resources. It becomes painful for large clusters: to fetch one configuration of all topics, we need to fetch all configurations of all topics, which can result in a huge response. Alternatively, the request can be batched, but then one API request gets broken down into tens if not hundreds of API requests depending on batch size, complicating error handling and retries.
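To make the cost of the batching workaround concrete, here is a rough sketch of how a caller might split the resources into batches. The class name DescribeConfigsBatcher, the method batches, and the batch size are all illustrative assumptions; each returned batch would correspond to one separate describeConfigs call that needs its own error handling and retry logic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits the resources to describe into fixed-size batches.
class DescribeConfigsBatcher {

    // Returns consecutive sublists of at most batchSize elements each.
    static <T> List<List<T>> batches(List<T> resources, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < resources.size(); start += batchSize) {
            int end = Math.min(start + batchSize, resources.size());
            result.add(new ArrayList<>(resources.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> topics = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            topics.add("topic-" + i);
        }
        // One logical request becomes requests.size() separate API calls.
        List<List<String>> requests = batches(topics, 50);
        System.out.println(requests.size() + " requests for " + topics.size() + " topics");
    }
}
```

With 1,000 topics and an assumed batch size of 50, a single logical lookup turns into 20 requests, and each of them still returns every configuration of its topics.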
This is also a usability issue when kafka-topics is used with the --describe option and no topic name is provided. We get all configurations of all the topics, even when a user is interested in only one or a few of them. For the kafka-topics command we also need a way to skip fetching configurations entirely, so that partition information of all topics can be fetched without fetching their configurations.
We need a way to specify the ConfigurationKeys parameter that DescribeConfigsResource takes, to bring the AdminClient::describeConfigs API to parity with DescribeConfigsResource and allow AdminClient’s users to specify the configuration keys that they are interested in.
In addition, we need to add the same option to the kafka-topics command line utility so that users of the tool don’t need to fetch all configurations when they are interested in only a few or none of them.
Public Interfaces
We will modify the DescribeConfigsOptions argument of the AdminClient::describeConfigs API to take an additional parameter, List<String> configurationKeys, which will be the same as the one that DescribeConfigsResource takes, so that AdminClient::describeConfigs users can specify which configurations to fetch. An empty or null value will behave as it does today and fetch all configuration values. Here is what the new DescribeConfigsOptions structure will look like after this change:
@InterfaceStability.Evolving
public class DescribeConfigsOptions extends AbstractOptions<DescribeConfigsOptions> {
    private boolean includeSynonyms = false;
    private boolean includeDocumentation = false;
    private List<String> configurationKeys; // This is the newly introduced field
    ... ...
    public List<String> configurationKeys() {
        return configurationKeys;
    }
    ... ...
    public void configurationKeys(List<String> configurationKeys) {
        this.configurationKeys = configurationKeys;
    }
    ...
}
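One way the new field could flow from the options into the per-resource request is sketched below. This is an illustration under stated assumptions, not the actual KafkaAdminClient change: Options and ResourceRequest are hypothetical stand-ins for DescribeConfigsOptions and DescribeConfigsResource, and toRequests is an invented helper. The only behavior taken from this proposal is that a null or empty key list means "fetch everything", which the sketch maps to the null that is sent today.

```java
import java.util.ArrayList;
import java.util.List;

class ConfigurationKeysSketch {

    // Hypothetical stand-in for DescribeConfigsOptions with the proposed field.
    static class Options {
        private List<String> configurationKeys;

        List<String> configurationKeys() {
            return configurationKeys;
        }

        void configurationKeys(List<String> configurationKeys) {
            this.configurationKeys = configurationKeys;
        }
    }

    // Hypothetical stand-in for one DescribeConfigsResource entry in the request.
    static class ResourceRequest {
        final String resourceName;
        final List<String> configurationKeys; // null means "return every configuration"

        ResourceRequest(String resourceName, List<String> configurationKeys) {
            this.resourceName = resourceName;
            this.configurationKeys = configurationKeys;
        }
    }

    // Null or empty keys keep today's behavior (null on the wire);
    // otherwise the requested keys are attached to every resource.
    static List<ResourceRequest> toRequests(List<String> resourceNames, Options options) {
        List<String> keys = options.configurationKeys();
        List<String> effectiveKeys =
                (keys == null || keys.isEmpty()) ? null : new ArrayList<>(keys);
        List<ResourceRequest> requests = new ArrayList<>();
        for (String name : resourceNames) {
            requests.add(new ResourceRequest(name, effectiveKeys));
        }
        return requests;
    }

    public static void main(String[] args) {
        Options options = new Options();
        options.configurationKeys(List.of("retention.ms"));
        for (ResourceRequest r : toRequests(List.of("topic-1", "topic-2"), options)) {
            System.out.println(r.resourceName + " -> " + r.configurationKeys);
        }
    }
}
```

Treating an empty list the same as null keeps existing callers, who never set the field, on the current fetch-all path.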
In addition to this, we propose two changes to the kafka-topics command line tool:
- Allow the --config option to be specified together with the --describe option. When used this way, only the specified configuration(s) will be fetched for the topic.
- Add a --partition-only option that, when used with --describe, will skip fetching the configuration of the topics and return only partition information.
Proposed Changes
Compatibility, Deprecation, and Migration Plan
There is no compatibility issue, as we are adding a new field to an existing data structure (DescribeConfigsOptions). If the field is not specified, the code will behave in the same manner as the existing code. The same applies to the changes to the kafka-topics tool: we are adding new options, and existing options will continue to behave as they do today.
Rejected Alternatives