...
Vote will be changed to replace the topic name with the topic ID, and will use a sentinel topic ID if no topic ID has been assigned yet. See Compatibility with KIP-500 for more information on sentinel topic IDs.
VoteRequest v0
VoteResponse v0
BeginQuorumEpoch
BeginQuorumEpoch will replace the topic name field with the topic ID field.
BeginQuorumEpochRequest v0
BeginQuorumEpochResponse v0
EndQuorumEpoch
EndQuorumEpoch will replace the topic name field with the topic ID field.
EndQuorumEpochRequest v0
EndQuorumEpochResponse v0
DeleteTopics
With the addition of topic IDs and the changes to LeaderAndIsrRequest described above, we can now make changes to topic deletion logic that will allow topics to be immediately considered deleted, regardless of whether all replicas have responded to a DeleteTopicsRequest.
...
Using a topic ID will result in a slightly smaller fetch request and likely prevent further changes. Assigning a unique ID for the metadata topic leaves the possibility for the topic to be placed in tiered storage, or used in other scenarios where topics from multiple clusters may be in one place without appending the cluster ID.
Sentinel ID
The idea is that the sentinel will be a hard-coded UUID that no other topic can be assigned. Initially the all-zero UUID was considered, but it was ultimately rejected because it is already used as a null ID in some places, and it is better to keep these usages separate. An example of a hard-coded UUID is 00000000-0000-0000-0000-000000000001.
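The distinction between the null ID and the sentinel ID can be sketched as follows. This is a minimal illustration, not the KIP's implementation: the class name TopicIds and the methods idOrSentinel and isAssignable are hypothetical, java.util.UUID is used only to keep the example self-contained, and the sentinel value is the example UUID given above rather than a confirmed constant.

```java
import java.util.UUID;

// Hypothetical helper; none of these names come from the KIP.
final class TopicIds {
    // The all-zero UUID, which the text says is used as a null ID in some places.
    static final UUID NULL_ID = new UUID(0L, 0L);

    // The example sentinel from the text: 00000000-0000-0000-0000-000000000001.
    static final UUID SENTINEL_ID = new UUID(0L, 1L);

    private TopicIds() {}

    // Fall back to the sentinel when no topic ID has been assigned yet.
    static UUID idOrSentinel(UUID assigned) {
        return assigned == null ? SENTINEL_ID : assigned;
    }

    // A regular topic may receive neither the null ID nor the sentinel ID.
    static boolean isAssignable(UUID candidate) {
        return candidate != null
                && !NULL_ID.equals(candidate)
                && !SENTINEL_ID.equals(candidate);
    }

    public static void main(String[] args) {
        System.out.println(SENTINEL_ID); // prints 00000000-0000-0000-0000-000000000001
        System.out.println(isAssignable(SENTINEL_ID)); // prints false
    }
}
```

Keeping the two constants separate means a reader of a request can tell "no ID present" apart from "the metadata topic before an ID was assigned", which is the reason given above for rejecting the all-zero UUID as the sentinel.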
Tooling
kafka-topics.sh --describe will be updated to include the topic ID in the output. A user can specify a topic name to describe with the --topic parameter, or alternatively supply a topic ID with the --topic_id parameter.
...
Ideally, consumer offsets stored in the __consumer_offsets topic would be associated with the topic ID for which they were read. However, given the way the __consumer_offsets topic is compacted, this may be difficult to achieve in a forward-compatible way. This change will be left until topic IDs are implemented in the clients. Another future improvement opportunity is to use topicId in GroupMetadataManager.offsetCommitKey in the offset_commit topic. This may save some space.
Persisting Topic IDs
A few other alternatives to the partition metadata file were considered. One topic of discussion was whether it was necessary to include the file at all. With the current decision to keep the topic name in the directory, the only way to persist the topic ID to disk is through a file. The decision against changing the directory is discussed below.
Another alternative is to have a single file mapping all topic names to IDs. Although this could be useful for tooling, the file would be harder to maintain, since it would need to be updated each time a topic is added.
log.dir layout
It would be ideal if the log.dir layout could be restructured from the {topic}_{partition} format to {topicIdPrefix}/{topicId}_{partition}, e.g. "mytopic_1" → "24/24cc4332-f7de-45a3-b24e-33d61aa0d16c_1". Note the hierarchical directory structure, which uses the first two characters of the topic ID to avoid having too many directories at the top level of the log directory. This change is not required for the topic deletion improvements above, and will be left for a future KIP where it may be required, e.g. for topic renames.
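As a rough illustration of the proposed layout, the sketch below derives the partition directory from a topic ID. It is only a sketch under stated assumptions: the class LogDirLayout and the method partitionDir are hypothetical names, java.util.UUID stands in for the topic ID type, and the two-character prefix and underscore separator are taken from the example above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Hypothetical helper illustrating the proposed (not adopted) directory layout.
final class LogDirLayout {
    private LogDirLayout() {}

    // Builds <logDir>/<first two chars of topicId>/<topicId>_<partition>.
    static Path partitionDir(Path logDir, UUID topicId, int partition) {
        String id = topicId.toString();
        String prefix = id.substring(0, 2); // spreads topics across subdirectories
        return logDir.resolve(prefix).resolve(id + "_" + partition);
    }

    public static void main(String[] args) {
        UUID topicId = UUID.fromString("24cc4332-f7de-45a3-b24e-33d61aa0d16c");
        Path dir = partitionDir(Paths.get("kafka-logs"), topicId, 1);
        // The last two path elements are "24" and
        // "24cc4332-f7de-45a3-b24e-33d61aa0d16c_1".
        System.out.println(dir);
    }
}
```

Because the first two characters of a UUID string are hexadecimal digits, this scheme would create at most 256 prefix directories at the top level of the log directory.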
Changing the directory structure in this way would also require more changes to tooling. Finding the correct log directory for a given topic will require more work for the user with the current changes in the KIP. There are other considerations when it comes to changing the directory structure, so it is probably best to spend more time before we commit to a decision.
Security/Authorization
One idea was to support authorizing a principal for a topic ID rather than a topic name. For now, this would be a breaking change, and it would be hard to support prefixed ACLs with topic IDs.