Comment: Clarify and add example of meta.properties changes

...

Storage will have to be formatted — using the storage format tool — not only when a log directory is added, but also when one is removed from configuration.

meta.properties

The meta.properties version field will be bumped from 1 to 2. Two new properties, directory.id and directory.ids, will be added to the meta.properties file in each log directory, including the metadata.log.dir. The first property, directory.id, indicates the UUID for the log directory where the file is located; the second property, directory.ids, lists the UUIDs of all the configured log directories. If the meta.properties file doesn't exist in the directory that holds the cluster metadata partition, metadata.log.dir, the Kafka node will fail to start. If the meta.properties file exists but doesn't contain these two properties, new values will be generated and the meta.properties files will be updated. The kafka-storage CLI tool will be extended to generate and write the two properties when the format command is used.
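The startup rules above could be sketched roughly as follows. This is an illustration in Python, not the broker's implementation: the function names, the dict representation of each meta.properties file, and the choice to set version=2 on update are assumptions made for the example. It also assumes every configured log directory is online, and that IDs are random UUIDs rendered as unpadded URL-safe base64.

```python
import base64
import uuid


def new_directory_id():
    # Assumption: a random UUID rendered as unpadded URL-safe base64 (22 chars).
    raw = uuid.uuid4().bytes
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def ensure_directory_ids(props_by_dir, metadata_log_dir):
    """Hypothetical helper applying the startup rules.

    props_by_dir maps each directory path whose meta.properties file exists
    to its parsed properties. Returns (properties, changed).
    """
    # A missing meta.properties in metadata.log.dir is fatal.
    if metadata_log_dir not in props_by_dir:
        raise RuntimeError(f"no meta.properties in {metadata_log_dir}")

    complete = all(
        "directory.id" in p and "directory.ids" in p
        for p in props_by_dir.values()
    )
    if complete:
        return props_by_dir, False

    # Generate missing IDs, then rewrite the full list in every file.
    updated = {path: dict(props) for path, props in props_by_dir.items()}
    for props in updated.values():
        if "directory.id" not in props:
            props["directory.id"] = new_directory_id()
    all_ids = ",".join(props["directory.id"] for props in updated.values())
    for props in updated.values():
        props["directory.ids"] = all_ids
        props["version"] = "2"  # assumption: updated files use the new version
    return updated, True
```

Because a newly generated ID changes the complete list, the sketch rewrites every file rather than only the one that was missing the properties.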

...

  • A property named directory.id indicating the UUID for the log directory where the meta.properties file is located. The value is base64 encoded, like the cluster UUID.
  • A property named directory.ids indicating the complete list of all UUIDs for each configured log directory. Values are base64 encoded and comma-separated. The order does not matter.

The meta.properties version field will be bumped from 1 to 2.

Having a persisted UUID at the root of each log directory allows the broker to identify the log directory regardless of the mount path.
Having a persisted list of all UUIDs for all configured log directories allows the broker to determine the UUIDs of unavailable (offline) log directories, as the meta.properties files for the offline log directories are likely to be unavailable.
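To illustrate how the persisted list helps, the UUIDs of offline directories can be derived as the set difference between the IDs listed in directory.ids and the directory.id values of the directories that could be read. The sketch below is a minimal Python illustration; the function name and the dict representation of each file are hypothetical.

```python
def offline_directory_ids(readable_props):
    """Return the UUIDs of configured log directories that could not be read.

    readable_props is a list of parsed meta.properties files (dicts), one per
    log directory that is currently available.
    """
    known = set()   # every ID listed in any directory.ids
    online = set()  # IDs of the directories we were able to read
    for props in readable_props:
        online.add(props["directory.id"])
        known.update(props["directory.ids"].split(","))
    return known - online


# Three configured directories, one of which is unavailable. The assignment
# of IDs to particular directories here is made up for the example.
all_ids = "e6umYSUsQyq7jUUzL9iXMQ,b4d9ExdORgaQq38CyHwWTA,P2aL9r4sSqqyt7bC0uierg"
readable = [
    {"directory.id": "e6umYSUsQyq7jUUzL9iXMQ", "directory.ids": all_ids},
    {"directory.id": "P2aL9r4sSqqyt7bC0uierg", "directory.ids": all_ids},
]
offline = offline_directory_ids(readable)
```

In this example, offline contains only b4d9ExdORgaQq38CyHwWTA, the one ID that is listed but was not found in any readable directory.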

Example

Given the following server.properties:

(... other non interesting properties omitted ...)
process.roles=broker
node.id=8
metadata.log.dir=/var/lib/kafka/metadata
log.dirs=/mnt/d1,/mnt/d2

The command ./bin/kafka-storage.sh format -c /tmp/server.properties --cluster-id 41QSStLtR3qOekbX4ZlbHA would generate a meta.properties file that could look like this:

#
#Thu Aug 18 15:23:07 BST 2022
node.id=8
version=2
cluster.id=41QSStLtR3qOekbX4ZlbHA
directory.id=e6umYSUsQyq7jUUzL9iXMQ
directory.ids=e6umYSUsQyq7jUUzL9iXMQ,b4d9ExdORgaQq38CyHwWTA,P2aL9r4sSqqyt7bC0uierg

Each directory, including the directory that holds the cluster metadata topic (metadata.log.dir), has its own distinct directory ID.
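As a sanity check on the example file, the following Python sketch parses it and verifies the relationships described in this section: version 2, a directory.id that appears in directory.ids, and no duplicate IDs. The helper names are hypothetical, and the decoding step assumes the IDs are 16-byte UUIDs in unpadded URL-safe base64, which the text above only describes as "base64 encoded".

```python
import base64

EXAMPLE = """\
#
#Thu Aug 18 15:23:07 BST 2022
node.id=8
version=2
cluster.id=41QSStLtR3qOekbX4ZlbHA
directory.id=e6umYSUsQyq7jUUzL9iXMQ
directory.ids=e6umYSUsQyq7jUUzL9iXMQ,b4d9ExdORgaQq38CyHwWTA,P2aL9r4sSqqyt7bC0uierg
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def decode_id(encoded):
    """Decode an ID, assuming unpadded URL-safe base64 of a 16-byte UUID."""
    padded = encoded + "=" * (-len(encoded) % 4)
    raw = base64.urlsafe_b64decode(padded)
    if len(raw) != 16:
        raise ValueError(f"not a 16-byte UUID: {encoded}")
    return raw


def validate(props):
    """Check the version 2 directory properties and return the list of IDs."""
    if props.get("version") != "2":
        raise ValueError("expected version=2")
    ids = props["directory.ids"].split(",")
    for directory_id in ids:
        decode_id(directory_id)
    if len(set(ids)) != len(ids):
        raise ValueError("duplicate directory IDs")
    if props["directory.id"] not in ids:
        raise ValueError("directory.id is not listed in directory.ids")
    return ids


ids = validate(parse_properties(EXAMPLE))
```

The example lists three IDs because metadata.log.dir and the two entries in log.dirs each get their own.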

Brokers

Broker lifecycle management

...

  • If the partition is associated with a log directory UUID of Uuid.ZERO
    • If the broker only has one log directory configured, it will place the replica there
    • If the broker has more than one log directory configured, it reports an error and fails to place the replica
  • If the partition doesn’t yet exist, it is created in the designated log directory.
  • If any partitions already exist, but the hosting log directories do not match the cluster metadata
    • If there is a future replica in the log directory indicated by the metadata, the broker will replace the current replica with the future replica
    • Otherwise, the broker sends a new RPC, ASSIGN_REPLICAS_TO_DIRECTORIES, to the controller to change the metadata association to the actual log directory. The broker will not create the log for the partition until the log directory indicated in the cluster metadata matches the actual log directory.
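The placement rules listed above can be read as a small decision function. The Python sketch below covers only the cases shown in this excerpt and is not the broker's code: the function name, the action labels, and the string form of Uuid.ZERO are assumptions for illustration, as is the final "keep" case for a replica that is already where the metadata says it should be.

```python
ZERO_UUID = "A" * 22  # assumption: Uuid.ZERO as unpadded base64 of 16 zero bytes


def place_replica(assigned_dir, configured_dirs, current_dir=None, future_dir=None):
    """Decide what to do with one replica.

    assigned_dir    -- directory UUID recorded in the cluster metadata
    configured_dirs -- UUIDs of the broker's configured log directories
    current_dir     -- UUID of the directory currently hosting it, or None
    future_dir      -- UUID of the directory holding a future replica, or None
    Returns an (action, directory UUID) pair.
    """
    if assigned_dir == ZERO_UUID:
        if len(configured_dirs) == 1:
            return ("place", configured_dirs[0])
        raise RuntimeError("cannot place replica: several log directories")

    if current_dir is None:
        # The partition does not exist yet: create it where designated.
        return ("create", assigned_dir)

    if current_dir != assigned_dir:
        if future_dir == assigned_dir:
            # Replace the current replica with the future replica.
            return ("promote_future", assigned_dir)
        # Ask the controller to record the actual directory instead.
        return ("assign_replicas_to_directories", current_dir)

    # Assumption: nothing to do when the metadata and the disk agree.
    return ("keep", current_dir)
```

For instance, place_replica("d2", ["d1", "d2"], current_dir="d1") selects the ASSIGN_REPLICAS_TO_DIRECTORIES path and reports "d1" as the actual directory, using short placeholder strings in place of real UUIDs.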

...