...
Storage will have to be formatted — using the storage format tool — not only when a log directory is added, but also when one is removed from configuration.
meta.properties
The meta.properties version field will be bumped from 1 to 2. Two new properties, directory.id and directory.ids, will be added to the meta.properties file in each log directory, including the metadata.log.dir. The first property, directory.id, indicates the UUID of the log directory where the file is located; the second property, directory.ids, lists the UUIDs of all the configured log directories. If the meta.properties file doesn't exist in metadata.log.dir, the directory that holds the cluster metadata partition, the Kafka node will fail to start. If the meta.properties file exists but doesn't contain these two properties, they will be generated and the meta.properties files will be updated. The kafka-storage CLI tool will be extended to generate and write the two properties when the format command is used.
...
- A property named directory.id indicating the UUID for the log directory where the meta.properties file is located. The value is base64 encoded, like the cluster UUID.
- A property named directory.ids indicating the complete list of UUIDs for all configured log directories. Values are base64 encoded and comma-separated. The order does not matter.
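As a rough illustration of the encoding, the sketch below generates a random 128-bit ID and renders it as 22 characters. It assumes unpadded, URL-safe base64, which matches the shape of the sample IDs in this document but is not stated by it. The function names are hypothetical.

```python
import base64
import uuid


def new_directory_id() -> str:
    """Return a random 128-bit ID as 22 base64 characters.

    Assumption: IDs use URL-safe base64 without '=' padding.
    """
    raw = uuid.uuid4().bytes  # 16 random bytes
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_directory_id(encoded: str) -> uuid.UUID:
    """Decode a 22-character ID back into a UUID by restoring the padding."""
    return uuid.UUID(bytes=base64.urlsafe_b64decode(encoded + "=="))
```

Sixteen bytes always encode to 24 base64 characters with two padding characters, so stripping the padding yields a fixed 22-character value.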
The meta.properties version field will be bumped from 1 to 2.
Having a persisted UUID at the root of each log directory allows the broker to identify the log directory regardless of the mount path.
Having a persisted list of all UUIDs for all configured log directories allows the broker to determine the UUIDs of unavailable (offline) log directories, as the meta.properties files for the offline log directories are likely to be unavailable.
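The second point can be sketched as a set difference: the IDs listed in directory.ids, minus the directory.id values read from the directories that are still readable. This is a minimal sketch and not the broker's implementation. The helper names are hypothetical, and the file contents are passed in as strings to keep the example self-contained.

```python
from typing import Dict, List, Set


def parse_properties(text: str) -> Dict[str, str]:
    """Parse the key=value lines of a meta.properties file, skipping comments."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def offline_directory_ids(readable_meta_files: List[str]) -> Set[str]:
    """Return IDs listed in directory.ids that no readable directory claims."""
    expected: Set[str] = set()
    found: Set[str] = set()
    for text in readable_meta_files:
        props = parse_properties(text)
        found.add(props["directory.id"])
        expected.update(props["directory.ids"].split(","))
    return expected - found
```

Because every readable meta.properties file carries the full list, a single surviving directory is enough to name all the offline ones.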
Example
Given the following server.properties:
(... other non interesting properties omitted ...)
process.roles=broker
node.id=8
metadata.log.dir=/var/lib/kafka/metadata
log.dirs=/mnt/d1,/mnt/d2
The command ./bin/kafka-storage.sh format -c /tmp/server.properties --cluster-id 41QSStLtR3qOekbX4ZlbHA would generate a meta.properties file that could look like this:
#
#Thu Aug 18 15:23:07 BST 2022
node.id=8
version=2
cluster.id=41QSStLtR3qOekbX4ZlbHA
directory.id=e6umYSUsQyq7jUUzL9iXMQ
directory.ids=e6umYSUsQyq7jUUzL9iXMQ,b4d9ExdORgaQq38CyHwWTA,P2aL9r4sSqqyt7bC0uierg
Each directory, including metadata.log.dir, the directory that holds the cluster metadata topic, has its own distinct value as the directory ID.
Brokers
Broker lifecycle management
...
- If the partition is associated with a log directory UUID of value Uuid.ZERO:
  - If the broker has only one log directory configured, it will place the replica there.
  - If the broker has more than one log directory configured, it errors and fails to place the replica.
- If the partition doesn’t yet exist, it is created in the designated log directory.
- If any partitions already exist, but the hosting log directories do not match the cluster metadata:
  - If there is a future replica in the log directory indicated by the metadata, the broker will replace the current replica with the future replica.
  - Otherwise, the broker uses a new RPC, ASSIGN_REPLICAS_TO_DIRECTORIES, to ask the controller to change the metadata association to the actual log directory. The broker will not create the log for the partition until the log directory indicated in the cluster metadata matches the actual log directory.
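The rules above can be read as a small decision function. The sketch below is one possible reading and not the broker's actual code. The `Action` enum, the `reconcile` function, the order in which the cases are checked, and the encoding of the `Uuid.ZERO` stand-in are all assumptions made for illustration.

```python
from enum import Enum
from typing import List, Optional, Tuple

# Stand-in for Uuid.ZERO: 16 zero bytes in unpadded base64 (assumed encoding).
UUID_ZERO = "A" * 22


class Action(Enum):
    CREATE = "create the log in the returned directory"
    PROMOTE_FUTURE = "replace the current replica with the future replica"
    ASSIGN_TO_ACTUAL = "send ASSIGN_REPLICAS_TO_DIRECTORIES for the actual directory"
    NOTHING = "metadata and disk already agree"
    ERROR = "cannot place the replica"


def reconcile(
    assigned_dir: str,
    configured_dirs: List[str],
    current_dir: Optional[str] = None,
    future_dir: Optional[str] = None,
) -> Tuple[Action, Optional[str]]:
    """Decide what to do for one partition, given its metadata assignment."""
    if assigned_dir == UUID_ZERO:
        # Unassigned: only unambiguous when a single directory is configured.
        if len(configured_dirs) == 1:
            return Action.CREATE, configured_dirs[0]
        return Action.ERROR, None
    if current_dir is None:
        # The partition does not exist yet on this broker.
        return Action.CREATE, assigned_dir
    if current_dir != assigned_dir:
        if future_dir == assigned_dir:
            return Action.PROMOTE_FUTURE, assigned_dir
        # Ask the controller to record where the replica actually lives.
        return Action.ASSIGN_TO_ACTUAL, current_dir
    return Action.NOTHING, current_dir
```

For example, `reconcile("d2", ["d1", "d2"], current_dir="d1")` returns `(Action.ASSIGN_TO_ACTUAL, "d1")`, since the replica lives in a directory other than the one the metadata indicates and no future replica exists.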
...