...
For follower replicas, it maintains a metadata cache by subscribing to the respective remote log metadata topic partitions. Whenever a topic partition is reassigned to a new broker and the RLMM on that broker is not yet subscribed to the respective remote log metadata topic partition, it subscribes to that partition and adds all of its entries to the cache. So, in the worst case, the RLMM on a broker may be consuming from most of the remote log metadata topic partitions. This requires the cache to be backed by disk storage such as RocksDB to avoid a high memory footprint on the broker. A disk-backed cache also allows us to commit the offsets of the partitions that have already been read. Committed offsets can be stored in a local file to avoid reading the messages again when a broker is restarted.
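The local file of committed offsets mentioned above could look roughly like the following. This is only a minimal sketch under stated assumptions: the class name `CommittedOffsetsFile`, the one-line-per-partition text format, and the write-then-rename step are all hypothetical and are not part of this proposal or of Kafka.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not a Kafka class): stores the last committed offset of
// each remote log metadata topic partition in a local text file.
final class CommittedOffsetsFile {
    private final Path file;

    CommittedOffsetsFile(Path file) {
        this.file = file;
    }

    // Writes one "<metadataPartition> <offset>" line per partition to a
    // temporary sibling file, then moves it over the target file so that
    // readers do not observe a half-written file in the common case.
    void write(Map<Integer, Long> offsets) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Integer, Long> e : new TreeMap<>(offsets).entrySet()) {
            lines.add(e.getKey() + " " + e.getValue());
        }
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        try {
            Files.write(tmp, lines, StandardCharsets.UTF_8);
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Returns the stored offsets, or an empty map if no file exists yet
    // (for example on the first start of a broker).
    Map<Integer, Long> read() {
        Map<Integer, Long> offsets = new TreeMap<>();
        if (!Files.exists(file)) {
            return offsets;
        }
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue;
                }
                String[] parts = trimmed.split("\\s+");
                offsets.put(Integer.parseInt(parts[0]), Long.parseLong(parts[1]));
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return offsets;
    }
}
```

On restart, a broker could call `read()` and resume consuming each metadata partition from the stored offset instead of from the beginning.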
Configs
remote.log.metadata.topic.replication.factor | Replication factor of the topic. Default: 3 |
remote.log.metadata.topic.partitions | Number of partitions of the topic. Default: 50 |
remote.log.metadata.topic.retention.ms | Retention of the topic in milliseconds. Default: 365 * 24 * 60 * 60 * 1000 (1 year) |
remote.log.metadata.manager.listener.name | Name of the listener that the RemoteLogMetadataManager implementation uses to connect to the local broker. It is used by the Kafka clients created in the RemoteLogMetadataManager implementation. |
remote.log.metadata.* | Any other properties should be prefixed with "remote.log.metadata."; these are passed to the RemoteLogMetadataManager implementation. For example: security configuration to connect to the local broker for the configured listener name. |
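The prefix rule in the last row can be illustrated with a small sketch. The class `RlmmConfigFilter` is hypothetical, and the sketch assumes that keys are passed through unchanged (the text does not say whether the prefix is stripped).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not a Kafka class): selects the broker properties that
// would be handed to the RemoteLogMetadataManager implementation.
final class RlmmConfigFilter {
    static final String PREFIX = "remote.log.metadata.";

    // Returns every property whose key starts with the prefix. Keys are kept
    // as-is; stripping the prefix would be an equally plausible choice.
    static Map<String, Object> rlmmProps(Map<String, ?> brokerProps) {
        Map<String, Object> out = new TreeMap<>();
        for (Map.Entry<String, ?> e : brokerProps.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }
}
```

Properties outside the prefix, such as `log.retention.ms`, are not forwarded.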
[We will add more details later about how the resulting state for each topic partition is computed.]
...
Compacted topics will not have remote storage support.
Configs
System-Wide | remote.log.storage.enable - Whether to enable remote log storage. Valid values are `true` and `false`; the default value is `false`. This property provides backward compatibility. remote.log.storage.manager.class.name - Mandatory if remote.log.storage.enable is set to true. remote.log.metadata.manager.class.name (optional) - If this is not configured, Kafka uses a built-in metadata manager backed by an internal topic. |
RemoteStorageManager | (These configs are dependent on remote storage manager implementation) remote.log.storage.* |
RemoteLogMetadataManager | (These configs are dependent on remote log metadata manager implementation) remote.log.metadata.* |
Thread pools | remote.log.manager.thread.pool.size, remote.log.manager.task.interval.ms, remote.log.reader.threads, remote.log.reader.max.pending.tasks |
Per Topic Configuration | The retention configs below are similar to the local log retention configs. They determine how long remote log segments are retained in remote storage. remote.log.retention.ms, remote.log.retention.minutes, remote.log.retention.hours, remote.log.retention.bytes |
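The system-wide rules in the table above (remote storage is off by default, the storage manager class is mandatory once it is enabled, and the metadata manager class is optional) can be sketched as a startup check. The class `RemoteStorageConfigCheck` and the placeholder returned for the built-in metadata manager are hypothetical; the text does not name the built-in implementation.

```java
import java.util.Map;

// Hypothetical startup check (not a Kafka class) for the system-wide configs.
final class RemoteStorageConfigCheck {
    static final String ENABLE = "remote.log.storage.enable";
    static final String RSM_CLASS = "remote.log.storage.manager.class.name";
    static final String RLMM_CLASS = "remote.log.metadata.manager.class.name";
    // Placeholder only: the real built-in class name is not given in the text.
    static final String BUILT_IN_RLMM = "<built-in topic-based metadata manager>";

    // Returns whether remote storage is enabled. Throws if it is enabled but
    // the mandatory storage manager class is missing. Any value other than
    // "true" (case-insensitive) is treated as false in this sketch.
    static boolean validate(Map<String, String> props) {
        boolean enabled = Boolean.parseBoolean(props.getOrDefault(ENABLE, "false"));
        if (enabled) {
            String rsm = props.get(RSM_CLASS);
            if (rsm == null || rsm.trim().isEmpty()) {
                throw new IllegalArgumentException(
                    RSM_CLASS + " is mandatory when " + ENABLE + " is true");
            }
        }
        return enabled;
    }

    // Falls back to the built-in manager when no class is configured.
    static String metadataManager(Map<String, String> props) {
        return props.getOrDefault(RLMM_CLASS, BUILT_IN_RLMM);
    }
}
```

With an empty configuration the check reports remote storage as disabled, which matches the backward-compatible default.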
...