...

Compacted topics will not have remote storage support. 

Configs

System-Wide

remote.log.storage.enable

Whether to enable remote log storage or not. Valid values are `true` or `false`.

remote.log.storage.manager.class.name = org.apache.kafka.rsm.hdfs.HDFSRemoteStorageManager

If the above property is not configured, the broker behaves as before, which preserves backward compatibility.

RemoteStorageManager

(These configs are dependent on remote storage manager implementation)

remote.log.storage.*

Thread pools

remote.log.manager.thread.pool.size
Size of the remote log manager thread pool, which is used to schedule tasks that copy segments, fetch remote log indexes, and clean up remote log segments.

remote.log.manager.task.interval.ms
Interval at which the remote log manager runs its scheduled tasks, such as copying segments, fetching remote log indexes, and cleaning up remote log segments.

remote.log.reader.threads
Size of the remote log reader thread pool.

remote.log.reader.max.pending.tasks
Maximum size of the remote log reader thread pool's task queue. If the task queue is full, the broker will stop reading remote log segments.
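
The relationship between remote.log.reader.threads and remote.log.reader.max.pending.tasks can be illustrated with a bounded thread pool. The following is a minimal sketch under stated assumptions, not the broker's actual implementation: the class name RemoteLogReaderPoolSketch and the method trySubmit are hypothetical, and the sketch assumes that a full queue causes new read tasks to be rejected.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a fixed-size reader pool with a bounded task queue.
class RemoteLogReaderPoolSketch {
    private final ThreadPoolExecutor executor;

    // threads         corresponds to remote.log.reader.threads
    // maxPendingTasks corresponds to remote.log.reader.max.pending.tasks
    RemoteLogReaderPoolSketch(int threads, int maxPendingTasks) {
        this.executor = new ThreadPoolExecutor(
                threads, threads,                   // fixed number of reader threads
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(maxPendingTasks),
                new ThreadPoolExecutor.AbortPolicy()); // reject when the queue is full
    }

    // Returns true if the read task was accepted, false if the queue was full.
    boolean trySubmit(Runnable readTask) {
        try {
            executor.execute(readTask);
            return true;
        } catch (RejectedExecutionException e) {
            return false; // assumption: the caller stops reading remote segments for now
        }
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

With one thread and a queue of size one, a first task runs, a second waits in the queue, and a third is rejected until capacity frees up.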

Per Topic Configuration

remote.log.retention.minutes

remote.log.retention.bytes

...

  • RLM Leader Task - It checks for rolled-over LogSegments (those whose last message offset is less than the last stable offset of that topic partition) and copies them, along with their remote offset/time indexes, to the remote tier. RLM creates an index file, called RemoteLogSegmentIndex, per topic-partition to track remote LogSegments. These indexes are described in detail here. It also serves fetch requests for older data from the remote tier. Local logs are not cleaned up until those segments are copied successfully to remote storage, even if their retention time/size has been reached.
  • RLM Follower Task - It keeps track of the segments and index files on the remote tier and updates its RemoteLogSegmentIndex file per topic-partition. Local logs are not cleaned up until their remote log indexes are copied locally from remote storage, even if their retention time/size has been reached. The RLM follower can also serve reads of old data from the remote tier.
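
The eligibility check performed by the leader task can be sketched as follows. This is an illustration only: the Segment class, its fields, and the method eligibleForCopy are hypothetical stand-ins rather than Kafka types, and the sketch covers only the selection rule stated above (rolled over, and last message offset less than the last stable offset).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of how a leader task might pick segments to copy.
class SegmentCopySelectionSketch {

    // Stand-in for a local log segment; not a Kafka class.
    static final class Segment {
        final long baseOffset;
        final long lastOffset;
        final boolean rolledOver; // false for the active segment

        Segment(long baseOffset, long lastOffset, boolean rolledOver) {
            this.baseOffset = baseOffset;
            this.lastOffset = lastOffset;
            this.rolledOver = rolledOver;
        }
    }

    // Returns the rolled-over segments whose last offset is below the last stable offset.
    static List<Segment> eligibleForCopy(List<Segment> segments, long lastStableOffset) {
        List<Segment> eligible = new ArrayList<>();
        for (Segment s : segments) {
            if (s.rolledOver && s.lastOffset < lastStableOffset) {
                eligible.add(s);
            }
        }
        return eligible;
    }
}
```

For example, with segments covering offsets 0-99 and 100-199 (both rolled over), an active segment starting at 200, and a last stable offset of 150, only the first segment qualifies.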

...

This is explained in detail here.

Replica Manager

If RLM is configured, ReplicaManager will call RLM to assign or remove topic-partitions, similar to how the replicaFetcherManager works today.

...

For any fetch request, ReplicaManager first calls readFromLocalLog. If this method returns an OffsetOutOfRange exception, ReplicaManager delegates the read to RemoteLogManager.readFromRemoteLog and returns the resulting LogReadResult. More details are explained in the RLM/RSM tasks section.
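
The local-then-remote read path described above can be sketched as a simple fallback. This is a minimal sketch under stated assumptions: the LogReader interface, the String result type, and the locally defined OffsetOutOfRangeException are hypothetical stand-ins used to keep the example self-contained, and they do not reproduce the real signatures of readFromLocalLog or readFromRemoteLog.

```java
// Hypothetical sketch of the fetch fallback from local log to remote log.
class FetchFallbackSketch {

    // Local stand-in for the out-of-range error; not the Kafka exception class.
    static class OffsetOutOfRangeException extends RuntimeException {
        OffsetOutOfRangeException(String message) {
            super(message);
        }
    }

    // Stand-in for a log read; the real methods return a LogReadResult.
    interface LogReader {
        String read(long offset);
    }

    // Try the local log first; delegate to the remote log only when the
    // requested offset is no longer available locally.
    static String fetch(long offset, LogReader localLog, LogReader remoteLog) {
        try {
            return localLog.read(offset);
        } catch (OffsetOutOfRangeException e) {
            return remoteLog.read(offset);
        }
    }
}
```

Any other exception from the local read propagates unchanged in this sketch, so only the out-of-range case reaches the remote tier.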

Follower Requests/Replication

...