...
Proposed Changes
High-level design
RemoteLogManager (RLM) is a new component that keeps track of remote log segments
...
Code Block | ||
---|---|---|
| ||
def readFromLocaLog(): Seq[(TopicPartition, LogReadResult)] = { catch { case e@ (_: OffsetOutOfRangeException) => RemoteLogManager.read(fetchMaxBytes: Int, hardMaxBytesLimit: Boolean, tp: TopicPartition, fetchInfo: PartitionData quota: ReplicaQuota) } |
Log Retention
Log retention will continue to work like earlier, local log segments are deleted only when the respective log segments are copied over to remote storage successfully even though their local retention time/size is reached.
Consumer Fetch Requests
For any fetch requests, ReplicaManager will proceed with making a call to readFromLocalLog, if this method returns OffsetOutOfRange exception it will delegate the read call to RemoteLogManager.readFromRemoteLog and returns the LogReadResult. If the fetch request is from a consumer, RLM will read the data from remote storage, and return the result to the consumer.
...
To achieve this, we have to handle remote storage operations in dedicated threads pools, instead of Kafka I/O threads and fetcher threads.
1. Remote
...
Log Manager (RLM) Thread Pool
RLM maintains a list of the topic-partitions it manages. The list is updated in Kafka I/O threads, when topic-partitions are added to / removed from RLM. Each topic-partition in the list is assigned a scheduled processing time. The RLM thread pool processes the topic-partitions that the "scheduled processing time" is less than or equal to the current time.
...