...

* An interface will be introduced so that different RLM implementations, each integrating with a different type of remote storage, can be made available. RLM implementation code can also be kept outside Kafka core if the community so chooses.
* RLM has two modes:
    * RLM Leader - In this mode, the RLM that is the leader for a topic-partition checks for rolled-over LogSegments and copies them, along with their OffsetIndex files, to the remote tier. RLM creates an index file, called RemoteLogSegmentIndex, per topic-partition to track remote LogSegments. The RLM leader also serves read requests for older data from the remote tier.
    * RLM Follower - In this mode, RLM keeps track of the segments and index files on the remote tier and updates its RemoteLogSegmentIndex file per topic-partition. The RLM follower does not serve reads of old data from the remote tier.
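
The pluggable interface and the two modes described above can be sketched roughly as follows. This is only an illustration in Python, not the proposed Kafka API: the names `RemoteStorage`, `InMemoryRemoteStorage`, `RemoteLogManager`, `RlmMode` and all method names are hypothetical, and the remote tier is replaced by an in-memory dictionary.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Dict, List, Tuple


class RlmMode(Enum):
    LEADER = "leader"
    FOLLOWER = "follower"


class RemoteStorage(ABC):
    """Hypothetical plug-in point: one subclass per type of remote storage."""

    @abstractmethod
    def copy_segment(self, topic_partition: str, start_offset: int,
                     log_bytes: bytes, offset_index_bytes: bytes) -> None:
        ...

    @abstractmethod
    def list_segment_start_offsets(self, topic_partition: str) -> List[int]:
        ...


class InMemoryRemoteStorage(RemoteStorage):
    """Stand-in for a real remote tier, used only for this sketch."""

    def __init__(self) -> None:
        self._segments: Dict[Tuple[str, int], Tuple[bytes, bytes]] = {}

    def copy_segment(self, topic_partition, start_offset, log_bytes, offset_index_bytes):
        self._segments[(topic_partition, start_offset)] = (log_bytes, offset_index_bytes)

    def list_segment_start_offsets(self, topic_partition):
        return sorted(o for (tp, o) in self._segments if tp == topic_partition)


class RemoteLogManager:
    def __init__(self, storage: RemoteStorage, mode: RlmMode) -> None:
        self.storage = storage
        self.mode = mode
        # topic-partition -> sorted start offsets (stands in for RemoteLogSegmentIndex)
        self.remote_index: Dict[str, List[int]] = {}

    def on_segment_rolled(self, topic_partition, start_offset, log_bytes, index_bytes) -> bool:
        """Leader only: copy a rolled-over segment and its offset index to the remote tier."""
        if self.mode is not RlmMode.LEADER:
            return False
        self.storage.copy_segment(topic_partition, start_offset, log_bytes, index_bytes)
        offsets = self.remote_index.setdefault(topic_partition, [])
        offsets.append(start_offset)
        offsets.sort()
        return True

    def sync_index(self, topic_partition) -> None:
        """Follower: refresh the local index from what exists on the remote tier."""
        self.remote_index[topic_partition] = self.storage.list_segment_start_offsets(topic_partition)

    def can_serve_remote_reads(self) -> bool:
        return self.mode is RlmMode.LEADER
```

Keeping the storage-specific calls behind one abstract class is what would allow an implementation to live outside Kafka core; the mode only decides which of the manager's operations are active.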


Core Kafka changes

To satisfy the goal of keeping Kafka changes minimal when RLM is not configured, Kafka behavior remains unchanged for existing users.

...

* Core Kafka starts the RLM service if tiered storage is configured.
* When an offset index is not found and RLM is configured, the read request is delegated to RLM to serve the data from the remote tier.
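
A minimal sketch of this delegation, assuming a simplified read path in which the local log is a dictionary from offset to record. The names `OffsetNotFoundLocally`, `read_local`, `fetch` and `remote_read` are hypothetical and do not correspond to Kafka classes.

```python
from typing import Callable, Dict, Optional


class OffsetNotFoundLocally(Exception):
    """Hypothetical error raised when the local log cannot serve an offset."""


def read_local(local_log: Dict[int, bytes], offset: int) -> bytes:
    if offset not in local_log:
        raise OffsetNotFoundLocally(offset)
    return local_log[offset]


def fetch(local_log: Dict[int, bytes], offset: int,
          remote_read: Optional[Callable[[int], bytes]] = None) -> bytes:
    """Serve from the local log first; delegate to RLM only if one is configured."""
    try:
        return read_local(local_log, offset)
    except OffsetNotFoundLocally:
        if remote_read is None:
            # No tiered storage configured: the error surfaces exactly as before.
            raise
        return remote_read(offset)
```

With `remote_read=None` the function behaves like the local-only path, which mirrors the goal that behavior stays unchanged for users who do not configure RLM.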


Serving Data from Remote Storage


Approach 1

RLM will ship all the LogSegments and the corresponding OffsetIndex files to remote storage. A new index file, `RemoteLogSegmentIndex`, is maintained locally on the Kafka broker per topic-partition, in the same way the existing index files are stored today, as shown below:

...

Code Block
FileName : 00000001000121.remoteindex 

Contents:
SegmentStartOffset
1000121 
1500024 
2000011 
…
2999999

FileName : 00000003000000.remoteindex 
Contents:
SegmentStartOffset
3000000 
3500024 
4000011 
…




RLM maintains these RemoteLogSegmentIndexes per topic-partition in local files on the Kafka broker. These files are rolled on a periodic basis with starting index of first LogSegment in the file name. Note that the RemoteLogSegmentIndex can be constructed by listing all the log segments stored on the remote storage. Maintaining a local file is an optimization to avoid such listing operations that may be slow and expensive depending on the external store. RemoteLogSegmentIndex files are MMAP'ed files and will follow a similar binary search mechanism as OffsetIndex files to find a LogSegment to serve a read operation.
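
To illustrate the memory-mapped lookup, here is a minimal Python sketch of a binary search over such a file. The on-disk layout is an assumption made for the example (one 8-byte big-endian start offset per entry, sorted ascending); the text above does not specify the actual format, and the function names are hypothetical.

```python
import mmap
import os
import struct
import tempfile

# Assumed entry layout: a single 8-byte big-endian segment start offset.
ENTRY = struct.Struct(">q")


def write_remote_index(path, start_offsets):
    """Write the start offsets in ascending order, one fixed-width entry each."""
    with open(path, "wb") as f:
        for offset in sorted(start_offsets):
            f.write(ENTRY.pack(offset))


def find_segment_start(path, target):
    """Return the largest stored start offset <= target, or None if there is none."""
    if os.path.getsize(path) == 0:
        return None  # an empty file cannot be memory-mapped
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        lo, hi = 0, len(mm) // ENTRY.size - 1
        found = None
        while lo <= hi:
            mid = (lo + hi) // 2
            (value,) = ENTRY.unpack_from(mm, mid * ENTRY.size)
            if value <= target:
                found = value   # candidate; keep looking to the right
                lo = mid + 1
            else:
                hi = mid - 1
        return found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "00000001000121.remoteindex")
        write_remote_index(path, [1000121, 1500024, 2000011])
        print(find_segment_start(path, 1700000))
```

Fixed-width entries are what make the binary search possible without parsing the file: entry `i` always starts at byte `i * ENTRY.size`.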

On `OffsetOutOfRangeException`, ReplicaManager delegates the read request to RLM, which does the following:

...

1. RLM performs a binary search on the memory-mapped `RemoteLogSegmentIndex` file to find the starting offset of the LogSegment that contains the requested offset.
1. RLM uses the starting offset to build the file names (or object names) of the LogSegment and OffsetIndex.
1. RLM fetches the segment and offset index files on demand, seeks into the LogSegment to the requested offset, and serves the data to the client.
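
The three steps above can be sketched end to end as follows. Everything specific here is an assumption for illustration: the remote store is a plain dictionary, the record header layout (8-byte offset, 4-byte payload length) is invented, the `.log` and `.index` suffixes and the 14-digit padding are only modeled on the file-name example earlier in this section, and an in-memory sorted list stands in for the memory-mapped index file.

```python
import bisect
import struct
from typing import Dict, List, Optional, Tuple

HEADER = struct.Struct(">qi")  # assumed record header: offset, payload length
NAME_WIDTH = 14                # zero padding, as in 00000001000121.remoteindex


def object_names(start_offset: int) -> Tuple[str, str]:
    """Step 2: derive the segment and offset index object names from a start offset."""
    base = str(start_offset).zfill(NAME_WIDTH)
    return base + ".log", base + ".index"


def build_segment(records, every: int = 2):
    """Serialize (offset, payload) records; index every `every`-th record sparsely."""
    data = bytearray()
    index: List[Tuple[int, int]] = []
    for i, (offset, payload) in enumerate(records):
        if i % every == 0:
            index.append((offset, len(data)))
        data += HEADER.pack(offset, len(payload)) + payload
    return bytes(data), index


def read_remote(remote: Dict[str, object], segment_starts: List[int],
                target: int) -> Optional[bytes]:
    # Step 1: binary search for the largest segment start offset <= target.
    i = bisect.bisect_right(segment_starts, target) - 1
    if i < 0:
        return None
    # Step 2: build the object names.
    log_name, index_name = object_names(segment_starts[i])
    # Step 3: fetch both objects, then seek into the segment via the offset index.
    segment = remote[log_name]
    offset_index = remote[index_name]
    keys = [offset for offset, _ in offset_index]
    j = bisect.bisect_right(keys, target) - 1
    pos = offset_index[j][1] if j >= 0 else 0
    while pos < len(segment):
        offset, length = HEADER.unpack_from(segment, pos)
        pos += HEADER.size
        if offset == target:
            return segment[pos:pos + length]
        if offset > target:
            return None  # offsets are ascending, so the target is not in this segment
        pos += length
    return None
```

Because the offset index is sparse, the lookup lands on the nearest indexed record at or before the target and scans forward from there, which is the same general idea the local OffsetIndex uses.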


Approach 2

As in Approach 1, RLM stores the LogSegment and OffsetIndex files in remote storage. In addition, it keeps a copy of the OffsetIndex files in local storage to avoid reading the offsets from remote storage. With the offset index available locally, RLM can determine where in the segment a requested offset lies before contacting remote storage, and can seek to that position instead of fetching the entire LogSegment file from the remote tier.
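
A rough sketch of why the local copy helps, assuming the remote store supports ranged reads and that the segment size is known locally; neither assumption is stated in the text. `CountingRemote`, `get_range` and `byte_range_for` are hypothetical names used only to show how many bytes cross the network.

```python
import bisect
from typing import Dict, List, Tuple


class CountingRemote:
    """Stand-in remote store that records how many bytes were transferred."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self._objects = objects
        self.bytes_fetched = 0

    def get_range(self, name: str, start: int, end: int) -> bytes:
        chunk = self._objects[name][start:end]
        self.bytes_fetched += len(chunk)
        return chunk


def byte_range_for(local_index: List[Tuple[int, int]], segment_size: int,
                   target: int) -> Tuple[int, int]:
    """Use the local offset index copy to bound the bytes that contain `target`.

    local_index holds sorted (offset, byte_position) entries.
    """
    keys = [offset for offset, _ in local_index]
    j = bisect.bisect_right(keys, target) - 1
    if j < 0:
        raise KeyError(target)
    start = local_index[j][1]
    # The next index entry (or the end of the segment) bounds the range.
    end = local_index[j + 1][1] if j + 1 < len(local_index) else segment_size
    return start, end
```

In this model a read for one offset transfers only the slice between two index entries, whereas without the local index the broker would first have to fetch the index (or the whole segment) from the remote tier.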

...