...

After successfully copying a segment file to remote storage, RLM appends a set of index entries to three local index files: remoteLogIndex, remoteOffsetIndex, and remoteTimeIndex. RLM rotates these index files at a configurable time interval or at a configurable size.

(active segment)

{log.dirs}/{topic-partition}/0000002400013.index

{log.dirs}/{topic-partition}/0000002400013.timeindex

{log.dirs}/{topic-partition}/0000002400013.log


(inactive segments)

{log.dirs}/{topic-partition}/0000002000238.index

{log.dirs}/{topic-partition}/0000002000238.timeindex

{log.dirs}/{topic-partition}/0000002000238.log

{log.dirs}/{topic-partition}/0000001600100.index

{log.dirs}/{topic-partition}/0000001600100.timeindex

{log.dirs}/{topic-partition}/0000001600100.log


(active remote segment)

{log.dirs}/{topic-partition}/0000001000121.remoteOffsetIndex

{log.dirs}/{topic-partition}/0000001000121.remoteTimeIndex

{log.dirs}/{topic-partition}/0000001000121.remoteLogIndex


(inactive remote segments)

{log.dirs}/{topic-partition}/0000000512002.remoteOffsetIndex

{log.dirs}/{topic-partition}/0000000512002.remoteTimeIndex

{log.dirs}/{topic-partition}/0000000512002.remoteLogIndex
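
As an illustration of the layout above, the following sketch picks the remote index file that would cover a given offset. It assumes that each remote index file is named after the first offset it covers, as local segment files are, and that names are zero-padded to the 13 digits shown in the listing. The function names are hypothetical and not part of the proposal.

```python
from bisect import bisect_right

def remote_index_base_for(base_offsets, target_offset):
    """Return the base offset of the remote index file covering target_offset.

    Assumption: a file named after base offset B covers offsets from B up to
    (but excluding) the next file's base offset.
    """
    bases = sorted(base_offsets)
    i = bisect_right(bases, target_offset) - 1
    return bases[i] if i >= 0 else None

def remote_log_index_name(base_offset):
    # 13-digit zero padding mirrors the example listing, not a confirmed rule.
    return f"{base_offset:013d}.remoteLogIndex"

bases = [1000121, 512002]  # base offsets taken from the listing above
base = remote_index_base_for(bases, 700000)
print(remote_log_index_name(base))
```

An offset below the smallest base offset yields `None`, which a caller would treat as "not present in remote storage".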



Each index entry of the remoteLogIndex file describes a sequence of records in the remote log segment file. The format of a remoteLogIndex entry is:

magic: int16 (current magic value is 0)

length: int16 (length of this entry)

crc: int32 (checksum from firstOffset to the end of this entry)

firstOffset: int64 (the Kafka offset of the 1st record)

lastOffset: int64 (the Kafka offset of the last record)

firstTimestamp: int64

lastTimestamp: int64

dataLength: int32 (length of the remote data)

rdiLength: int16

rdi: byte[] (Remote data identifier)
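
The entry format above can be sketched as a small encoder and decoder. This is a minimal sketch under several assumptions that the text does not state: fields are big-endian, `length` counts the bytes that follow the length field (crc through rdi), and the checksum is plain CRC32 as computed by `zlib.crc32`. The function names and the sample identifier are hypothetical.

```python
import struct
import zlib

# Assumed big-endian layout, fields in the order listed above.
_HEADER = struct.Struct(">hhI")   # magic, length, crc (same 4 bytes as an int32)
_FIXED = struct.Struct(">qqqqih")  # firstOffset, lastOffset, firstTimestamp,
                                   # lastTimestamp, dataLength, rdiLength

def encode_entry(first_offset, last_offset, first_ts, last_ts,
                 data_length, rdi, magic=0):
    """Serialize one entry; the crc covers firstOffset to the end of the entry."""
    body = _FIXED.pack(first_offset, last_offset, first_ts, last_ts,
                       data_length, len(rdi)) + rdi
    crc = zlib.crc32(body) & 0xFFFFFFFF
    length = 4 + len(body)  # assumption: crc field plus body
    return _HEADER.pack(magic, length, crc) + body

def decode_entry(buf, pos=0):
    """Parse one entry at pos; return (fields, position of the next entry)."""
    magic, length, crc = _HEADER.unpack_from(buf, pos)
    end = pos + 4 + length  # 4 bytes for magic and length, then `length` bytes
    body = bytes(buf[pos + _HEADER.size:end])
    if zlib.crc32(body) & 0xFFFFFFFF != crc:
        raise ValueError("crc mismatch in remoteLogIndex entry")
    first, last, first_ts, last_ts, data_len, rdi_len = _FIXED.unpack_from(body)
    rdi = body[_FIXED.size:_FIXED.size + rdi_len]
    fields = {"magic": magic, "firstOffset": first, "lastOffset": last,
              "firstTimestamp": first_ts, "lastTimestamp": last_ts,
              "dataLength": data_len, "rdi": rdi}
    return fields, end

raw = encode_entry(100, 199, 1000, 2000, 4096, b"segment-a")
entry, next_pos = decode_entry(raw)
print(entry["firstOffset"], entry["lastOffset"], entry["rdi"])
```

Because each entry carries its own length, a reader can walk a file of variable-sized entries by repeatedly calling `decode_entry` with the returned position.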

...

Depending on the implementation, RLM may append one or more entries to the remoteLogIndex file for each remote segment file. More entries provide finer-grained indexing of the remote data at the cost of local disk space.

...

The remoteOffsetIndex file and the remoteTimestampIndex file are similar to the existing .index file (offset index) and .timeindex file (timestamp index). The only difference is that they point to a position in the corresponding remoteLogIndex file instead of a position in a log segment file.
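
The two-level lookup described above can be sketched as follows. The sketch models the remoteOffsetIndex as an in-memory list of (offset, position) pairs sorted by offset, where each position is a byte position in the remoteLogIndex file. The in-memory representation, the function name, and the sample values are assumptions for illustration, not the on-disk format.

```python
from bisect import bisect_right

def remote_log_index_position(offset_index, target_offset):
    """Find where to start reading the remoteLogIndex for target_offset.

    offset_index: list of (offset, position) pairs sorted by offset.
    Returns the position stored with the largest indexed offset that is
    <= target_offset, or None if target_offset precedes every entry.
    """
    offsets = [offset for offset, _ in offset_index]
    i = bisect_right(offsets, target_offset) - 1
    return offset_index[i][1] if i >= 0 else None

# Hypothetical index contents: three remoteLogIndex entries at byte
# positions 0, 52 and 104.
index = [(1000121, 0), (1000500, 52), (1001000, 104)]
print(remote_log_index_position(index, 1000700))
```

The returned position identifies a remoteLogIndex entry, and the rdi in that entry then locates the data in remote storage. A timestamp lookup would follow the same pattern with timestamps in place of offsets.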

...