...

No actual public interface change. The search by timestamp function will still be provided by OffsetRequest.

Proposed Changes

...

The time index file needs to be built just like the log index file based on each log segment file.

Use a time index for each log segment to save the timestamp -> log offset at a configurable granularity

Create another index file for each log segment, named SegmentBaseOffset.time.index, with index entries at a configurable granularity. The time index entry format is:

 

Time Index Entry => Timestamp Offset
  Timestamp => int64
  Offset => int32
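
As an illustration of the entry layout above, the following minimal sketch packs one entry into 12 bytes (an 8-byte timestamp followed by a 4-byte offset). The class name TimeIndexEntry and its methods are hypothetical and not part of Kafka. The big-endian byte order is simply the ByteBuffer default and is an assumption here, not something the proposal specifies.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not an actual Kafka class.
final class TimeIndexEntry {
    // Timestamp (int64) + Offset (int32) = 12 bytes per entry.
    static final int SIZE_BYTES = Long.BYTES + Integer.BYTES;

    final long timestamp; // Timestamp => int64
    final int offset;     // Offset => int32

    TimeIndexEntry(long timestamp, int offset) {
        this.timestamp = timestamp;
        this.offset = offset;
    }

    // Serialize the entry; byte order is the ByteBuffer default (assumed).
    ByteBuffer encode() {
        ByteBuffer buf = ByteBuffer.allocate(SIZE_BYTES);
        buf.putLong(timestamp);
        buf.putInt(offset);
        buf.flip(); // make the buffer readable from position 0
        return buf;
    }

    // Read one entry from the buffer's current position.
    static TimeIndexEntry decode(ByteBuffer buf) {
        long timestamp = buf.getLong();
        int offset = buf.getInt();
        return new TimeIndexEntry(timestamp, offset);
    }
}
```

With a fixed 12-byte entry, the n-th entry of an index file starts at byte position n * 12, which is what makes a binary search over the file practical.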

The time index granularity does not change the actual timestamp search granularity; it only affects the time needed for a search. The search works the same way as an offset search: find the closest indexed timestamp and its corresponding offset, then scan the log linearly from there until the target message is found. Although the granularity is configurable, minute-level granularity is recommended, because timestamp-based search is usually rare and is probably not worth a significant amount of memory.
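
The two-step lookup described above can be sketched as follows. This is a simplified in-memory model, not the broker implementation: the class TimeIndexSearch, its method names, and the use of arrays in place of index and log files are all hypothetical. It assumes message timestamps are non-decreasing within the segment, that index entries are sorted by timestamp, and that each entry points at the first message whose timestamp is at or after the entry's timestamp.

```java
// Hypothetical in-memory model for illustration only; not an actual Kafka class.
final class TimeIndexSearch {

    // Binary search: position of the last index entry with timestamp <= target,
    // or -1 if every indexed timestamp is greater than the target.
    static int floorEntry(long[] indexTimestamps, long target) {
        int lo = 0;
        int hi = indexTimestamps.length - 1;
        int result = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexTimestamps[mid] <= target) {
                result = mid;  // candidate; look for a later one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return result;
    }

    // Returns the relative offset of the first message with timestamp >= target,
    // or -1 if the segment holds no such message.
    static int search(long[] indexTimestamps, int[] indexOffsets,
                      long[] messageTimestamps, long target) {
        int entry = floorEntry(indexTimestamps, target);
        // No usable index entry: fall back to scanning from the segment start.
        int start = entry < 0 ? 0 : indexOffsets[entry];
        // Linear scan from the indexed position to the target message.
        for (int i = start; i < messageTimestamps.length; i++) {
            if (messageTimestamps[i] >= target) {
                return i;
            }
        }
        return -1;
    }
}
```

A coarser index only lengthens the linear scan in the second step. The result is the same at any granularity, which is why the granularity affects search time but not search precision.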

The following table summarizes memory consumption at different granularities. The numbers are calculated for a broker with 3500 partitions.

...

On broker startup, the broker needs to find the latest timestamp of the current active log segment, because that timestamp may be needed for the next log index append. To do so, the broker scans backward from the current active log segment through earlier log segments until it finds the latest message timestamp.
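
The backward scan at startup might look roughly like the sketch below. It models each log segment as an array of message timestamps, ordered from oldest segment to newest, with the last element standing in for the active segment. The class LatestTimestampRecovery and this in-memory representation are hypothetical simplifications, and the sketch assumes that the first non-empty segment found while walking backward holds the latest timestamp.

```java
import java.util.List;
import java.util.OptionalLong;

// Hypothetical in-memory model for illustration only; not an actual Kafka class.
final class LatestTimestampRecovery {

    // segments: message timestamps per segment, oldest segment first,
    // active segment last. Returns empty if no segment holds any message.
    static OptionalLong latestTimestamp(List<long[]> segments) {
        // Walk backward from the active segment toward older segments.
        for (int i = segments.size() - 1; i >= 0; i--) {
            long[] timestamps = segments.get(i);
            if (timestamps.length == 0) {
                continue; // empty segment: keep looking in the previous one
            }
            long max = Long.MIN_VALUE;
            for (long t : timestamps) {
                max = Math.max(max, t);
            }
            return OptionalLong.of(max);
        }
        return OptionalLong.empty();
    }
}
```

In the common case the active segment already contains messages, so the scan stops at the first segment it inspects.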

Enforce time based log retention

...

The time index file needs to be built just like the log index file based on each log segment file.

Use a time index for each log segment to save the timestamp -> log offset at minute granularity

Create another index file for each log segment with name SegmentBaseOffset.time.index to have index at minute level. The time index entry format is:

...