...

  1. When a new log segment is created, the broker will create a time index file for the log segment. 
  2. The default initial / max size of the time index files is the same as that of the offset index files. (A time index entry is 1.5x the size of an offset index entry, so users should set the configuration accordingly.)
  3. Each log segment maintains the largest timestamp so far in that segment. The initial value of the largest timestamp is -1 for a newly created segment.
  4. When the broker receives a message, the message is appended to the log unless it is rejected because its timestamp exceeds the threshold. (The timestamp will be either LogAppendTime or CreateTime, depending on the configuration.)
  5. When the broker appends the message to the log segment, if an offset index entry is inserted, it will also insert a time index entry, provided the max timestamp so far is greater than the timestamp in the last time index entry.
    • For message format v0, the timestamp is always -1, so no time index entry will be inserted when a message is appended.
  6. When the current active segment is rolled out or closed, a time index entry will be inserted into the time index to ensure the last time index entry has the largest timestamp of the log segment.
    1. If the largest timestamp in the segment is non-negative (at least one message has a timestamp), the entry will be (largest_timestamp_in_the_segment -> base_offset_of_the_next_segment)
    2. If the largest timestamp in the segment is -1 (no message in the segment has a timestamp), the entry will be (last_modification_time_of_the_segment -> base_offset_of_the_next_segment)
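
The append and roll steps above can be sketched roughly as follows. This is a toy in-memory model, not broker code: the class name SegmentSketch, the index_interval_bytes knob that decides when an offset index entry is written, and the choice to point an append-time entry at the offset of the message just appended are all assumptions made for illustration.

```python
NO_TIMESTAMP = -1  # largest timestamp of a segment that has seen no timestamped message


class SegmentSketch:
    """Toy model of one log segment with an offset index and a time index."""

    def __init__(self, base_offset, index_interval_bytes=4096):
        self.base_offset = base_offset
        self.next_offset = base_offset
        self.index_interval_bytes = index_interval_bytes  # hypothetical knob
        self.bytes_since_last_entry = 0
        self.largest_timestamp = NO_TIMESTAMP
        self.offset_index = []  # offsets that received an offset index entry
        self.time_index = []    # (timestamp, offset) pairs

    def append(self, timestamp, size_bytes):
        """Append one message and return the offset assigned to it."""
        offset = self.next_offset
        self.next_offset += 1
        self.largest_timestamp = max(self.largest_timestamp, timestamp)
        self.bytes_since_last_entry += size_bytes
        if self.bytes_since_last_entry >= self.index_interval_bytes:
            # An offset index entry is inserted here.
            self.bytes_since_last_entry = 0
            self.offset_index.append(offset)
            # A time index entry follows only if the max timestamp advanced.
            last_ts = self.time_index[-1][0] if self.time_index else NO_TIMESTAMP
            if self.largest_timestamp > last_ts:
                # Assumption: the entry points at the message just appended.
                self.time_index.append((self.largest_timestamp, offset))
        return offset

    def roll(self, last_modified_ms):
        """Close the segment and return the base offset of the next segment."""
        if self.largest_timestamp >= 0:
            ts = self.largest_timestamp
        else:
            ts = last_modified_ms  # no message in the segment had a timestamp
        self.time_index.append((ts, self.next_offset))
        return self.next_offset
```

In this sketch a segment that only ever receives v0 messages (timestamp -1) ends up with exactly one time index entry, the one written by roll() from the last modification time.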

The time index is not globally monotonically increasing. Instead, it is only monotonically increasing within each time index file, i.e. the time index file for a later log segment may contain a timestamp smaller than some timestamp in the time index file of an earlier segment.

...

On broker startup, the latest timestamp is needed for the next log index append. The broker will find the largest timestamp by looking at the last time index entry and scanning from there to the log end. If the broker was shut down normally, the last time index entry already has the largest timestamp. The broker will only load the largest timestamp if message.format.version is 0.10.0 or later; otherwise it will skip loading the largest timestamp.
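
A minimal sketch of that startup scan, assuming the time index is a list of (timestamp, offset) pairs and the segment's messages are available as (offset, timestamp) pairs in offset order. The function name and both data shapes are hypothetical.

```python
def recover_largest_timestamp(time_index, messages):
    """Return the largest timestamp of a segment after a restart.

    time_index: list of (timestamp, offset) pairs, in file order.
    messages:   list of (offset, timestamp) pairs, sorted by offset.
    """
    if time_index:
        # Start from the last time index entry and only scan past it.
        largest, scan_from = time_index[-1]
    else:
        # No entries yet: scan the whole segment.
        largest, scan_from = -1, None
    for offset, timestamp in messages:
        if scan_from is None or offset >= scan_from:
            largest = max(largest, timestamp)
    return largest
```

After a normal shutdown the last entry points at the base offset of the next segment, so the loop finds nothing to scan and the entry's timestamp is returned unchanged.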

On recovery after a hard failure, the broker will scan the active log segment to the log end.

If there is no time index file for an inactive segment, the broker will create an empty time index file and append a time index entry (last_modification_time_of_the_segment -> base_offset_of_the_next_segment). For the active segment, the broker will create the time index file but leave it empty.
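
The rule for segments without a time index file could look roughly like this. Segments are modelled as plain dicts with hypothetical keys (base_offset, last_modified_ms, time_index), where a time_index of None stands for a missing file and the last element of the list is the active segment.

```python
def create_missing_time_indexes(segments):
    """Fill in time indexes for segments that have none.

    segments: list of dicts sorted by base_offset; the last one is active.
    """
    for i, segment in enumerate(segments):
        if segment["time_index"] is not None:
            continue  # the file already exists, leave it alone
        is_active = i == len(segments) - 1
        if is_active:
            # Active segment: create the file but leave it empty.
            segment["time_index"] = []
        else:
            # Inactive segment: one entry built from the modification time.
            next_base_offset = segments[i + 1]["base_offset"]
            segment["time_index"] = [(segment["last_modified_ms"], next_base_offset)]
    return segments
```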

Log Truncation

When the log is truncated, because the offset in the time index is also monotonically increasing, we will also truncate the time index entries whose corresponding messages have been truncated. The active segment will reload the largest timestamp after truncation just like it did during startup.
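
Because the offsets in a time index file are increasing, the entries to drop can be located with a binary search. The sketch below assumes that truncating to an offset removes every message at or after that offset; the function name and the (timestamp, offset) list layout are illustrative only.

```python
import bisect


def truncate_time_index(time_index, truncate_to_offset):
    """Drop time index entries whose offset is at or after truncate_to_offset.

    time_index: list of (timestamp, offset) pairs with increasing offsets.
    """
    offsets = [offset for _, offset in time_index]
    # Index of the first entry whose offset is >= truncate_to_offset.
    cut = bisect.bisect_left(offsets, truncate_to_offset)
    return time_index[:cut]
```

The largest timestamp of the active segment would then be reloaded from the surviving entries and messages, in the same way as on startup.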

Enforce time based log retention

To enforce time based log retention, the broker will check the last time index entry of a log segment. The timestamp will be the latest timestamp of the messages in the log segment. So if that timestamp expires, the broker will delete the log segment. 
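
As a rough illustration of the retention check, the sketch below treats each segment as a (base_offset, time_index) tuple and uses a hypothetical retention_ms parameter. Skipping segments with an empty time index is an assumption of this sketch, not something the text specifies.

```python
def expired_segments(segments, now_ms, retention_ms):
    """Return base offsets of segments whose latest timestamp has expired.

    segments: list of (base_offset, time_index) tuples, where time_index
              is a list of (timestamp, offset) pairs.
    """
    expired = []
    for base_offset, time_index in segments:
        if not time_index:
            continue  # assumption: nothing to check without an entry
        last_timestamp = time_index[-1][0]
        if now_ms - last_timestamp > retention_ms:
            expired.append(base_offset)
    return expired
```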

Enforce time based log rolling

Currently, time based log rolling is based on the creation time of the log segment. With this KIP, time based rolling would instead be based on the largest timestamp ever seen in a log segment. A new log segment will be rolled out if the current time is greater than the largest timestamp ever seen in the log segment + log.roll.ms. When message.timestamp.type=CreateTime, users should set max.message.time.difference.ms appropriately together with log.roll.ms to avoid frequent log segment roll out.
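
The rolling condition is a single comparison. The sketch below states it directly and then shows why the two settings interact under CreateTime; the function name and the sample numbers are made up, and the case of a segment that has not yet seen any timestamp is not covered.

```python
def should_roll(largest_timestamp_ms, now_ms, log_roll_ms):
    """True if the current time is past largest timestamp + log.roll.ms."""
    return now_ms > largest_timestamp_ms + log_roll_ms


now_ms = 10_000_000
log_roll_ms = 3_600_000  # sample value: one hour

# A CreateTime message accepted with a timestamp two hours in the past,
# which is possible when max.message.time.difference.ms exceeds log.roll.ms.
old_create_time = now_ms - 7_200_000
rolls_immediately = should_roll(old_create_time, now_ms, log_roll_ms)

# A message stamped ten minutes ago does not trigger a roll yet.
recent_create_time = now_ms - 600_000
rolls_later = should_roll(recent_create_time, now_ms, log_roll_ms)
```

If a segment only contains messages like the first one, it becomes eligible for rolling as soon as it is written, which is the frequent roll out the text warns about.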

...

  • Messages whose timestamps are after the searched timestamp will be consumed.
  • Some messages with earlier timestamps might also be consumed.

The OffsetRequest behaves almost the same as before. If timestamp T is set in the OffsetRequest, the first offset in the returned offset sequence is the offset to start from if the user wants to consume from T. The guarantee is that any message whose timestamp is greater than T has an offset no smaller than the returned one, i.e. any message before this offset has a timestamp < T.

The time index granularity does not change the actual timestamp searching granularity. It only affects the time needed for searching.
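
One way to picture the lookup within a single segment: use the time index only to pick a position to start scanning from, then scan the messages for the first timestamp at or after T. This is a sketch under assumed data shapes ((timestamp, offset) index entries and (offset, timestamp) messages), not the broker's actual search code.

```python
import bisect


def find_start_offset(time_index, messages, target_ts):
    """Return the first offset whose message timestamp is >= target_ts.

    time_index: list of (timestamp, offset) pairs, non-decreasing timestamps.
    messages:   list of (offset, timestamp) pairs, sorted by offset.
    Returns None if no message qualifies.
    """
    if not messages:
        return None
    index_timestamps = [ts for ts, _ in time_index]
    # Entries before position i have a timestamp strictly below target_ts.
    i = bisect.bisect_left(index_timestamps, target_ts)
    # Start at the last such entry, or at the segment start if there is none.
    scan_from = time_index[i - 1][1] if i > 0 else messages[0][0]
    for offset, timestamp in messages:
        if offset >= scan_from and timestamp >= target_ts:
            return offset
    return None
```

Calling the function with an empty time index gives the same answer as calling it with a dense one, only after a longer scan, which matches the point that index granularity affects search time rather than search precision.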

...