...

  • The timestamp will be assigned by the broker upon receiving the message. If the message comes from MirrorMaker, the original timestamp will be discarded. In essence, we treat the timestamp as another offset-like field. If the application needs a timestamp, it can be put into the payload.
  • The timestamp will be used to build the log index.
  • The timestamp accuracy would be millisecond.
  • The timestamp of the outer message of compressed messages will be the latest timestamp of all its inner messages.
    • If the compressed message is not compacted, the relative offsets of inner messages will be contiguous and share the same timestamp.
    • If the compressed message is compacted, the relative offsets of inner messages may not be contiguous. Its timestamp will be the timestamp of the last inner message.
  • The followers will not reassign timestamps but will simply update an in-memory lastAppendedTimestamp and append the message to the log.
  • To handle leader migration where the new leader has a slower clock than the old leader, all leaders should append max(lastAppendedTimestamp, currentTimeMillis) as the timestamp.
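The monotonic timestamp assignment in the last bullet can be sketched as follows (the class and method names are illustrative, not taken from the Kafka codebase):

```java
// Sketch of monotonic timestamp assignment on the leader.
// Names are illustrative; the actual Kafka implementation differs.
public class LeaderTimestampAssigner {
    // Highest timestamp appended to the log so far.
    private long lastAppendedTimestamp = 0L;

    // Assign max(lastAppendedTimestamp, now) so that a new leader with a
    // slower clock never appends a timestamp smaller than the old leader's.
    public synchronized long assignTimestamp(long currentTimeMillis) {
        lastAppendedTimestamp = Math.max(lastAppendedTimestamp, currentTimeMillis);
        return lastAppendedTimestamp;
    }
}
```

This guarantees that timestamps in a log never move backwards, at the cost of the appended timestamp temporarily running ahead of the new leader's wall clock.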

Change the usage of offset field

...

Create another index file for each log segment with name SegmentBaseOffset.time.index to have index at minute level. The time index entry format is:

 

Code Block (language: java)
Time Index Entry => Timestamp Offset
  Timestamp => int64
  Offset => int32

The time index granularity does not change the actual timestamp search granularity; it only affects the time needed for a search. It works the same way as offset search: find the closest timestamp and its corresponding offset, then linearly scan the log until the target message is found. The reason we prefer minute-level indexing is that timestamp-based search is usually rare, so it is probably not worth investing a significant amount of memory in it.
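The lookup step described above can be sketched as a binary search over the sorted time index entries (class names are illustrative; each entry is a {timestamp, offset} pair mirroring the format above):

```java
import java.util.List;

// Sketch of the time index lookup: binary-search for the last entry whose
// timestamp is at or before the target, then the caller starts a linear
// scan of the log from the returned offset. Names are illustrative.
public class TimeIndexSearch {

    // Find the largest time index entry whose timestamp is <= targetTimestamp.
    // Each entry is a {timestamp, offset} pair; entries are sorted by timestamp.
    static long[] closestEntry(List<long[]> timeIndex, long targetTimestamp) {
        int lo = 0, hi = timeIndex.size() - 1;
        long[] best = null;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (timeIndex.get(mid)[0] <= targetTimestamp) {
                best = timeIndex.get(mid);  // candidate; look for a later one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return best;  // null if every indexed timestamp is after the target
    }
}
```

A coarser (minute-level) index only lengthens the linear scan that follows this lookup; it never changes which message is ultimately found.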

The time index will be built based on the log index file. Every time a new entry is inserted into the log index file, we check the timestamp of the message; if it falls into the next minute, we insert an entry into the time index as well. The following table gives a summary of the memory consumption at different granularities. The numbers are calculated based on a broker with 3,500 partitions.
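The maintenance rule above can be sketched as follows (a minimal illustration; class and method names are not from the Kafka codebase):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of minute-granularity time index maintenance: a time index entry
// is appended only when a message's timestamp crosses into a new minute
// bucket. Names are illustrative.
public class TimeIndexBuilder {
    private static final long MINUTE_MS = 60_000L;
    private long lastIndexedMinute = -1L;
    private final List<long[]> entries = new ArrayList<>();

    // Called whenever a new entry is inserted into the offset index.
    public void maybeAppend(long timestampMs, long relativeOffset) {
        long minute = timestampMs / MINUTE_MS;
        if (minute > lastIndexedMinute) {
            entries.add(new long[] {timestampMs, relativeOffset});
            lastIndexedMinute = minute;
        }
    }

    public int size() { return entries.size(); }
}
```

Because at most one entry is written per minute, the index size is bounded by segment retention time rather than by message volume.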

...

Upgraded consumers can handle both V0 and V1.

Rejected Alternatives

...

Add a timestamp field to log index entry

...

Because the index entry size becomes 16 bytes instead of 8 bytes, the index file size would also double. As an example, one of our brokers has ~3,500 partitions, and its index files take about 16GB of memory. With the new format, the memory consumption would be 32GB.
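The doubling follows from simple arithmetic: 16GB of 8-byte entries implies about 2^31 entries, and the same number of 16-byte entries needs 32GB. A trivial check:

```java
// Back-of-the-envelope check of the index memory estimate: the same number
// of entries at twice the entry size needs twice the memory.
public class IndexMemoryEstimate {
    public static long totalBytes(long numEntries, int entrySizeBytes) {
        return numEntries * entrySizeBytes;
    }
}
```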