...
Criterion | Option 1 (LogAppendTime index) | Option 2 (CreateTime index)
---|---|---
Accuracy of searching by time | Millisecond | Locates the first message in the log that falls into the given minute
Order of timestamps in the actual log | Monotonically increasing | Out of order
Broker log retention / rolling policy enforcement | Simple to implement | Needs to be implemented separately
Exposure of LogAppendTime to the user? | Yes | Not necessarily needed
Memory consumption | Uses a memory-mapped file; typically needs less memory than Option 2 | All entries are in memory; the memory footprint is higher than in Option 1
Complexity | Both options are similar for indexing | Similar to Option 1, but needs a separate design to honor log retention / rolling
Application friendliness | Users need to track both CreateTime (assuming we include it in the message) and LogAppendTime (see the Use case discussion section) | Users only need to track CreateTime
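Both options ultimately come down to keeping sorted (timestamp, offset) entries and binary-searching them for the first entry at or after a target time. The following is a minimal sketch of that lookup, assuming Option 1's monotonically increasing timestamps; the `TimeIndex` class and its method names are illustrative only, not Kafka's actual implementation (which uses a memory-mapped file of fixed-size entries rather than an in-heap map):

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of a time-based index: maps a timestamp to the
// offset of the first message appended at or after that timestamp.
// A TreeMap stands in for Kafka's memory-mapped index file.
public class TimeIndex {
    private final TreeMap<Long, Long> entries = new TreeMap<>();

    // Record that the message at `offset` carries `timestamp`.
    // With LogAppendTime (Option 1), timestamps arrive monotonically
    // increasing, so an append never has to rewrite earlier entries.
    public void maybeAppend(long timestamp, long offset) {
        if (entries.isEmpty() || timestamp > entries.lastKey()) {
            entries.put(timestamp, offset);
        }
    }

    // Return the offset of the first indexed entry whose timestamp is
    // >= `target`, or -1 if every indexed message is older than `target`.
    public long lookup(long target) {
        Map.Entry<Long, Long> e = entries.ceilingEntry(target);
        return e == null ? -1L : e.getValue();
    }

    public static void main(String[] args) {
        TimeIndex index = new TimeIndex();
        index.maybeAppend(1000L, 0L);
        index.maybeAppend(2000L, 50L);
        index.maybeAppend(3000L, 120L);
        System.out.println(index.lookup(1500L)); // 50: first entry at or after t=1500
        System.out.println(index.lookup(4000L)); // -1: everything indexed is older
    }
}
```

The granularity difference between the two options shows up only in how densely entries are appended (per message batch vs. per minute); the lookup itself is identical.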
Use case discussion
# | Use case | Goal | Solution with LogAppendTime index | Solution with CreateTime index | Comparison
---|---|---|---|---|---
1 | Search by timestamp | Do not lose messages | If a user wants to search for a message with CreateTime CT, they can use CT to search in the LogAppendTime index, because LogAppendTime > CT for the same message (assuming no clock skew). If the clock is skewed, they can search with CT - X, where X is the maximum skew. If a user wants to search for a message with LogAppendTime LAT, they can search with LAT directly and get millisecond accuracy. | Users can search with CT directly and get a minute-level-granularity offset. | If the latency in the pipeline is greater than one minute, users might consume fewer messages by using the CreateTime index; otherwise the LogAppendTime index is probably preferred. Consider the case where a message m1 precedes m2 in the log (LAT1 < LAT2) but has a later CreateTime: if a user wants to search with CT after consuming m2, they will have to reconsume from m1. Depending on how big LAT2 - LAT1 is, the number of messages to be reconsumed can be very large.
2 | Search by timestamp (bootstrap) | | In the bootstrap case, all the LATs would be close; for example, if a user wants to process the data of the last 3 days and dumps it all into Kafka, the LogAppendTime index does not help much. The user would need to filter out data older than 3 days before dumping it into Kafka. | In the bootstrap case, the CreateTime does not change, so if the user follows the same procedure described for the LogAppendTime index, searching by timestamp still works. | The LogAppendTime index needs further attention from the user.
3 | Failover from cluster 1 to cluster 2 | | Similar to search by timestamp: the user can use either the CT or the LAT of cluster 1 to search on cluster 2. Searching with CT - MaxLatencyOfCluster provides a strong guarantee of not losing messages, but may yield some duplicates, depending on the difference in latency between cluster 1 and cluster 2. | The user can search with CT and get minute-level granularity. Duplicates are still unavoidable. | In the cross-cluster failover case, both solutions can provide a strong guarantee of not losing messages under certain conditions.
4 | Get lag for consumers by time | Know how long a consumer is lagging by time; alert when a consumer starts to lag | With LogAppendTime in the message, a consumer can easily find out its lag by time and estimate how long it might need to reach the log end. | Not supported. |
5 | Broker-side latency metric | Let the broker report the latency of each topic, i.e. LAT - CT | The latency can simply be reported as LAT - CT. | The latency can be reported as System.currentTimeMillis() - CT. | The two solutions are equivalent. This latency information can be used as MaxLatencyOfCluster in use case 3.
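Use cases 1, 3, 4, and 5 all reduce to simple arithmetic on the two timestamps. The sketch below spells that arithmetic out; the class and method names are illustrative only, not actual Kafka APIs:

```java
// Illustrative arithmetic for the use cases above; none of these are Kafka APIs.
public class TimestampMetrics {

    // Use case 4: how far behind (in ms) a consumer is, measured as the gap
    // between "now" and the LogAppendTime of its last consumed message.
    public static long consumerLagMs(long nowMs, long lastConsumedLogAppendTimeMs) {
        return nowMs - lastConsumedLogAppendTimeMs;
    }

    // Use case 5: per-message pipeline latency reported by the broker,
    // i.e. LogAppendTime - CreateTime (LAT - CT).
    public static long brokerLatencyMs(long logAppendTimeMs, long createTimeMs) {
        return logAppendTimeMs - createTimeMs;
    }

    // Use cases 1 and 3: skew- or latency-adjusted search timestamp.
    // Searching with CT - X (X = max clock skew, or MaxLatencyOfCluster when
    // failing over) guarantees no message is missed, at the cost of duplicates.
    public static long adjustedSearchTimestamp(long createTimeMs, long maxSkewOrLatencyMs) {
        return createTimeMs - maxSkewOrLatencyMs;
    }

    public static void main(String[] args) {
        System.out.println(consumerLagMs(10_000L, 7_500L));        // 2500
        System.out.println(brokerLatencyMs(5_000L, 4_200L));       // 800
        System.out.println(adjustedSearchTimestamp(9_000L, 300L)); // 8700
    }
}
```

Note how use case 5 feeds use case 3: the broker-reported latency (LAT - CT) is exactly the MaxLatencyOfCluster value a consumer would subtract when searching on the failover cluster.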
From the use case list above, having a LogAppendTime index is generally better than having a CreateTime-based index.
Compatibility, Deprecation, and Migration Plan
...