...
Use case discussion
| # | Use case | Goal | Solution with LogAppendTime index | Solution with CreateTime index | Comparison |
|---|---|---|---|---|---|
| 1 | Search by timestamp | Not lose messages | If the user wants to search for a message with CreateTime CT, they can use CT to search in the LogAppendTime index, because LogAppendTime > CT for the same message (assuming no clock skew). If the clocks are skewed, the user can search with CT - X, where X is the maximum skew. If the user wants to search for a message with LogAppendTime LAT, they can simply search with LAT and get millisecond accuracy. | The user can simply search with CT and get an offset with minute-level granularity. | If the latency in the pipeline is greater than one minute, the user might consume fewer messages by using the CreateTime index; otherwise, the LogAppendTime index is probably preferred. The user may see duplicates when searching by CreateTime. Consider the following case: if the user searches with CT after having consumed m2, they will have to reconsume from m1. Depending on how large LAT2 - LAT1 is, the number of messages to be reconsumed can be very large. |
| 2 | Search by timestamp (bootstrap) | | In the bootstrap case, all the LAT values would be close. For example, suppose the user wants to process the data from the last 3 days and bootstraps that data into Kafka. In this case, the LogAppendTime index does not help much, which means the user needs to filter out the data older than 3 days before dumping it into Kafka. | In the bootstrap case, the CreateTime will not change if the user follows the same procedure described for the LogAppendTime index. Searching by timestamp will work. | The LogAppendTime index needs further attention from the user. |
| 3 | Failover from cluster 1 to cluster 2 | | Similar to search by timestamp: the user can choose to use the CT or the LAT of cluster 1 to search on cluster 2. In this case, searching with CT - MaxLatencyOfCluster provides a strong guarantee of not losing messages, but may produce duplicates depending on the difference in latency between cluster 1 and cluster 2. | The user can use CT to search and get an offset with minute-level granularity. Duplicates are still unavoidable, and there are some tricky cases. Consider the following case [1]: m1 is created before m2, but due to latency differences, m1 arrives at cluster 1 before m2 does, while m2 arrives at cluster 2 before m1 does. If a consumer consumed m2 in cluster 2 and fails over to cluster 1, simply searching by CT2 will miss m1, because m1 has a larger offset than m2 in cluster 2 but a smaller offset than m2 in cluster 1. So the same trick of searching with CT - MaxLatencyOfCluster is still needed. | In the cross-cluster failover case, both solutions can provide a strong guarantee of not losing messages, but both depend on knowledge of MaxLatencyOfCluster. |
| 4 | Get lag for consumers by time | Know how long a consumer is lagging in time. | With LogAppendTime in the message, the consumer can easily find out its time lag and estimate how long it might take to reach the log end. | Not supported. | |
| 5 | Broker-side latency metric | Let the broker report the latency of each topic, i.e. LAT - CT. | The latency can be reported as LAT - CT. | The latency can be reported as System.currentTimeMillis() - CT. | The two solutions are equivalent. This latency information can be used as MaxLatencyOfCluster in use case 3. |
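The skew-compensated search from use cases 1 and 3 can be sketched as follows. This is an illustrative model, not Kafka's implementation: the time-based index is mocked as a sorted list of (timestamp, offset) pairs, and the names `find_offset`, `search_by_create_time`, `max_skew_ms`, and all sample values are hypothetical.

```python
import bisect

def find_offset(index, target_ts):
    """Return the offset of the first entry whose timestamp is >= target_ts,
    or None if target_ts is past the end of the log.

    `index` is a list of (timestamp_ms, offset) pairs sorted by timestamp,
    standing in for a broker-side time-based index."""
    pos = bisect.bisect_left([ts for ts, _ in index], target_ts)
    if pos == len(index):
        return None
    return index[pos][1]

def search_by_create_time(lat_index, create_time_ms, max_skew_ms):
    """Search a LogAppendTime index for a message with CreateTime CT.
    LAT >= CT only holds when clocks are not skewed, so search with
    CT - max_skew_ms to avoid losing messages (at the cost of duplicates)."""
    return find_offset(lat_index, create_time_ms - max_skew_ms)

# Hypothetical LogAppendTime index entries: (LAT in ms, offset).
lat_index = [(1000, 0), (1500, 1), (2600, 2), (2700, 3)]

# A message created at CT=2650 may have been appended with a LAT as low as
# CT - max_skew; the compensated search lands at offset 2 instead of 3,
# so the message at offset 2 is not lost.
print(search_by_create_time(lat_index, 2650, max_skew_ms=100))  # -> 2
print(find_offset(lat_index, 2650))                             # -> 3
```

The same pattern covers cross-cluster failover (use case 3) by substituting MaxLatencyOfCluster for the clock skew bound.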
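Use cases 4 and 5 reduce to simple wall-clock arithmetic once LogAppendTime is available in the message. A minimal sketch, with hypothetical function names and sample values (not a Kafka API):

```python
import time

def consumer_time_lag_ms(last_consumed_lat_ms, now_ms=None):
    """Use case 4: the consumer's lag in time is the current wall clock
    minus the LogAppendTime of the last consumed message."""
    now = now_ms if now_ms is not None else int(time.time() * 1000)
    return now - last_consumed_lat_ms

def broker_latency_ms(lat_ms, create_time_ms):
    """Use case 5 (LogAppendTime variant): per-message pipeline latency is
    LAT - CT; its maximum over a window can serve as MaxLatencyOfCluster
    for the failover search in use case 3."""
    return lat_ms - create_time_ms

print(consumer_time_lag_ms(5_000, now_ms=12_000))              # -> 7000
print(broker_latency_ms(lat_ms=4_200, create_time_ms=4_000))   # -> 200
```

With only CreateTime available, the broker would instead report `System.currentTimeMillis() - CT` at append time, as the table notes.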
...