Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents

Status

Current stateUnder Discussion

...

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

This KIP includes two parts:

Part 1: Introduce ListOffsetRequest v1 to support accurate search based on timestamp. 

With KIP-33, the brokers can now search messages by timestamp accurately. To maintain the backward compatibility, we did not change the behavior of ListOffsetRequest v0. In this KIP, we will introduce ListOffsetRequest v1 to provide more accurate search.

Part 2: Add a search message by timestamp method to o.a.k.c.consumer.Consumer

In SimpleConsumer, users used to be able to search the offsets by timestamp. This method no longer exists in the KafkaConsumer. We want to introduce the method to allow user to search the offsets by timestamp. This is useful in a few cases, for example:

...

Although the above two parts can be two separate KIPs, but since part 2 will be heavily depending on part 1 and may also potentially impact the semantic for the ListOffsetRequest v1, it might be better to discuss them together to make sure the wire protocol change can satisfy the various use cases on the consumer side.

Public Interfaces

Add ListOffsetRequest(ListOffsetRequest) v1

...

Code Block
languagejava
titleoffsetForTime() method
/**
 * Look up the offsets for the given partitions by timestamp. The returned offset for each partition is the 
 * earliest offset whose timestamp is greater than or equals to the given offset in the corresponding partition.
 * 
 * This is a blocking call. The consumer does not have to be assigned the partitions.
 *
 * @param timestampsToSearch the mapping from partition to the timestamp to look up.
 * @return For each partition, returns the timestamp and offset of the first message with timestamp greater 
than or equal to the target timestamp.
 */
Map<TopicPartition, TimestampOffset> offsetForTime(Map<TopicPartition, Long> timestampsToSearch);

Proposed Changes

ListOffsetRequest/ListOffsetResponse v1

Add ListOffsetRequest/ListOffsetResponse v1 with the following changes compared with ListOffsetRequest/ListOffsetResponse v0

...

Implementation wise, we will migrate to o.a.k.common.requests.ListOffsetRequest class on the broker side.

offsetForTime() method in Consumer

Add a new method to allow users to search messages by timestamp. The input of the method is a mapping from partitions to the target times. The output of the method is a mapping from partitions to the timestamps and offsets of the messages whose timestamps are greater than or equal to the given target time of each partition.

...

Code Block
languagejava
titleconsume from timestamp
long offset = consumer.offsetForTime(Collections.singletonMap(topicPartition, targetTime)).offset;
consumer.seek(topicPartition, offset);
ConsumerRecords records = consumer.poll();

Compatibility, Deprecation, and Migration Plan

This KIP is a pure addition, so there is no backward compatibility concern.

The new ListOffsetRequest v1 will only be used by new consumer. The old high level consumer will still be using ListOffsetRequest v0. We will not update SimpleConsumer either as we are already in the progress of deprecating it. OffsetRequest(ListOffsetRequest) v0 will be deprecated together with old high level consumer and SimpleConsumer.

Test Plan

Test the new consumer against broker with message format before and after 0.10.

Test searchByTimestamp with CreateTime and LogAppendTime

Rejected Alternatives

Using seekToTimestamp() instead of offsetForTime()

  1. Although in most cases users search offset by time to simply consume from that offset. But there are some cases that users just want to get the offsets for checking and not want to move the offsets. (e.g. people checkpointing offsets outside of Kafka)
  2. It is easy for the users to seek to the offset returned by offsetForTime()
  3. A separate offsetForTime() method allows user to query any topic partition without first get the partition assigned.