Status

Current state: "Under Discussion"

Discussion thread: here

JIRA: KAFKA-5092

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Kafka 0.10 added a timestamp to each message, which is particularly important for Kafka Streams applications. Currently that timestamp can be set by the broker or by the producer. This KIP addresses concerns with timestamps set at the producer level.
Currently, if the producer application does not attach a timestamp to the ProducerRecord, as with public ProducerRecord(String topic, K key, V value), the timestamp is set at send time to System.currentTimeMillis(). If a producer wants to set a timestamp explicitly, it must use public ProducerRecord(String topic, Integer partition, Long timestamp, K key, V value).

This interface is particularly dangerous because it suggests to the user that they must provide an Integer partition. They would have to dive into the code to learn that partition can be null, in which case it is inferred from the key. I also believe that offering partition as an argument goes against the common expectation that "the same key always goes to the same partition".
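To make the null-partition behaviour concrete, here is a minimal sketch of the fallback described above. PartitionSketch and choosePartition are hypothetical names, the hash is a placeholder rather than the one Kafka's default partitioner uses, and the sketch assumes a non-null key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for partition selection; this is not Kafka's real partitioner.
final class PartitionSketch {
    // Returns the forced partition when one is given, otherwise derives it from the key.
    static int choosePartition(Integer partition, String key, int numPartitions) {
        if (partition != null) {
            return partition; // the caller forced a partition, so the key is ignored
        }
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        // Arrays.hashCode is a placeholder hash; the mask keeps the result non-negative.
        return (Arrays.hashCode(keyBytes) & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int first = choosePartition(null, "user-42", 6);
        int second = choosePartition(null, "user-42", 6);
        System.out.println(first == second);                  // prints "true"
        System.out.println(choosePartition(3, "user-42", 6)); // prints "3"
    }
}
```

With a null partition the same key always maps to the same partition. Passing an explicit partition bypasses the key entirely, which is the behaviour the paragraph above warns about.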

This KIP adds the missing constructors for setting a Timestamp on a ProducerRecord or a SourceRecord.

Public Interfaces

New interface:

public SourceRecord(Map<String, ?> sourcePartition, Map<String, ?> sourceOffset,
 String topic, Schema valueSchema, Object value, Long timestamp)
public SourceRecord(Map<String, ?> sourcePartition, Map<String, ?> sourceOffset,
 String topic, Schema keySchema, Object key, Schema valueSchema, Object value,
 Long timestamp)
public ProducerRecord(String topic, V value, Long timestamp)
public ProducerRecord(String topic, K key, V value, Long timestamp)

Proposed Changes

See KAFKA-5092 
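As a rough illustration only, the timestamp-last ProducerRecord constructors could simply delegate to the existing full constructor. RecordSketch below is a hypothetical stand-in class, not the real ProducerRecord, and timestampAtSend is an invented helper that mimics the send-time default described in the Motivation section.

```java
// Hypothetical stand-in for ProducerRecord, reduced to the fields discussed in this KIP.
final class RecordSketch<K, V> {
    final String topic;
    final Integer partition; // null: the partition is derived from the key
    final Long timestamp;    // null: the timestamp is assigned at send time
    final K key;
    final V value;

    // Same argument shape as the existing full constructor.
    RecordSketch(String topic, Integer partition, Long timestamp, K key, V value) {
        this.topic = topic;
        this.partition = partition;
        this.timestamp = timestamp;
        this.key = key;
        this.value = value;
    }

    // Proposed shape: key, value, then the timestamp last.
    RecordSketch(String topic, K key, V value, Long timestamp) {
        this(topic, null, timestamp, key, value);
    }

    // Proposed shape: value and timestamp only, with no key.
    RecordSketch(String topic, V value, Long timestamp) {
        this(topic, null, timestamp, null, value);
    }

    // Timestamp the record would carry if it were sent at wall-clock time nowMs.
    long timestampAtSend(long nowMs) {
        return timestamp != null ? timestamp : nowMs;
    }

    public static void main(String[] args) {
        RecordSketch<String, String> r = new RecordSketch<>("events", "k", "v", 1000L);
        System.out.println(r.timestampAtSend(System.currentTimeMillis())); // prints "1000"
    }
}
```

Because the new constructors never take a partition, a caller who only wants to set a timestamp cannot accidentally force one.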

Compatibility, Deprecation, and Migration Plan

  • No deprecation
  • No migration needed for existing programs

Rejected Alternatives

I had tried to just add partial constructors here: https://github.com/apache/kafka/pull/2800/files

  •  public ProducerRecord(String topic, Long timestamp, K key, V value)

Adding this constructor is quite dangerous because it is almost identical to the one that takes a partition: the only difference is that one argument is a Long and the other is an Integer.
Therefore the ordering of arguments should explicitly prevent such mistakes.
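The hazard can be reproduced with two overloads that differ only in that one argument. OverloadSketch and describe are hypothetical names used purely for illustration; they mirror the argument shapes discussed above and are not part of Kafka.

```java
// Hypothetical stand-in: two four-argument overloads that differ only by Integer vs Long.
final class OverloadSketch {
    // Mirrors the existing (topic, partition, key, value) shape.
    static String describe(String topic, Integer partition, String key, String value) {
        return "partition=" + partition;
    }

    // Mirrors the rejected (topic, timestamp, key, value) shape.
    static String describe(String topic, Long timestamp, String key, String value) {
        return "timestamp=" + timestamp;
    }

    public static void main(String[] args) {
        System.out.println(describe("events", 5, "k", "v"));  // prints "partition=5"
        System.out.println(describe("events", 5L, "k", "v")); // prints "timestamp=5"
        // describe("events", null, "k", "v") would not compile: the call is ambiguous.
    }
}
```

A single missing L suffix silently turns an intended timestamp into a partition number, and the compiler gives no warning.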

 

 

  •  Adding a builder interface

Although friendlier to newcomers, a builder would imply deprecating the existing constructors, a process that could take years because backward compatibility must be maintained. There are too many drawbacks and not enough advantages (see the mailing list discussion).