...

Currently, the Kafka Connect SMT TimestampConverter can convert timestamps from multiple source types (String, Unix Long or Date) into different target types (String, Unix Long or Date).

The problem is that a Unix Long, whether used as the source or as the target type, is required to be an epoch value in milliseconds.

...

When such a case arises, Kafka Connect can't do anything except pass along the Unix Long and leave the conversion to another layer.

...

  • Using TimeUnit.MILLISECONDS.toMicros(epochMillis), and so on for the other conversions, seems the easiest way.
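
As a rough illustration of the TimeUnit approach, the sketch below maps a precision setting to a java.util.concurrent.TimeUnit and converts in both directions. The class name, the helper names and the "seconds" and "millis" values are assumptions made for the example; only "micros" and "nanos" appear in this proposal, and this is not the actual TimestampConverter code.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the real TimestampConverter implementation.
class EpochPrecisionSketch {

    // Maps a precision setting to a TimeUnit.
    // "seconds" and "millis" are assumed values; "micros" and "nanos" come from the examples.
    static TimeUnit unitFor(String precision) {
        switch (precision) {
            case "seconds": return TimeUnit.SECONDS;
            case "millis":  return TimeUnit.MILLISECONDS;
            case "micros":  return TimeUnit.MICROSECONDS;
            case "nanos":   return TimeUnit.NANOSECONDS;
            default:
                throw new IllegalArgumentException("Unknown precision: " + precision);
        }
    }

    // Source side: bring a Unix Long of the given precision down to epoch milliseconds.
    // TimeUnit truncates, so anything finer than a millisecond is dropped.
    static long toEpochMillis(long epoch, String precision) {
        return unitFor(precision).toMillis(epoch);
    }

    // Target side: scale epoch milliseconds to the requested precision.
    static long fromEpochMillis(long epochMillis, String precision) {
        return unitFor(precision).convert(epochMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        long epochMillis = 1_650_000_000_123L;
        System.out.println(fromEpochMillis(epochMillis, "micros"));
        System.out.println(fromEpochMillis(epochMillis, "nanos"));
        System.out.println(toEpochMillis(1_650_000_000_123_456L, "micros"));
    }
}
```

Keeping milliseconds as the internal pivot means the existing Date-based conversions would not need to change; only the Unix Long entry and exit points are scaled.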

Unix Long to Timestamp example:

"transforms.TimestampConverter.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
"transforms.TimestampConverter.field": "event_date_long",
"transforms.TimestampConverter.epoch.precision": "micros",
"transforms.TimestampConverter.target.type": "Timestamp"

String to Unix Long nanoseconds example:

"transforms.TimestampConverter.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
"transforms.TimestampConverter.field": "event_date_str",
"transforms.TimestampConverter.format": "yyyy-MM-dd'T'HH:mm:ss.SSS",
"transforms.TimestampConverter.target.type": "unix",
"transforms.TimestampConverter.epoch.precision": "nanos"
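
To make the expected result of this configuration concrete, here is a minimal sketch of the String to Unix Long nanoseconds path, assuming the string is parsed in UTC with SimpleDateFormat and then scaled with TimeUnit. The class and method names and the sample value are hypothetical; this is not the transform's actual code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the String -> Unix Long (nanos) conversion.
class StringToNanosSketch {

    // Parses the value with the given pattern (assumed UTC) and returns epoch nanoseconds.
    static long parseToEpochNanos(String value, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            Date parsed = format.parse(value);
            // Date only carries milliseconds, so the scaled value ends in six zeros.
            return TimeUnit.MILLISECONDS.toNanos(parsed.getTime());
        } catch (ParseException e) {
            throw new IllegalArgumentException("Could not parse: " + value, e);
        }
    }

    public static void main(String[] args) {
        long nanos = parseToEpochNanos("2022-04-15T05:20:00.123", "yyyy-MM-dd'T'HH:mm:ss.SSS");
        System.out.println(nanos);
    }
}
```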

java.util.Date and SimpleDateFormat limitations

Since these classes can only handle precisions down to the millisecond, it should be noted that:

  • Converting a source Unix Long in microseconds or nanoseconds into any target type leads to a precision loss (everything after the milliseconds is truncated).
  • When converting any source type into a target Unix Long in microseconds or nanoseconds, the part after the milliseconds will always be 0.
  • A KIP that addresses Date vs Instant may be more appropriate, but it impacts so much of the code that I believe this is a good first step.
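
The truncation described in the first point can be reproduced with a small round trip through java.util.Date. This is only a sketch under the assumption that the value passes through a Date internally; the class and method names are hypothetical.

```java
import java.util.Date;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch showing the precision loss caused by java.util.Date.
class DatePrecisionLossSketch {

    // Converts epoch microseconds to a Date and back to epoch microseconds.
    static long microsThroughDate(long epochMicros) {
        // Date stores milliseconds only, so the last three digits are dropped here.
        Date date = new Date(TimeUnit.MICROSECONDS.toMillis(epochMicros));
        return TimeUnit.MILLISECONDS.toMicros(date.getTime());
    }

    public static void main(String[] args) {
        long sourceMicros = 1_650_000_000_123_456L;
        long roundTripped = microsThroughDate(sourceMicros);
        System.out.println(sourceMicros + " -> " + roundTripped);
        System.out.println("lost micros: " + (sourceMicros - roundTripped));
    }
}
```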

...