Status

Current state: Under Discussion

Discussion thread: here [Change the link from the KIP proposal email archive to your own email thread]

JIRA: here [Change the link from KAFKA-1 to your own ticket]

Motivation

Currently Kafka Connect has two error handling options, “none” and “all”. Option “none” configures the connector to fail fast, and option “all” makes it skip broken records and keep going.

If users want to keep their broken records, they have to configure a dead letter queue for them, which is too much work in some cases.

Some sink connectors are able to deal with broken records themselves. For example, a JDBC sink connector can store the broken raw bytes in a separate table, and an S3 sink connector can store them in a zipped file.

Therefore, it would be ideal if Kafka Connect provided an additional option that sends the broken raw bytes to the SinkTask directly.

Public Interfaces

In Kafka Connect, the configuration

errors.tolerance

will accept a third option, "continue", in addition to "none" and "all".
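As a rough illustration of the extended value set, the sketch below parses an errors.tolerance string into one of three values. The class name ToleranceOption, the enum Tolerance and the method parse are hypothetical names invented for this example and are not part of Kafka Connect; only the three option strings come from this proposal.

```java
import java.util.Locale;

// Hypothetical, standalone sketch; not the Kafka Connect implementation.
public class ToleranceOption {

    /** The two existing values plus the proposed CONTINUE value. */
    public enum Tolerance { NONE, ALL, CONTINUE }

    /** Maps the configured string to a Tolerance, rejecting unknown values. */
    public static Tolerance parse(String value) {
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "none":
                return Tolerance.NONE;      // fail fast on the first error
            case "all":
                return Tolerance.ALL;       // skip broken records
            case "continue":
                return Tolerance.CONTINUE;  // proposed: hand raw bytes to the SinkTask
            default:
                throw new IllegalArgumentException(
                        "Unknown errors.tolerance value: " + value);
        }
    }
}
```

Under this proposal, a connector would opt in by setting errors.tolerance to "continue" in its configuration. Existing configurations that use "none" or "all" would parse exactly as before.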

Proposed Changes

Add a third error handling option, “continue”. When an error occurs in a Converter or SMT, the infrastructure should neither fail nor drop the record. Instead, it should send the raw bytes of the broken message directly to the SinkTask.

The SinkTask is then responsible for handling the unparsed byte input.
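The intended flow can be sketched without any Kafka dependencies. Everything in the example below is an assumption made for illustration: the types Delivered and Sink and the method convertAll are hypothetical stand-ins for the framework and a sink task, and the proposal does not specify how raw bytes would actually be passed to SinkTask.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical, standalone sketch of the "continue" behavior; not Kafka Connect code.
public class RawBytesSinkSketch {

    /** A delivered record: either a converted value or the raw bytes that failed. */
    public static final class Delivered {
        public final String value;     // set when conversion succeeded
        public final byte[] rawBytes;  // set when conversion failed

        public Delivered(String value, byte[] rawBytes) {
            this.value = value;
            this.rawBytes = rawBytes;
        }

        public boolean isBroken() {
            return rawBytes != null;
        }
    }

    /** Stand-in for the framework: on a conversion error, pass the raw bytes through. */
    public static List<Delivered> convertAll(List<byte[]> input,
                                             Function<byte[], String> converter) {
        List<Delivered> out = new ArrayList<>();
        for (byte[] raw : input) {
            try {
                out.add(new Delivered(converter.apply(raw), null));
            } catch (RuntimeException e) {
                // "continue": neither fail the task nor drop the record
                out.add(new Delivered(null, raw));
            }
        }
        return out;
    }

    /** Stand-in for a sink task that stores broken records separately. */
    public static final class Sink {
        public final List<String> mainTable = new ArrayList<>();
        public final List<byte[]> brokenTable = new ArrayList<>();

        public void put(List<Delivered> records) {
            for (Delivered r : records) {
                if (r.isBroken()) {
                    brokenTable.add(r.rawBytes);  // e.g. a separate table for raw bytes
                } else {
                    mainTable.add(r.value);
                }
            }
        }
    }
}
```

Here the sink decides what to do with broken input, which corresponds to the JDBC example from the Motivation section, where the raw bytes go to a separate table.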

The benefits of having this additional option are:

  • Being user friendly. Connectors can handle broken records and hide them from clients.
  • Providing more flexibility to SinkTask in terms of broken record handling.

Compatibility, Deprecation, and Migration Plan

There is no compatibility issue. The behavior of Kafka Connect does not change unless the user explicitly sets the error handling option to "continue".

Rejected Alternatives

None
