Current state: Under Discussion
Discussion thread: here
JIRA: KAFKA-5575
PR: here
Released: TBD
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Single message transforms allow you to modify the data, and schema of that data, in flight in Kafka Connect. When writing SMTs for data with schemas, you often have to copy the entire schema, making only minor additions (e.g. for an enrichment transformation). It would be convenient to have a utility method to get a SchemaBuilder pre-populated with the existing schema so enhancing the schema requires minimal work.
In connect/api/src/main/java/org/apache/kafka/connect/data/SchemaBuilder.java, add the following API:
/**
 * @param schema the schema to clone the SchemaBuilder from.
 * @return a new SchemaBuilder from the supplied schema.
 */
public static SchemaBuilder from(Schema schema) {...}
Add a new from(Schema) static method to SchemaBuilder that will return a SchemaBuilder pre-populated with all the fields of the Schema. This allows adding to a schema inline:
Schema newSchema = SchemaBuilder.from(originalSchema).field("dc", Schema.STRING_SCHEMA).build();
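To make the intended copy-then-extend behavior concrete, here is a minimal, self-contained sketch. It does not use the Connect classes: SimpleSchema and SimpleSchemaBuilder are hypothetical stand-ins that model a struct schema as a name plus an ordered map of field names to type names. The real from(Schema) would also have to carry over whatever other schema properties the implementation decides to copy, which this KIP does not spell out.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for Connect's Schema / SchemaBuilder, for illustration only.
final class SchemaBuilderFromSketch {

    /** Immutable struct-like schema: a name plus ordered fields (field name -> type name). */
    static final class SimpleSchema {
        private final String name;
        private final Map<String, String> fields;

        SimpleSchema(String name, Map<String, String> fields) {
            this.name = name;
            // Defensive copy so later builder changes cannot leak into a built schema.
            this.fields = Collections.unmodifiableMap(new LinkedHashMap<>(fields));
        }

        String name() { return name; }

        Map<String, String> fields() { return fields; }
    }

    /** Mutable builder; from(...) pre-populates it with an existing schema. */
    static final class SimpleSchemaBuilder {
        private String name;
        private final Map<String, String> fields = new LinkedHashMap<>();

        /** Returns a new builder holding a copy of the supplied schema's name and fields. */
        static SimpleSchemaBuilder from(SimpleSchema schema) {
            SimpleSchemaBuilder builder = new SimpleSchemaBuilder();
            builder.name = schema.name();
            builder.fields.putAll(schema.fields());
            return builder;
        }

        SimpleSchemaBuilder name(String name) {
            this.name = name;
            return this;
        }

        SimpleSchemaBuilder field(String fieldName, String typeName) {
            if (fields.containsKey(fieldName)) {
                throw new IllegalArgumentException("Duplicate field: " + fieldName);
            }
            fields.put(fieldName, typeName);
            return this;
        }

        SimpleSchema build() {
            return new SimpleSchema(name, fields);
        }
    }

    public static void main(String[] args) {
        SimpleSchema original = new SimpleSchemaBuilder()
                .name("record")
                .field("id", "int64")
                .field("payload", "string")
                .build();

        // Enrichment: start from the existing schema and add one field inline.
        SimpleSchema enriched = SimpleSchemaBuilder.from(original)
                .field("dc", "string")
                .build();

        System.out.println(original.fields().keySet()); // [id, payload]
        System.out.println(enriched.fields().keySet()); // [id, payload, dc]
    }
}
```

In this sketch the original schema is left untouched, since from copies the fields into a fresh builder rather than sharing state with the source.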
This is a simple addition to the connect-api and has no compatibility, deprecation, or migration concerns.
Simple unit tests will sufficiently exercise the functionality. One of the existing SMTs could also potentially be converted to use this utility.
The only other consideration is that this is only useful for additions. Since any field, array, or map schemas are simply copied, this proposal does not cover scenarios like removing or mutating fields, which will also be common. Modifications to SchemaBuilder to make these possible could be considered for future KIPs, and the obvious implementations should compose well with this change.