Status

Current state: Under Discussion

Discussion thread: here

JIRA: KAFKA-5575

PR: here

Released: TBD

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).

Motivation

Single message transforms allow you to modify the data, and schema of that data, in flight in Kafka Connect. When writing SMTs for data with schemas, you often have to copy the entire schema, making only minor additions (e.g. for an enrichment transformation). It would be convenient to have a utility method to get a SchemaBuilder pre-populated with the existing schema so enhancing the schema requires minimal work.

Public Interfaces

In connect/api/src/main/java/org/apache/kafka/connect/data/SchemaBuilder.java, add the following API:

/**
 * @param schema the schema to clone the SchemaBuilder from.
 * @return a new SchemaBuilder from the supplied schema.
 */
public static SchemaBuilder from(Schema schema) {...}

Proposed Changes

Add a new from(Schema) static method to SchemaBuilder that will return a SchemaBuilder pre-populated with all the fields of the Schema. This allows adding to a schema inline:

Schema newSchema = SchemaBuilder.from(originalSchema).field("dc", Schema.STRING).build();

Compatibility, Deprecation, and Migration Plan

This is a simple addition to the connect-api and has no compatibility, deprecation, or migration concerns.

Test Plan

Simple unit tests will sufficiently exercise the functionality. One of the existing SMTs could also potentially be converted to use this utility.

Rejected Alternatives

The only other consideration is that this is only useful for additions. Since any field, array, or map schemas are simply copied, this proposal does not cover scenarios like removing or mutating fields, which will also be common. Modifications to SchemaBuilder to make these possible could be considered for future KIPs and the obvious implementations should compose well with this change.

  • No labels