...
- https://issues.apache.org/jira/browse/KAFKA-7624
- https://issues.apache.org/jira/browse/KAFKA-10640
// TODO: Currently we only support top-level field casting. Ideally we could use a dotted notation in the spec to
// allow casting nested fields.
This KIP aims to include support for nested structures in the existing SMTs.
...
However, dots are already allowed as part of element names in JSON (i.e. schemaless) records (e.g. {'nested.key': {'value': 42}}). Instead of escaping dots with backslashes, which would lead to unfriendly JSON configurations, it's proposed to follow an approach similar to JSONata[2] and wrap field names that contain dots in backticks, e.g. `nested.key`.value
Double backticks can be used to escape existing backticks that are part of a field name wrapped in backticks.
[1] https://stedolan.github.io/jq/manual/#Basicfilters
[2] https://docs.jsonata.org/simple#examples
> Field references containing whitespace or reserved tokens can be enclosed in backticks
Examples
Scenario | Field name | Nested path |
---|---|---|
Normal (no dots or backticks on field names) | a.b.c | a: b: c: val |
Field names including dots | a.`b.c` | a: b.c: val |
Field names including backticks | a.b`.c | a: b`: c: val |
Field names including dots and backticks | a.`b``.c` | a: b`.c: val |
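The notation in the examples above can be sketched as a small path splitter. This is a minimal illustration, not the implementation proposed by the KIP: the class name `FieldPathSketch` and the method `parse` are hypothetical, and the handling of malformed input (throwing `IllegalArgumentException`) is an assumption. It assumes a step is treated as wrapped only when it starts with a backtick, and that a backtick anywhere else in an unwrapped step is a literal character, as the third example row suggests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Kafka Connect.
public class FieldPathSketch {

    /** Splits a field path into steps, honoring backtick-wrapped names. */
    public static List<String> parse(String path) {
        List<String> steps = new ArrayList<>();
        final int n = path.length();
        int i = 0;
        while (i <= n) {
            StringBuilder step = new StringBuilder();
            if (i < n && path.charAt(i) == '`') {
                // Wrapped step: dots are literal, a doubled backtick is one literal backtick.
                i++; // skip the opening backtick
                boolean closed = false;
                while (i < n) {
                    char c = path.charAt(i);
                    if (c != '`') {
                        step.append(c);
                        i++;
                    } else if (i + 1 < n && path.charAt(i + 1) == '`') {
                        step.append('`');
                        i += 2;
                    } else {
                        closed = true;
                        i++; // skip the closing backtick
                        break;
                    }
                }
                if (!closed) {
                    throw new IllegalArgumentException("Unterminated backtick in: " + path);
                }
                if (i < n && path.charAt(i) != '.') {
                    throw new IllegalArgumentException("Expected '.' after closing backtick in: " + path);
                }
            } else {
                // Unwrapped step: read up to the next dot; backticks are literal here.
                while (i < n && path.charAt(i) != '.') {
                    step.append(path.charAt(i));
                    i++;
                }
            }
            steps.add(step.toString());
            i++; // skip the separator, or step past the end of the input
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(parse("a.b.c"));      // [a, b, c]
        System.out.println(parse("a.`b.c`"));    // [a, b.c]
        System.out.println(parse("a.b`.c"));     // [a, b`, c]
        System.out.println(parse("a.`b``.c`"));  // [a, b`.c]
    }
}
```

Each returned step names one level of nesting, so `[a, b.c]` refers to the field `b.c` inside the struct or map `a`.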
Affected SMTs
These SMTs will include support for nested structures:
Cast
ExtractField
HeaderFrom
MaskField
ReplaceField
TimestampConverter
ValueToKey
InsertField
HoistField
Non-affected SMTs
These SMTs do not require nested structure support:
DropHeaders: Drop one or multiple headers.
Filter: Drops the whole message based on a predicate.
InsertHeader: Insert a specific message to the header.
RegexRouter: Acts on the topic name.
SetSchemaMetadata: Acts on root schema.
TimestampRouter: Acts on timestamp.
Flatten: Acts on the whole key or message.
Public Interfaces
The following existing SMTs are impacted by this change:
...
These flags will be added conditionally to some SMTs, as described below.
Affected SMTs
Cast
Changes:
- Extend spec to support nested notation.
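To illustrate what casting a nested field means for a schemaless record, here is a minimal sketch. It is not the Cast SMT's code: the class `NestedCastSketch` and the method `castToString` are hypothetical, the path is assumed to be already split into steps, and only a cast to string is shown. Leaving the record unchanged when a step is missing or is not a map is also an assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Kafka Connect.
public class NestedCastSketch {

    /**
     * Returns a copy of the record in which the value found by following
     * the steps has been replaced by its string form. The input is not modified.
     */
    @SuppressWarnings("unchecked")
    public static Map<String, Object> castToString(Map<String, Object> record, List<String> steps) {
        Map<String, Object> copy = new HashMap<>(record);
        String head = steps.get(0);
        Object value = copy.get(head);
        if (steps.size() == 1) {
            // Last step: cast the value if the field is present.
            if (value != null) {
                copy.put(head, String.valueOf(value));
            }
        } else if (value instanceof Map) {
            // Intermediate step: descend into the nested map and copy it as well.
            copy.put(head, castToString((Map<String, Object>) value, steps.subList(1, steps.size())));
        }
        // A missing or non-map intermediate step leaves the record as-is.
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Object> inner = new HashMap<>();
        inner.put("b", 42);
        Map<String, Object> record = new HashMap<>();
        record.put("a", inner);
        System.out.println(castToString(record, List.of("a", "b"))); // {a={b=42}}, where 42 is now a String
    }
}
```

Copying each level on the way down keeps the original record untouched, so the same input can be passed on unchanged when the path does not match.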
...
Scenario | Input | SMT | Output |
---|---|---|---|
1. Nested field. | | | |
2. Nested struct, when field names include dots | | | |
Non-affected SMTs
These SMTs do not require nested structure support:
...
Compatibility, Deprecation, and Migration Plan
...