...
> Field references containing whitespace or reserved tokens can be enclosed in backticks
Examples
Scenario | Field name | Nested path |
---|
Normal (no dots or backticks on field names) | a.b.c | a: b: c: val |
Field names including dots | a.`b.c` | a: b.c: val |
Field names including backticks | a.b`.c | a: b`: c: val |
Field names including dots and backticks | a.`b``.c` | a: b`c: val |
Field names wrapped by backticks | a.``b``.c | a: `b`: c: val |
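
The rows above can be read as a small parsing rule. The following is a minimal sketch in Python, not the KIP's implementation. It assumes that a step starting with a backtick ends at the next backtick that is followed by a dot or by the end of the path, and that a backtick anywhere else is a literal character. The names `parse_path` and `lookup` are hypothetical, and the row that mixes dots and backticks inside a single name is deliberately not covered.

```python
def parse_path(path):
    """Split a field path into steps (assumed rule, see the text above)."""
    steps = []
    i, n = 0, len(path)
    while i < n:
        if path[i] == "`":
            # Wrapped step: scan for a backtick followed by '.' or end of path.
            j = i + 1
            while j < n and not (path[j] == "`" and (j + 1 == n or path[j + 1] == ".")):
                j += 1
            if j == n:
                raise ValueError("unterminated backtick in path: " + path)
            steps.append(path[i + 1:j])
            i = j + 2  # skip the closing backtick and the dot after it
        else:
            # Plain step: runs up to the next dot; backticks inside are literal.
            j = path.find(".", i)
            if j == -1:
                j = n
            steps.append(path[i:j])
            i = j + 1
    return steps


def lookup(value, steps):
    """Walk nested dicts step by step; return None when a step is missing."""
    current = value
    for step in steps:
        if not isinstance(current, dict) or step not in current:
            return None
        current = current[step]
    return current
```

For example, `lookup({"a": {"b.c": "val"}}, parse_path("a.`b.c`"))` walks the steps `a` and `b.c`.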
Affected SMTs
These SMTs will support nested structures:
...
scenario | input | smt | output |
---|
1. Nested field. |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.Cast$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.spec": "k1:string,parent.child.k2:int64"
} |
|
Code Block |
---|
| {
"k1": "123",
"parent": {
"child": {
"k2": 123
}
}
} |
|
2. Nested field, when field names include dots |
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": "123"
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.Cast$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.spec": "k1:string,`parent.child`.k2:int64"
} |
|
Code Block |
---|
| {
"k1": "123",
"parent.child": {
"k2": 123
}
} |
|
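
The effect of the two Cast scenarios can be sketched on plain dictionaries. This is an illustration under assumptions, not the SMT's code: paths are passed as already-split steps, `cast_nested` and the `CASTS` table are hypothetical names, and only the two target types used above are modelled.

```python
from copy import deepcopy

# Hypothetical mapping from the type names used in the spec to Python types.
CASTS = {"string": str, "int64": int}


def cast_nested(value, spec):
    """Return a copy of value with each (steps, target_type) pair applied."""
    result = deepcopy(value)
    for steps, target in spec:
        parent = result
        for step in steps[:-1]:  # walk down to the parent of the target field
            parent = parent[step]
        parent[steps[-1]] = CASTS[target](parent[steps[-1]])
    return result
```

With the second scenario's input, the spec `[(["k1"], "string"), (["parent.child", "k2"], "int64")]` treats `parent.child` as a single step, which is what the backticks express in the configuration.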
...
scenario | input | smt | output |
---|
1. Nested field. |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.ExtractField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.field": "parent.child.k2"
} |
| |
2. Nested field, when field names include dots |
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": "123"
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.ExtractField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.field": "`parent.child`.k2"
} |
| |
3. Nested field, an object returned. |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.ExtractField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.field": "parent.child"
} |
|
Code Block |
---|
| { "k2": "123" } |
|
...
scenario | input | smt | output |
---|
1. Nested field. |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.HeaderFrom$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.fields": "k1,parent.child.k2",
"transforms.smt1.headers": "k1,k2"
} |
|
Code Block |
---|
| headers:
- k1=123
- k2="123" |
|
2. Nested field, when field names include dots |
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": "123"
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.HeaderFrom$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.fields": "k1,`parent.child`.k2",
"transforms.smt1.headers": "k1,k2"
} |
|
Code Block |
---|
| headers:
- k1=123
- k2="123" |
|
...
scenario | input | smt | output |
---|
1. Nested field. |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.MaskField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.fields": "parent.child.k2"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": ""
}
}
} |
|
2. Nested field, when field names include dots |
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": "123"
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.MaskField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.fields": "`parent.child`.k2"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": ""
}
} |
|
...
scenario | input | smt | output |
---|
1. Nested field. Drop field |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.exclude": "parent.child.k2"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
}
}
} |
|
2. Nested field. Drop struct |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.exclude": "parent.child"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent": {
}
} |
|
3. Nested field. Include field |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123",
"k3": "234"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.include": "parent.child.k2"
} |
|
Code Block |
---|
| {
"parent": {
"child": {
"k2": "123"
}
}
} |
|
4. Nested field. Include struct |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123",
"k3": "234"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.include": "parent.child"
} |
|
Code Block |
---|
| {
"parent": {
"child": {
"k2": "123",
"k3": "234"
}
}
} |
|
5. Nested field, when field names include dots |
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": "123"
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.renames": "`parent.child`.k2:field2"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"field2": "123"
}
} |
|
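
The exclude, include and rename scenarios above can be reproduced with a short sketch on plain dictionaries. It is only an illustration of the expected outputs under stated assumptions: paths arrive as already-split steps, and `exclude_path`, `include_paths` and `rename_field` are hypothetical helper names rather than anything defined by the SMT.

```python
from copy import deepcopy


def _parent_of(value, steps):
    """Return the dict that directly holds the last step of the path."""
    parent = value
    for step in steps[:-1]:
        parent = parent[step]
    return parent


def exclude_path(value, steps):
    """Drop the field or struct at the path; enclosing structs are kept."""
    result = deepcopy(value)
    del _parent_of(result, steps)[steps[-1]]
    return result


def include_paths(value, paths):
    """Keep only the listed paths, rebuilding the parents they need."""
    result = {}
    for steps in paths:
        src, dst = value, result
        for step in steps[:-1]:
            src = src[step]
            dst = dst.setdefault(step, {})
        dst[steps[-1]] = deepcopy(src[steps[-1]])
    return result


def rename_field(value, steps, new_name):
    """Rename the last step of the path in place of the old name."""
    result = deepcopy(value)
    parent = _parent_of(result, steps)
    parent[new_name] = parent.pop(steps[-1])
    return result
```

Excluding `["parent", "child"]` leaves an empty `parent` struct behind, matching the "Drop struct" row, while including `["parent", "child", "k2"]` drops both `k1` and `k3`.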
...
scenario | input | smt | output |
---|
1. Nested field. |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": 1556204536000
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.field": "parent.child.k2",
"transforms.smt1.format": "yyyy-MM-dd",
"transforms.smt1.target.type": "string"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "2019-04-25"
}
}
} |
|
2. Nested field, when field names include dots |
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": 1556204536000
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.field": "`parent.child`.k2",
"transforms.smt1.format": "yyyy-MM-dd",
"transforms.smt1.target.type": "string"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": "2019-04-25"
}
} |
|
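
As a rough check of the conversion itself, the sketch below formats a nested epoch-millisecond value as a date string. It rests on several assumptions: the value is in milliseconds since the epoch, the conversion is done in UTC, and Python's `%Y-%m-%d` stands in for the `yyyy-MM-dd` pattern. `convert_timestamp` is a hypothetical helper, not the SMT's API.

```python
from copy import deepcopy
from datetime import datetime, timezone


def convert_timestamp(value, steps, fmt="%Y-%m-%d"):
    """Return a copy of value with the epoch-millis field at steps formatted."""
    result = deepcopy(value)
    parent = result
    for step in steps[:-1]:  # walk down to the parent of the target field
        parent = parent[step]
    millis = parent[steps[-1]]
    moment = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
    parent[steps[-1]] = moment.strftime(fmt)
    return result


record = {"k1": 123, "parent.child": {"k2": 1556204536000}}
print(convert_timestamp(record, ["parent.child", "k2"]))
# prints {'k1': 123, 'parent.child': {'k2': '2019-04-25'}}
```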
...
scenario | input | smt | output |
---|
1. Nested field. |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.ValueToKey",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.fields": "parent.child.k2"
} |
| |
2. Nested struct to Key. |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.ValueToKey",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.fields": "parent.child"
} |
|
Code Block |
---|
| {
"k2": "123"
} |
|
3. Nested field, when field names include dots |
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": "123"
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.ValueToKey",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.fields": "`parent.child`.k2"
} |
| |
...
scenario | input | smt | output |
---|
1. Nested field. |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.InsertField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.static.field": "parent.child.k3",
"transforms.smt1.static.value": "v3"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123",
"k3": "v3"
}
}
} |
|
2. Nested field, when field names include dots |
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": "123"
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.InsertField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.static.field": "`parent.child`.k3",
"transforms.smt1.static.value": "v3"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": "123",
"k3": "v3"
}
} |
|
3. Nested field with the parent missing |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.InsertField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.static.field": "parent.other.k3",
"transforms.smt1.static.value": "v3"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
},
"other": {
"k3": "v3"
}
}
} |
|
4. Nested field with the parent missing, and ignore is set |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.InsertField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.static.field": "parent.other.k3",
"transforms.smt1.static.value": "v3",
"transforms.smt1.field.on.missing.parent": "ignore"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
5. Nested field that already exists |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.InsertField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.static.field": "parent.child.k2",
"transforms.smt1.static.value": "456"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "456"
}
}
} |
|
6. Nested field that already exists, and ignore is set |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.InsertField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.static.field": "parent.child.k2",
"transforms.smt1.static.value": "456",
"transforms.smt1.field.on.existing.field": "ignore"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
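
The six InsertField scenarios differ only in how a missing parent and an already existing field are handled. The sketch below models that decision on plain dictionaries. It is an assumption-laden illustration: only the `ignore` value appears in the configurations above, so the default names `create` and `overwrite` are hypothetical labels for the behaviour the tables show when nothing is configured, and `insert_static` is a hypothetical helper.

```python
from copy import deepcopy


def insert_static(value, steps, static_value,
                  on_missing_parent="create", on_existing_field="overwrite"):
    """Return a copy of value with static_value set at the nested path."""
    result = deepcopy(value)
    parent = result
    for step in steps[:-1]:
        if step not in parent:
            if on_missing_parent == "ignore":
                return result  # nothing has been modified yet
            parent[step] = {}  # assumed default: create the missing parent
        parent = parent[step]
    leaf = steps[-1]
    if leaf in parent and on_existing_field == "ignore":
        return result  # keep the existing value untouched
    parent[leaf] = static_value
    return result
```

Called with `["parent", "other", "k3"]` and no options, the sketch creates `other` under `parent`; with `on_missing_parent="ignore"` the record comes back unchanged.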
...
scenario | input | smt | output |
---|
1. Nested field. |
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"k2": "123"
}
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.HoistField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.hoisted": "parent.child.k2",
"transforms.smt1.field": "other"
} |
|
Code Block |
---|
| {
"k1": 123,
"parent": {
"child": {
"other": {
"k2": "123"
}
}
}
} |
|
2. Nested struct, when field names include dots |
Code Block |
---|
| {
"k1": 123,
"parent.child": {
"k2": "123"
}
} |
|
Code Block |
---|
| {
"transforms": "smt1",
"transforms.smt1.type": "org.apache.kafka.connect.transforms.HoistField$Value",
"transforms.smt1.field.syntax.version": "v2",
"transforms.smt1.hoisted": "`parent.child`",
"transforms.smt1.field": "other"
} |
|
Code Block |
---|
| {
"k1": 123,
"other": {
"parent.child": {
"k2": "123"
}
}
} |
|
...
Instead of adding a configuration under each field config (e.g. include.syntax.version), the KIP proposes a single configuration per SMT that affects all of its input fields.
Use double dots to escape dots in field names
A double dot is often used in JSONPath as a descendant selector, see https://www.ietf.org/id/draft-ietf-jsonpath-base-05.html
Using it as an escape could therefore confuse users. To avoid this, this KIP proposes the backtick approach.
Potential Improvements (out of scope)
...