Status
Current state: Under Discussion
...
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
The Kafka RPC protocol currently supports a single version number per message type. This number determines which version of the schema should be used when writing or reading the message. While this versioning scheme gives us the flexibility to add new fields to the schema over time, there are many scenarios that it doesn't support well.
...
Another scenario is when we want to attach an extra field to a message in a manner that is orthogonal to the normal versioning scheme. For example, we might want to attach a trace ID, a "forwarded-by" field, or a "user-agent" field. It wouldn't make sense to add all these fields to the message schema on the off chance that someone might use them.
Public Interfaces
JSON Schemas
versionsWithOptional
Each Kafka RPC will have a new top-level version field named "versionsWithOptional". This field will contain a version range such as "1+". All message versions in this range will support optional fields.
As part of this KIP, we will create a new version of all the existing RPCs. This new version will support optional fields.
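As a rough illustration of how such a range could be interpreted, the sketch below parses the two forms that appear in this document ("9+" and "0-9") and checks whether a given message version falls inside. The `VersionRange` class and its method names are hypothetical and are not part of the Kafka code base.

```java
/** Hypothetical helper: a closed or open-ended range of message versions. */
final class VersionRange {
    private final short lowest;
    private final short highest; // Short.MAX_VALUE stands for "no upper bound"

    private VersionRange(short lowest, short highest) {
        this.lowest = lowest;
        this.highest = highest;
    }

    /** Parses "N+" (N and above) or "N-M" (N through M inclusive). */
    static VersionRange parse(String text) {
        String s = text.trim();
        if (s.endsWith("+")) {
            short low = Short.parseShort(s.substring(0, s.length() - 1));
            return new VersionRange(low, Short.MAX_VALUE);
        }
        int dash = s.indexOf('-');
        if (dash < 0) {
            throw new IllegalArgumentException("Unsupported version range: " + text);
        }
        short low = Short.parseShort(s.substring(0, dash));
        short high = Short.parseShort(s.substring(dash + 1));
        if (low > high) {
            throw new IllegalArgumentException("Empty version range: " + text);
        }
        return new VersionRange(low, high);
    }

    /** Returns true if the given message version lies inside this range. */
    boolean contains(short version) {
        return version >= lowest && version <= highest;
    }
}
```

For a message with "validVersions" of "0-9" and "versionsWithOptional" of "9+", this check would report that versions 0 through 8 carry no optional fields and version 9 does.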
Specifying Optional Fields
Optional fields can appear at the top level of a message, or inside any structure.
...
```json
{
  "apiKey": 9000,
  "type": "response",
  "name": "FooResponse",
  "validVersions": "0-9",
  "versionsWithOptional": "9+",
  "optionalFields": [
    { "name": "UserAgent", "type": "string", "nullable": true, "tag": "0x0001",
      "about": "The user-agent that sent this request." }
  ],
  "fields": [
    { "name": "Foos", "type": "[]Foo", "versions": "0+",
      "about": "Each foo.",
      "optionalFields": [
        { "name": "Bar", "type": "string", "nullable": false, "tag": "0x0001", "default": "hello world",
          "about": "The bar associated with this foo, if any." }
      ],
      "fields": [
        { "name": "Baz", "type": "int16", "versions": "0+",
          "about": "The baz associated with this foo." },
        ...
      ]
    }
  ]
}
```
Schema Class
We will add a new constructor to the org.apache.kafka.common.protocol.types.Schema class which will support optional fields.
```java
/**
 * Construct the schema with a given list of its field values
 *
 * @param optionalFields The optional fields for this schema.
 * @param fields         The mandatory fields of this schema.
 *
 * @throws SchemaException If the given list has duplicate fields
 */
public Schema(Map<Short, Field> optionalFields, Field... fields);
```
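To make the shape of this constructor more concrete, here is a minimal self-contained sketch. `SketchField` and `SketchSchema` are hypothetical stand-ins for the real `Field` and `Schema` classes, `IllegalArgumentException` stands in for `SchemaException`, and the choices to key the map by tag and to treat optional and mandatory field names as one namespace are assumptions made for illustration only.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical stand-in for org.apache.kafka.common.protocol.types.Field. */
final class SketchField {
    final String name;
    final String typeName;

    SketchField(String name, String typeName) {
        this.name = name;
        this.typeName = typeName;
    }
}

/** Hypothetical stand-in for Schema, showing only the proposed constructor shape. */
final class SketchSchema {
    private final Map<Short, SketchField> optionalFields;
    private final SketchField[] fields;

    SketchSchema(Map<Short, SketchField> optionalFields, SketchField... fields) {
        // Assumption: mandatory and optional field names must all be distinct.
        Set<String> names = new HashSet<>();
        for (SketchField f : fields) {
            if (!names.add(f.name)) {
                throw new IllegalArgumentException("Duplicate field: " + f.name);
            }
        }
        for (SketchField f : optionalFields.values()) {
            if (!names.add(f.name)) {
                throw new IllegalArgumentException("Duplicate field: " + f.name);
            }
        }
        // A sorted copy keeps the tags in ascending order.
        this.optionalFields = Collections.unmodifiableMap(new TreeMap<>(optionalFields));
        this.fields = fields.clone();
    }

    /** Returns the optional field registered under the tag, or null if there is none. */
    SketchField optionalField(short tag) {
        return optionalFields.get(tag);
    }

    int mandatoryFieldCount() {
        return fields.length;
    }
}
```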
Proposed Changes
Serializing Optional Fields
An "optional field buffer" contains a sequence of optional fields. The fields must appear in ascending order, from the lowest-valued tag to the highest-valued tag.
Each entry in the buffer contains a field length, followed by a two-byte tag, followed by the field itself.
Field | Type |
---|---|
Field Length | VARINT |
Field Tag | INT16 |
Field Value | <FIELD TYPE> |
The sequence of optional fields is terminated by an entry with a field length of 0. The terminating entry will not contain a tag or value.
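The layout above can be sketched as a small writer. This is an illustration under stated assumptions, not the proposed implementation: `OptionalFieldBufferWriter` is a hypothetical class, VARINT is assumed to be a zigzag-encoded variable-length integer, the tag is assumed to be big-endian, and the field length is assumed to count the two tag bytes plus the value bytes (so that a real entry can never have length 0 and be confused with the terminator). The text does not specify what the length covers.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical writer for the optional field buffer layout described above. */
final class OptionalFieldBufferWriter {

    /** Writes a zigzag-encoded varint (assumed here to be what VARINT means). */
    static void writeVarint(int value, ByteArrayOutputStream out) {
        int v = (value << 1) ^ (value >> 31); // zigzag: small magnitudes stay small
        while ((v & 0xFFFFFF80) != 0) {
            out.write((v & 0x7F) | 0x80);     // low 7 bits with the continuation bit set
            v >>>= 7;
        }
        out.write(v);
    }

    /**
     * Serializes already-encoded field values keyed by tag.
     * Assumption: the length counts the two tag bytes plus the value bytes.
     */
    static byte[] write(Map<Short, byte[]> valuesByTag) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // A TreeMap iterates its keys in ascending order, as the format requires.
        Map<Short, byte[]> sorted = new TreeMap<>(valuesByTag);
        for (Map.Entry<Short, byte[]> entry : sorted.entrySet()) {
            short tag = entry.getKey();
            byte[] value = entry.getValue();
            writeVarint(2 + value.length, out);
            out.write((tag >>> 8) & 0xFF);    // INT16 tag, high byte first
            out.write(tag & 0xFF);
            out.write(value, 0, value.length);
        }
        writeVarint(0, out);                  // terminator: length 0, no tag, no value
        return out.toByteArray();
    }
}
```

Under these assumptions, calling `write` with an empty map yields a single zero byte, which is the case where no optional fields are present.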
Requests and Responses
All requests and responses will begin with an optional field buffer. If there are no optional fields, the buffer consists of a single zero byte.
Structures
All structures will begin with an optional field buffer. When no optional fields are present, the buffer consists of a single zero byte.
Compatibility, Deprecation, and Migration Plan
As mentioned earlier, existing request versions will not be changed to support optional fields. However, new versions will have this support going forward.
Adding or removing an optional field is a compatible operation, provided that we don't reuse a tag that was used for something else in a previous release. Reusing a tag in this way is an incompatible change, as is changing the type or nullability of an existing optional field.
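These rules could be checked mechanically when a schema changes. The sketch below is one possible shape for such a check; `OptionalFieldSpec` and `OptionalFieldCompatibility` are hypothetical names, and treating a change of field name under the same tag as "reuse for something else" is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical description of one optional field in a released schema. */
final class OptionalFieldSpec {
    final String name;
    final String type;
    final boolean nullable;

    OptionalFieldSpec(String name, String type, boolean nullable) {
        this.name = name;
        this.type = type;
        this.nullable = nullable;
    }
}

/** Hypothetical checker for the compatibility rules stated above. */
final class OptionalFieldCompatibility {

    /** Returns the problems found when moving from oldFields to newFields (both keyed by tag). */
    static List<String> check(Map<Short, OptionalFieldSpec> oldFields,
                              Map<Short, OptionalFieldSpec> newFields) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<Short, OptionalFieldSpec> entry : newFields.entrySet()) {
            OptionalFieldSpec before = oldFields.get(entry.getKey());
            OptionalFieldSpec after = entry.getValue();
            if (before == null) {
                continue; // a tag that was never used before is a compatible addition
            }
            if (!before.name.equals(after.name)) {
                problems.add("tag " + entry.getKey() + " reused for a different field");
            } else if (!before.type.equals(after.type)) {
                problems.add("type of " + after.name + " changed");
            } else if (before.nullable != after.nullable) {
                problems.add("nullability of " + after.name + " changed");
            }
        }
        // Tags present only in oldFields were removed, which is compatible.
        return problems;
    }
}
```

To catch reuse of a tag that was removed several releases ago, `oldFields` would need to contain every tag that has ever been released, not only the tags in the immediately preceding version.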
Rejected Alternatives
Optional Field Buffer Serialization Alternatives
- We could serialize optional fields as a tag and a type, rather than a tag and a length. However, this would prevent us from adding new types in the future, since the old deserializers would not understand them.
- We could allow the serialization of arrays of objects. However, this would require a two-pass serialization rather than a single-pass serialization. The first pass would have to cache the lengths of all the optional object arrays. We might support this eventually, but for now, it doesn't seem necessary. We can add it later in a compatible fashion if we decide to.
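The first alternative above turns on whether an old deserializer can step over fields it does not recognize. With a length prefix, that skipping might look like the following sketch. `OptionalFieldBufferReader` is a hypothetical class, and the same assumptions apply as elsewhere in this illustration: the length is a zigzag-encoded varint, the tag is big-endian, and the length counts the two tag bytes plus the value bytes.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical reader showing how a length prefix lets old readers skip unknown tags. */
final class OptionalFieldBufferReader {

    /** Reads a zigzag-encoded varint (assumed encoding of the field length). */
    static int readVarint(ByteBuffer in) {
        int raw = 0;
        int shift = 0;
        int b;
        do {
            if (shift > 28) {
                throw new IllegalArgumentException("Varint is too long");
            }
            b = in.get() & 0xFF;
            raw |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (raw >>> 1) ^ -(raw & 1); // undo the zigzag mapping
    }

    /**
     * Returns the raw value bytes of every field whose tag is in knownTags.
     * Assumption: the length counts the two tag bytes plus the value bytes.
     */
    static Map<Short, byte[]> read(ByteBuffer in, Set<Short> knownTags) {
        Map<Short, byte[]> result = new LinkedHashMap<>();
        while (true) {
            int length = readVarint(in);
            if (length == 0) {
                return result;                // terminating entry
            }
            if (length < 2) {
                throw new IllegalArgumentException("Entry too short: " + length);
            }
            short tag = in.getShort();        // ByteBuffer is big-endian by default
            int valueLength = length - 2;
            if (knownTags.contains(tag)) {
                byte[] value = new byte[valueLength];
                in.get(value);
                result.put(tag, value);
            } else {
                // Unknown tag: advance past the value without interpreting it.
                in.position(in.position() + valueLength);
            }
        }
    }
}
```

The reader never needs to know the type of an unknown field, only its length, which is why a new field type would not break it.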
Make All Fields Optional
Rather than supporting both mandatory and optional fields, we could make all fields optional. For fields which we always expect to use, however, this would take more space when serialized. There are also situations where it is useful for the recipient to know which features the sender supports, and the mandatory field mechanism handles these situations well.