Status
Current state: "Draft"
Discussion thread: here
JIRA: here
Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast).
Motivation
A schema registry allows producers to define a schema for the messages they send to Kafka, and consumers to apply that schema when reading them. This benefits all clients, from streaming to batch use cases, because they can understand the schema of a message and act on it accordingly.
Design - Approach - 1
Adding a Schema for a topic
Users can use the kafka-topics.sh tool to add a schema for a topic. The schema will be part of the topic config.
Each schema carries a version, and as the schema evolves Kafka will increment the version.
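The versioning rule above can be illustrated with a minimal sketch. It is not part of the proposal: the class TopicSchemas, its method names, and the choice to start versions at 1 are all hypothetical, and the schema is held as an opaque string because this page does not fix a schema format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of per-topic schema versions; not a Kafka API.
class TopicSchemas {
    // topic -> schema definitions; list index i holds version i + 1 (assumption)
    private final Map<String, List<String>> schemasByTopic = new HashMap<>();

    // Stores a new schema for the topic and returns the version assigned to it.
    int addSchema(String topic, String schemaDefinition) {
        List<String> versions = schemasByTopic.computeIfAbsent(topic, t -> new ArrayList<>());
        versions.add(schemaDefinition);
        return versions.size();
    }

    // Returns the highest version stored for the topic, or 0 if there is none.
    int latestVersion(String topic) {
        List<String> versions = schemasByTopic.get(topic);
        return versions == null ? 0 : versions.size();
    }

    // Returns the schema stored under the given version of the topic.
    String schema(String topic, int version) {
        List<String> versions = schemasByTopic.get(topic);
        if (versions == null || version < 1 || version > versions.size()) {
            throw new IllegalArgumentException(
                "No schema version " + version + " for topic " + topic);
        }
        return versions.get(version - 1);
    }
}
```

In this sketch older versions stay readable after the schema evolves, which is what lets a consumer later look up the exact version a message was written with.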
Querying for a Schema
The schema will be part of TopicMetadataRequest, and the Java producer and consumer clients will hold it in their local cache after a successful TopicMetadataRequest.
Other clients can send a SchemaRequest. A SchemaRequest will contain one or more topics, along with the schema version they would like to access.
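As a rough illustration of what a SchemaRequest might carry, here is a sketch of its payload as a plain Java object. The proposal does not define the wire format, so the method names addTopic and topics, and the assumption of one requested version per topic, are placeholders rather than a protocol definition.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical shape of the proposed SchemaRequest; names are assumptions.
class SchemaRequest {
    // topic -> schema version the client wants, kept in insertion order
    private final Map<String, Integer> versionByTopic = new LinkedHashMap<>();

    // Adds a topic and the schema version requested for it.
    SchemaRequest addTopic(String topic, int schemaVersion) {
        if (schemaVersion < 1) {
            throw new IllegalArgumentException("schema version must be >= 1");
        }
        versionByTopic.put(topic, schemaVersion);
        return this;
    }

    // Read-only view of the requested topics and versions.
    Map<String, Integer> topics() {
        return Collections.unmodifiableMap(versionByTopic);
    }
}
```

A client could build a request for several topics with `new SchemaRequest().addTopic("orders", 2).addTopic("payments", 1)`.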
Serializers, Deserializers
Once the producer has a serializer plugged in, the serializer will use this schema to validate and serialize the message, and the producer will add the schema version to the message.
Similarly, on the consumer side, the deserializer applies the schema that matches the schema.version carried by the message.
If the message doesn't adhere to the schema, then the serializer / deserializer throws an exception.
class InvalidMessageException {}
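The serialize-and-tag flow described above could look roughly like the following sketch. Several things here are assumptions made only for illustration: the schema version is written as a 4-byte prefix in front of the payload, the schema check is a simple Predicate standing in for real schema validation, the exception is unchecked, and VersionedStringSerializer is a hypothetical class that does not implement Kafka's Serializer interface.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.Predicate;

// Assumption: modelled as an unchecked exception for this sketch.
class InvalidMessageException extends RuntimeException {
    InvalidMessageException(String message) {
        super(message);
    }
}

// Hypothetical serializer that validates a value and prefixes the schema version.
class VersionedStringSerializer {
    private final int schemaVersion;
    private final Predicate<String> schemaCheck; // stand-in for real schema validation

    VersionedStringSerializer(int schemaVersion, Predicate<String> schemaCheck) {
        this.schemaVersion = schemaVersion;
        this.schemaCheck = schemaCheck;
    }

    // Validates the value, then returns [4-byte version][UTF-8 payload].
    byte[] serialize(String value) {
        if (!schemaCheck.test(value)) {
            throw new InvalidMessageException(
                "Message does not match schema version " + schemaVersion);
        }
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Integer.BYTES + payload.length)
                .putInt(schemaVersion)
                .put(payload)
                .array();
    }

    // Consumer side: read the version first to decide which schema to apply.
    static int readVersion(byte[] message) {
        return ByteBuffer.wrap(message).getInt();
    }

    // Consumer side: decode the payload that follows the version prefix.
    static String readPayload(byte[] message) {
        return new String(message, Integer.BYTES,
                message.length - Integer.BYTES, StandardCharsets.UTF_8);
    }
}
```

Carrying the version inside each message lets a consumer deserialize older messages correctly even after the topic's schema has moved to a higher version.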
Configuration
Users need to set schema.version for each topic, and clients will apply the schema with that version.
If no schema is available, then we will throw the following exception
class SchemaNotAvailableException {}
Design - Approach - 2
Schema evolution
Advantages
Disadvantages