Table of Contents |
---|
Introduction
This is a draft document. We hope to turn it into a useful guide to implementing a Kafka client, so any feedback on how to make it more useful would be much appreciated. The 0.8 release has not yet happened so there may yet be changes in the protocol described, though we hope to minimize them. We feel this is stable enough that people can build against it without major disruption.
Introduction
This document covers the protocol implemented in Kafka 0.8 and beyond. It is meant to give a readable guide to the protocol that covers the available requests, their binary format, and the proper way to make use of them to implement a client. This document assumes you understand the basic design and terminology described here.
The protocol used in 0.7 and earlier is similar to this, but we chose to make a one document covers the protocol implemented in Kafka 0.8 and beyond. It is meant to give a readable guide to the protocol that covers the available requests, their binary format, and the proper way to make use of them to implement a client. This document assumes you understand the basic design and terminology described here.The protocol used in 0.7 and earlier is similar to this, but we chose to make a one time (we hope) break in compatibility to be able to clean up cruft and generalize things.
Overview
The Kafka protocol is fairly simple, there are only four client requests APIs.
...
Each of these will be described in detail below.
Preliminaries
Network
Kafka uses a binary protocol over TCP. The protocol defines all apis as request response message pairs. All messages are size delimited and are made up of the following primitive types.
...
Field | Description |
---|---|
MessageSize | The MessageSize field gives the size of the subsequent request or response message in bytes. The client can read requests by first reading this 4 byte size as an integer N, and then reading and parsing the subsequent N bytes of the request. |
...
Requests
...
Requests all have the following formatA request looks like this:
Code Block |
---|
RequestMessage => ApiKey ApiVersion CorrelationId ClientId RequestMessage ApiKey => int16 ApiVersion => int16 CorrelationId => int32 ClientId => string RequestMessage => MetadataRequest | ProduceRequest | FetchRequest | OffsetRequest |
Field | Description | |
---|---|---|
ApiKey | This is a numeric id for the API being invoked (i.e. is it a metadata request, a produce request, a fetch request, etc). | |
ApiVersion | This is a numeric version number for this api. We version each API and this version number allows the server to properly interpret the request as the protocol evolves. Responses will always be in the format corresponding to the request version. Currently the supported version for all APIs is 0. | |
CorrelationId | CorrelationId | This is a user-supplied integer. It will be passed back in the response by the server, unmodified. It is useful for matching request and response between the client and server. |
ClientId | This is a user supplied identifier for the client application. The user can use any identifier they like and it will be used when logging errors, monitoring aggregates, etc. |
The various request and response messages will be described below.And the response:
...
Responses
...
Code Block |
---|
Response => ResponseVersion CorrelationId ResponseMessage CorrelationId => int32 ResponseMessage => MetadataResponse | ProduceResponse | FetchResponse | OffsetResponse |
...
Code Block |
---|
MetadataRequest => [TopicName] TopicName => string |
Metadata Response
Field | Description |
---|---|
TopicName | The topics to produce metadata for. If empty the request will yield metadata for all topics. |
Metadata Response
The response contains metadata for each partition, with partitions grouped together by topic. This metadata refers to brokers by their broker id. The brokers each have a host and port.
Code Block |
---|
MetadataResponse => [TopicMetadata][Broker]
TopicMetadata => |
Code Block |
MetadataResponse => [TopicMetadata][Brokers] TopicMetadata => TopicErrorCode TopicName [PartitionMetadata] PartitionMetadata => PartitionErrorCode PartitionId LeaderExists Leader [Replica]Replicas [Isr] PartitionErrorCode => int16 PartitionId => int32 LeaderExists => int8 Leader => int32 Replicas => [int32] Isr => [int32] Broker => NodeId Host Port NodeId => int32 Host => string Port => int32 |
Field | Description |
---|---|
Leader | The node id for the kafka broker currently acting as leader for this partition. If no leader exists because we are in the middle of a leader election this id will be -1. |
Replicas | The set of alive nodes that currently acts as slaves for the leader for this partition. |
Isr | The set subset of the replicas that are "caught up" to the leader |
Broker | The node id, hostname, and port information for a kafka broker |
Produce API
The produce API is used to send message sets to the server. For efficiency it allows sending message sets intended for many topic partitions in a single request.
...