Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

int8, int16, int32, int64, uint8, uint16, uint32, uint64 - Integers  - Signed integers with the given precision (in bits) stored in big endian order. unitXX variants are unsigned and have double the range.

Variable Length Primitives

...

Wiki Markup
This is a notation for handling repeated structures. These will always be encoded as an uint32int32 size containing the length N followed by N repetitions of the structure which can itself be made up of other primitive types. In the BNF grammars below we will show an array of a structure foo as \[foo\].

...

Code Block
RequestOrResponse => MessageSize (RequestMessage | ResponseMessage)
  MessageSize => uint32int32

Field

Description

MessageSize

The MessageSize field gives the size of the subsequent request or response message in bytes. The client can read requests by first reading this 4 byte size as an integer N, and then reading and parsing the subsequent N bytes of the request.

...

Code Block
RequestMessage => ApiKey ApiVersion CorrelationId ClientId RequestMessage
  ApiKey => uint16int16
  ApiVersion => uint16int16
  CorrelationId => int32
  ClientId => string
  RequestMessage => MetadataRequest | ProduceRequest | FetchRequest | OffsetRequest

...

Code Block
Response => ResponseVersion CorrelationId ResponseMessage
ResponseVersion => int16
CorrelationId => int32
ResponseMessage => MetadataResponse | ProduceResponse | FetchResponse | OffsetResponse

Field

Description

ResponseVersion

Indicates the version of the version of the response object.

CorrelationId

The server passes back whatever integer the client supplied as the correlation in the request.

...

Code Block
MetadataResponse => [TopicMetadata]
  TopicMetadata => TopicErrorCode TopicName [PartitionMetadata]
  PartitionMetadata => PartitionErrorCode PartitionId LeaderExists Leader Replicas Isr
  PartitionErrorCode => int16
  PartitionId => unit32int32
  LeaderExists => int8
  Leader => Broker
  Replicas => [Broker]
  Isr => [Broker]
  Broker => NodeId CreatorId Host Port
  NodeId => uint32int32
  CreatorId => string
  Host => string
  Port => uint32int32

Produce API

The produce API is used to send message sets to the server. For efficiency it allows sending message sets intended for many topic partitions in a single request.

...

Code Block
ProduceRequest => RequiredAcks Timeout [TopicName [Partition MessageSetSize MessageSet]]
  RequiredAcks => int16
  Timeout => uint32int32
  Partition => uint32int32
  MessageSetSize => uint32int32

Field

Description

RequiredAcks

This field indicates how many acknowledgements the servers should receive before responding to the request. If it is 0 the server responds immediately prior to even writing the data to disk. If it is 1 the data is written to the local machine only with no blocking on replicas. If it is -1 the server will block until the message is committed by all in sync replicas. For any number > 1 the server will block waiting for this number of acknowledgements to occur (but the server will never wait for more acknowledgements than there are in-sync replicas).

Timeout

This provides a maximum time the server can await the receipt of the number of acknowledgements in RequiredAcks. The timeout is not an exact limit on the request time for a few reasons: (1) it does not include network latency, (2) the timer begins at the beginning of the processing of this request so if many requests are queued due to server overload that wait time will not be included, (3) we will not terminate a local write so if the local write time exceeds this timeout it will not be respected. To get a hard timeout of this type the client should use the socket timeout.

TopicName

The topic that data is being published to.

Partition

The partition that data is being published to.

MessageSetSize

The size, in bytes, of the message set that follows.

MessageSet

A set of messages in the standard format described above.

...

Code Block
ProduceResponse => [TopicName [Partition ErrorCode Offset]]
  TopicName => string
  Partition => unit32int32
  ErrorCode => int16
  Offset => int64

...

Code Block
FetchRequest => ReplicaId MaxWaitTime MinBytes [TopicName [Partition FetchOffset MaxBytes]]
  ReplicaId => int32
  MaxWaitTime => uint32int32
  MinBytes => uint32int32
  TopicName => string
  Partition => uint32int32
  FetchOffset => int64
  MaxBytes => uint32int32

Field

Description

ReplicaId

The replica id indicates the node id of the replica initiating this request. Normal client consumers should always specify this as -1 as they have no node id. Other brokers set this to be their own node id. The value -2 is accepted to allow a non-broker to issue fetch requests as if it were a replica broker for debugging purposes.

MaxWaitTime

The max wait time is the maximum amount of time to block waiting if insufficient data is available at the time the request is issued.

MinBytes

This is the minimum number of bytes of messages that must be available to give a response. If the client sets this to 0 the server will always respond immediately, however if there is no new data since their last request they will just get back empty message sets. If this is set to 1, the server will respond as soon as at least one partition has at least 1 byte of data or the specified timeout occurs. By setting higher values in combination with the timeout the consumer can tune for throughput and trade a little additional latency for reading only large chunks of data (e.g. setting MaxWaitTime to 100 ms and setting MinBytes to 64k would allow the server to wait up to 100ms to try to accumulate 64k of data before responding).

TopicName

The name of the topic.

Partition

The id of the partition the fetch is for.

FetchOffset

The offset to begin this fetch from.

MaxBytes

The maximum bytes to include in the message set for this partition. This helps bound the size of the response.

...

Code Block
FetchResponse => [TopicName [Partition ErrorCode FetchedOffset HighwaterMarkOffset MessageSetSize MessageSet]]
  TopicName => string
  Partition => unit32int32
  ErrorCode => int16
  HighwaterMarkOffset => int64
  MessageSetSize => int32

...

Code Block
OffsetRequest => [TopicName [Partition Time MaxNumberOfOffsets]]
  TopicName => string
  Partition => uint32int32
  Time => uint64int64
  MaxNumberOfOffsets => int32

...