...

Variable Length Primitives

bytes, string - These types consist of a signed integer giving a length N followed by N bytes of content. A length of -1 indicates null. string uses an int16 for its size, and bytes uses an int32.
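
As a rough illustration of the length-prefixed layout described above, the following Python sketch encodes and decodes the two types. The function names are hypothetical, and two details are assumptions not stated in this section: integers are written in big-endian (network) byte order, and string content is UTF-8.

```python
import struct

def encode_bytes(data):
    """Encode `bytes`: int32 length N, then N bytes. None is written as length -1."""
    if data is None:
        return struct.pack(">i", -1)
    return struct.pack(">i", len(data)) + data

def encode_string(text):
    """Encode `string`: int16 length N, then N bytes (assumed UTF-8). None is -1."""
    if text is None:
        return struct.pack(">h", -1)
    raw = text.encode("utf-8")
    return struct.pack(">h", len(raw)) + raw

def decode_bytes(buf, pos=0):
    """Return (value, next_position); value is None for a length of -1."""
    (n,) = struct.unpack_from(">i", buf, pos)
    pos += 4
    if n < 0:
        return None, pos
    return buf[pos:pos + n], pos + n
```

Returning the next read position from the decoder makes it easy to chain decoders when several fields follow one another in a buffer.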

Arrays

This is a notation for handling repeated structures. These will always be encoded as a uint32 size containing the length N followed by N repetitions of the structure, which can itself be made up of other primitive types. In the BNF grammars below we will show an array of a structure foo as [foo].
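
A minimal sketch of the array notation, assuming big-endian byte order (not stated in this section). The helper names are hypothetical; the per-item encoder and decoder are passed in so the same helpers work for any structure foo.

```python
import struct

def encode_array(items, encode_item):
    """Write a uint32 count N followed by N encoded items."""
    parts = [struct.pack(">I", len(items))]
    parts.extend(encode_item(item) for item in items)
    return b"".join(parts)

def decode_array(buf, pos, decode_item):
    """Read a uint32 count N, then N items; return (items, next_position)."""
    (n,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    items = []
    for _ in range(n):
        item, pos = decode_item(buf, pos)
        items.append(item)
    return items, pos

# Example item codec: a plain int32.
def encode_int32(value):
    return struct.pack(">i", value)

def decode_int32(buf, pos):
    return struct.unpack_from(">i", buf, pos)[0], pos + 4
```

For example, `encode_array([1, 2], encode_int32)` writes a count of 2 followed by the two int32 values, and `decode_array` reverses it.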

...

Code Block
Message => Crc MagicByte Attributes Key Value
  Crc => int32
  MagicByte => int8
  Attributes => int8
  Key => bytes
  Value => bytes

Field

Description

Offset

This is the offset used in Kafka as the log sequence number. When the producer is sending messages it doesn't actually know the offset and can fill in any value it likes here.

Crc

The CRC is the CRC32 of the remainder of the message bytes. This is used to check the integrity of the message on the broker and consumer.

MagicByte

This is a version id used to allow backwards compatible evolution of the message binary format.

Attributes

This byte holds metadata attributes about the message. In particular the last 3 bits contain the compression codec used for the message.

Key

The key is an optional message key that was used for partition assignment. The key can be null.

Value

The value is the actual message contents as an opaque byte array. Kafka supports recursive messages, in which case this may itself contain a message set.
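
The Message layout and the field descriptions above can be sketched roughly as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, byte order is assumed to be big-endian, the default magic value of 0 is a placeholder, and "the last 3 bits" is read as the three lowest-order bits of the Attributes byte.

```python
import struct
import zlib

# Assumption: "last 3 bits" means the three lowest-order bits.
COMPRESSION_MASK = 0x07

def _encode_bytes(data):
    """int32 length then content; None (a null key or value) is length -1."""
    if data is None:
        return struct.pack(">i", -1)
    return struct.pack(">i", len(data)) + data

def encode_message(value, key=None, magic=0, attributes=0):
    """Build Crc MagicByte Attributes Key Value."""
    body = struct.pack(">bb", magic, attributes) + _encode_bytes(key) + _encode_bytes(value)
    # The CRC covers everything after the Crc field itself. Packing the
    # unsigned value yields the same four bytes as the signed int32 would.
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack(">I", crc) + body

def crc_ok(message):
    """Recompute the CRC32 over the remainder and compare with the stored one."""
    (stored,) = struct.unpack_from(">I", message, 0)
    return stored == (zlib.crc32(message[4:]) & 0xFFFFFFFF)

def compression_codec(message):
    """Extract the codec bits from Attributes (byte 5: after Crc and MagicByte)."""
    return message[5] & COMPRESSION_MASK
```

A consumer would typically call something like `crc_ok` before trusting the key and value, and use the codec bits to decide whether the value needs decompressing.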

...

Code Block
FetchRequest => ReplicaId MaxWaitTime MinBytes [TopicName [Partition FetchOffset MaxBytes]]
  ReplicaId => int32
  MaxWaitTime => uint32
  MinBytes => uint32
  TopicName => string
  Partition => uint32
  FetchOffset => int64
  MaxBytes => uint32
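
To make the nesting in the grammar above concrete, here is a hedged sketch that serializes only the FetchRequest body (no request header, which is outside this section). The function name and the dictionary shape of `topics` are hypothetical; big-endian byte order and UTF-8 topic names are assumptions.

```python
import struct

def encode_fetch_request(replica_id, max_wait_time, min_bytes, topics):
    """Serialize a FetchRequest body.

    topics maps a topic name to a list of
    (partition, fetch_offset, max_bytes) tuples.
    """
    # ReplicaId (int32), MaxWaitTime (uint32), MinBytes (uint32)
    parts = [struct.pack(">iII", replica_id, max_wait_time, min_bytes)]
    # Outer array: uint32 count of topics
    parts.append(struct.pack(">I", len(topics)))
    for name, partitions in topics.items():
        raw = name.encode("utf-8")
        # TopicName is a string: int16 length then content
        parts.append(struct.pack(">h", len(raw)) + raw)
        # Inner array: uint32 count of partitions
        parts.append(struct.pack(">I", len(partitions)))
        for partition, fetch_offset, max_bytes in partitions:
            # Partition (uint32), FetchOffset (int64), MaxBytes (uint32)
            parts.append(struct.pack(">IqI", partition, fetch_offset, max_bytes))
    return b"".join(parts)
```

With illustrative values, `encode_fetch_request(-1, 100, 1, {"t": [(0, 5, 1024)]})` produces a 39-byte body: 12 bytes of fixed fields, a 4-byte topic count, a 3-byte topic name, a 4-byte partition count, and one 16-byte partition entry.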

...

Code Block
FetchResponse => [TopicName [Partition ErrorCode FetchedOffset HighwaterMarkOffset MessageSetSize MessageSet]]
  TopicName => string
  Partition => uint32
  ErrorCode => int16
  FetchedOffset => uint64
  HighwaterMarkOffset => int64
  MessageSetSize => int32

Field

Description

TopicName

The name of the topic this response entry is for.

Partition

The id of the partition this response is for.

FetchedOffset

The offset from which the fetch began.

HighwaterMarkOffset

The offset at the end of the log for this partition. The client can use this to determine how many messages behind the end of the log it is.

MessageSetSize

The size in bytes of the message set for this partition.

MessageSet

The message data fetched from this partition, in the format described above.
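
A rough sketch of walking a FetchResponse body using the field widths from the grammar above. The function name and the dictionary keys are hypothetical, big-endian byte order and UTF-8 topic names are assumptions, and the MessageSet is returned as raw bytes rather than parsed.

```python
import struct

# Partition (uint32), ErrorCode (int16), FetchedOffset (uint64),
# HighwaterMarkOffset (int64), MessageSetSize (int32)
PARTITION_HEADER = ">IhQqi"

def decode_fetch_response(buf):
    """Return one dict per (topic, partition) entry in the response body."""
    pos = 0
    (topic_count,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    rows = []
    for _ in range(topic_count):
        (name_len,) = struct.unpack_from(">h", buf, pos)
        pos += 2
        topic = buf[pos:pos + name_len].decode("utf-8")
        pos += name_len
        (partition_count,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        for _ in range(partition_count):
            partition, error, fetched, highwater, size = struct.unpack_from(
                PARTITION_HEADER, buf, pos)
            pos += struct.calcsize(PARTITION_HEADER)
            # MessageSetSize says how many bytes of message data follow.
            message_set = buf[pos:pos + size]
            pos += size
            rows.append({
                "topic": topic,
                "partition": partition,
                "error_code": error,
                "fetched_offset": fetched,
                "highwater_mark_offset": highwater,
                "message_set": message_set,
            })
    return rows
```

A client could compare `highwater_mark_offset` with its own position in the partition to estimate how far behind the end of the log it is.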

...

Code Block
OffsetResponse => [TopicName [PartitionOffsets]]
  PartitionOffsets => Partition ErrorCode [Offset]
  Partition => int32
  ErrorCode => int16
  Offset => int64
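
The OffsetResponse grammar nests an array of offsets inside each partition entry. The sketch below shows one way to read it; the function name and return shape are hypothetical, and it assumes big-endian byte order and that TopicName is a string as in the other grammars.

```python
import struct

def decode_offset_response(buf):
    """Return {(topic, partition): (error_code, [offsets])}."""
    pos = 0
    (topic_count,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    result = {}
    for _ in range(topic_count):
        (name_len,) = struct.unpack_from(">h", buf, pos)
        pos += 2
        topic = buf[pos:pos + name_len].decode("utf-8")
        pos += name_len
        (partition_count,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        for _ in range(partition_count):
            # Partition (int32), ErrorCode (int16), then the uint32 count
            # that prefixes the [Offset] array.
            partition, error, offset_count = struct.unpack_from(">ihI", buf, pos)
            pos += 10
            # Each Offset is an int64.
            offsets = list(struct.unpack_from(">%dq" % offset_count, buf, pos))
            pos += 8 * offset_count
            result[(topic, partition)] = (error, offsets)
    return result
```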

Api Keys

The following are the numeric codes that the ApiKey in the request can take for each of the above request types.

API name

ApiKey Value

ProduceRequest

0

FetchRequest

1

OffsetRequest

2

MetadataRequest

3
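
A client might mirror the table above as an enumeration so that request types are not scattered through the code as bare numbers. The class and member names here are hypothetical; only the numeric values come from the table.

```python
from enum import IntEnum

class ApiKey(IntEnum):
    """Numeric ApiKey values for the request types listed above."""
    PRODUCE_REQUEST = 0
    FETCH_REQUEST = 1
    OFFSET_REQUEST = 2
    METADATA_REQUEST = 3

def api_name(code):
    """Map a numeric ApiKey back to its name; raises ValueError if unknown."""
    return ApiKey(code).name
```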

Error Codes

We use numeric codes to indicate what problem occurred on the server. The client can translate these into exceptions or whatever error-handling mechanism is appropriate in the client language. Here is a table of the error codes currently in use:

...