Comment: Update for message format changes from KAFKA-4816

...

MessageSet => 
  FirstOffset => int64
  Length => int32
  PartitionLeaderEpoch => int32 /* Added for KIP-101 */
  Magic => int8  /* bump up to “2” */
  CRC => int32 /* CRC32C which covers everything from Attributes on */
  Attributes => int16
  LastOffsetDelta => int32 {NEW}
  FirstTimestamp => int64 {NEW}
  MaxTimestamp => int64 {NEW}
  PID => int64 {NEW}
  ProducerEpoch => int16 {NEW}
  FirstSequence => int32 {NEW}
  Messages => [Message]

Message => {ALL FIELDS NEW}
  Length => varint
  Attributes => int8
  TimestampDelta => varint
  OffsetDelta => varint
  KeyLen => varint
  Key => data
  ValueLen => varint
  Value => data

  Headers => [Header] /* See KIP-82. Note the array uses a varint for the number of headers. */
 
Header => HeaderKey HeaderVal
  HeaderKeyLen => varint
  HeaderKey => string
  HeaderValueLen => varint
  HeaderValue => data

 

Storing some fields only at the message set level conserves considerable space when batching messages into a message set. For example, there is no need to write the PID within each message, since it is always the same for all messages within a message set. In addition, separating the message-level format from the message set format lets us use variable-length types for the inner (relative) offsets, which saves considerably over a fixed 8-byte field.
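
To make the space argument concrete, here is a small back-of-the-envelope calculation in Python. It is only a sketch: it assumes zigzag varints (the encoding is not spelled out in this section) and offset deltas that run from 0 to N-1 within a message set. The function names are hypothetical.

```python
from typing import Tuple


def varint_size(n: int) -> int:
    """Bytes needed to store signed integer n as a zigzag varint (assumed encoding)."""
    u = ((n << 1) ^ (n >> 63)) & 0xFFFFFFFFFFFFFFFF
    size = 1
    while u >= 0x80:
        u >>= 7
        size += 1
    return size


def offset_bytes(num_messages: int) -> Tuple[int, int]:
    """Return (fixed, varint) byte totals for the per-message offset fields."""
    fixed = 8 * num_messages  # one int64 offset per message
    delta = sum(varint_size(i) for i in range(num_messages))  # OffsetDelta 0..N-1
    return fixed, delta


def pid_bytes(num_messages: int) -> Tuple[int, int]:
    """Return (per-message, per-set) byte totals for storing the int64 PID."""
    return 8 * num_messages, 8


if __name__ == "__main__":
    print(offset_bytes(100))
    print(pid_bytes(100))
```

For a message set of 100 messages under these assumptions, the offsets shrink from 800 bytes to 136 bytes, and the PID from 800 bytes to 8 bytes.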

...