...

MessageSet => 
  FirstOffset => int64
  Length => int32
  CRC => int32
  Magic => int8  /* bump up to “2” */
  Attributes => int16
  LastOffsetDelta => int32 {NEW}
  FirstTimestamp => int64 {NEW}
  MaxTimestampDelta => int64 {NEW}
  PID => int64 {NEW}
  Epoch => int16 {NEW}
  FirstSequence => int32 {NEW}
  Messages => Message1, Message2, … , MessageN {NEW}

Message => {ALL FIELDS NEW}
  Length => uintVar
  Attributes => int8
  TimestampDelta => intVar
  OffsetDelta => uintVar
  KeyLen => uintVar [OPTIONAL]
  Key => data [OPTIONAL]
  Value => data [OPTIONAL]
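
The per-message layout above can be exercised with a small sketch. Several details here are assumptions rather than part of the proposal as quoted: `uintVar` and `intVar` are taken to be base-128 varints (with zigzag mapping for the signed case), `Length` is taken to count the bytes that follow the length field, and the optional key and value are always present. The helper names are hypothetical.

```python
from typing import Tuple

def encode_uvarint(n: int) -> bytes:
    """Assumed uintVar encoding: 7 bits per byte, low-order group first."""
    if n < 0:
        raise ValueError("uintVar cannot be negative")
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)  # high bit set: more bytes follow
        n >>= 7
    out.append(n)
    return bytes(out)

def decode_uvarint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, position of the next unread byte)."""
    result, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result, pos
        shift += 7

def encode_varint(n: int) -> bytes:
    """Assumed intVar encoding: zigzag keeps small negative deltas short."""
    return encode_uvarint((n << 1) ^ (n >> 63))

def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    u, pos = decode_uvarint(buf, pos)
    return (u >> 1) ^ -(u & 1), pos

def encode_message(timestamp_delta: int, offset_delta: int,
                   key: bytes, value: bytes, attributes: int = 0) -> bytes:
    body = bytearray()
    body.append(attributes & 0xFF)            # Attributes => int8
    body += encode_varint(timestamp_delta)    # TimestampDelta => intVar
    body += encode_uvarint(offset_delta)      # OffsetDelta => uintVar
    body += encode_uvarint(len(key))          # KeyLen => uintVar
    body += key                               # Key => data
    body += value                             # Value => data (no length field)
    return encode_uvarint(len(body)) + bytes(body)  # Length comes first

def decode_message(buf: bytes, pos: int = 0) -> Tuple[dict, int]:
    length, body_start = decode_uvarint(buf, pos)
    end = body_start + length
    attributes = buf[body_start]
    timestamp_delta, p = decode_varint(buf, body_start + 1)
    offset_delta, p = decode_uvarint(buf, p)
    key_len, p = decode_uvarint(buf, p)
    key = buf[p:p + key_len]
    value = buf[p + key_len:end]  # value size is implied by Length
    fields = {"attributes": attributes, "timestamp_delta": timestamp_delta,
              "offset_delta": offset_delta, "key": key, "value": value}
    return fields, end

def skip_message(buf: bytes, pos: int = 0) -> int:
    """Jump to the next message without touching the key or value."""
    length, body_start = decode_uvarint(buf, pos)
    return body_start + length

msg = encode_message(timestamp_delta=-3, offset_delta=2, key=b"k", value=b"hello")
decoded, next_pos = decode_message(msg)
```

Because the length is the first field, a reader can call `skip_message` to step over a message it does not need, and the value is recovered as whatever remains after the key.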

...

  1. Having easy access to the offset of the first message allows us to stream messages to the user on demand. In the existing format, we only know the last offset in each message set, so we have to read the messages fully into memory in order to compute the offset of the first message to be returned to the user.

  2. As before, the message set header has a fixed size. This is important because it allows us to do in-place offset/timestamp assignment on the broker before writing to disk.

  3. We have removed the per-message CRC in this format. We hesitated initially to do so because of its use in some auditing applications for end-to-end validation. The problem is that it is not safe, even currently, to assume that the CRC seen by the producer will match that seen by the consumer. One case where it is not preserved is when the topic is configured to use the log append time. Another is when messages need to be up-converted prior to appending to the log. For these reasons, and to conserve space and save computation, we have removed the CRC and deprecated client usage of these fields.

  4. The message set CRC covers the header and message data. Alternatively, we could let it cover only the header, but if compressed data is corrupted, then decompression may fail with obscure errors. Additionally, that would require us to add the message-level CRC back to the message.

  5. Individual messages within a message set have their full size (including header, key, and value) as the first field. This is designed to make deserialization efficient. As we do for the message set itself, we can read the size from the input stream, allocate memory accordingly, and do a single read up to the end of the message. This also makes it easier to skip over the messages if we are looking for a particular one, which potentially saves us from copying the key and value.

  6. We have not included a field for the size of the value in the message schema since it can be computed directly using the message size and the length of the header and key.

  7. We have used a variable length integer to represent timestamps. Our approach is to let the first message

Space Comparison

As the batch size increases, the overhead of the new format grows smaller compared to the old format because of the eliminated redundancy. The overhead per message in the old format is fixed at 34 bytes. For the new format, the message set overhead is 53 bytes, while per-message overhead ranges from 6 to 25 bytes. This makes it more costly to send individual messages, but space is quickly recovered with even modest batching. For example, assuming a fixed message size of 1K with 100 byte keys and reasonably close timestamps, the overhead increases by only 7 bytes for each additional batched message (2 bytes for the message size, 1 byte for attributes, 2 bytes for timestamp delta, 1 byte for offset delta, and 1 byte for key size):

| Batch Size | Old Format Overhead (bytes) | New Format Overhead (bytes) |
|---|---|---|
| 1 | 34*1 = 34 | 53 + 1*7 = 60 |
| 3 | 34*3 = 102 | 53 + 3*7 = 74 |
| 10 | 34*10 = 340 | 53 + 10*7 = 123 |
| 50 | 34*50 = 1700 | 53 + 50*7 = 403 |
| 100 | 34*100 = 3400 | 53 + 100*7 = 753 |
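
The arithmetic behind this comparison can be reproduced with a short sketch. It assumes the fixed field widths implied by the type names in the message set schema (int64 = 8 bytes, int32 = 4, int16 = 2, int8 = 1) and the 7-byte per-message overhead from the 1K-message example; the function names are hypothetical.

```python
# Fixed-width header fields of the message set, in bytes, per the schema.
HEADER_FIELDS = {
    "FirstOffset": 8, "Length": 4, "CRC": 4, "Magic": 1, "Attributes": 2,
    "LastOffsetDelta": 4, "FirstTimestamp": 8, "MaxTimestampDelta": 8,
    "PID": 8, "Epoch": 2, "FirstSequence": 4,
}
NEW_SET_HEADER = sum(HEADER_FIELDS.values())  # one header per batch

OLD_PER_MESSAGE = 34  # fixed per-message overhead in the old format

# Per-message overhead in the example: 2 (size) + 1 (attributes)
# + 2 (timestamp delta) + 1 (offset delta) + 1 (key size).
NEW_PER_MESSAGE = 2 + 1 + 2 + 1 + 1

def old_overhead(batch_size: int) -> int:
    return OLD_PER_MESSAGE * batch_size

def new_overhead(batch_size: int) -> int:
    return NEW_SET_HEADER + NEW_PER_MESSAGE * batch_size

def break_even_batch_size() -> int:
    """Smallest batch size at which the new format has less overhead."""
    n = 1
    while new_overhead(n) >= old_overhead(n):
        n += 1
    return n

for n in (1, 3, 10, 50, 100):
    print(n, old_overhead(n), new_overhead(n))
```

Under these assumptions the new format is already smaller at a batch size of 2 (67 bytes versus 68), which is what "even modest batching" amounts to in this example.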

 

Compatibility, Deprecation, and Migration Plan

...