...

Code Block
baseOffset: int64
batchLength: int32 
partitionLeaderEpoch: int32
magic: int8 (current magic value is 2) // <-- will bump to 3
crc: int32
attributes: int16
    bit 0~2:
        0: no compression
        1: gzip
        2: snappy
        3: lz4
        4: zstd
    bit 3: timestampType
    bit 4: isTransactional (0 means not transactional)
    bit 5: isControlBatch (0 means not a control batch)
    bit 6: hasDeleteHorizonMs (0 means baseTimestamp is not set as the delete horizon for compaction)
    // newly added attribute below
    bit 7: ignoreMessageAttributes (0 means not to ignore)
    bit 8~15: unused

producerId: int64
producerEpoch: int16
baseSequence: int32
lastOffsetDelta: int32 // <-- will change to varint
baseTimestamp: int64 // <-- will change to varlong
maxTimestamp: int64 // <-- semantics will change to maxTimestampDelta, and the type to varint
records: [Record]
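
As an illustration of the attribute layout above, here is a minimal sketch of how a reader might decode the `attributes` field. The class and method names are hypothetical and are not taken from the Kafka code base; only the bit positions come from the listing above, and bit 7 reflects the proposed `ignoreMessageAttributes` flag.

```java
// Hypothetical helper: names are illustrative, bit positions follow the layout above.
final class BatchAttributes {
    static final int COMPRESSION_MASK = 0x07;               // bits 0~2
    static final int TIMESTAMP_TYPE_MASK = 0x08;            // bit 3
    static final int TRANSACTIONAL_MASK = 0x10;             // bit 4
    static final int CONTROL_MASK = 0x20;                   // bit 5
    static final int DELETE_HORIZON_MASK = 0x40;            // bit 6
    static final int IGNORE_MESSAGE_ATTRIBUTES_MASK = 0x80; // bit 7 (proposed)

    // Returns the compression codec id: 0 none, 1 gzip, 2 snappy, 3 lz4, 4 zstd.
    static int compressionCodec(short attributes) {
        return attributes & COMPRESSION_MASK;
    }

    static boolean isTransactional(short attributes) {
        return (attributes & TRANSACTIONAL_MASK) != 0;
    }

    static boolean isControlBatch(short attributes) {
        return (attributes & CONTROL_MASK) != 0;
    }

    static boolean hasDeleteHorizonMs(short attributes) {
        return (attributes & DELETE_HORIZON_MASK) != 0;
    }

    static boolean ignoreMessageAttributes(short attributes) {
        return (attributes & IGNORE_MESSAGE_ATTRIBUTES_MASK) != 0;
    }

    public static void main(String[] args) {
        // zstd-compressed, transactional batch with the proposed bit 7 set
        short attrs = (short) (4 | TRANSACTIONAL_MASK | IGNORE_MESSAGE_ATTRIBUTES_MASK);
        System.out.println("codec=" + compressionCodec(attrs));
        System.out.println("transactional=" + isTransactional(attrs));
        System.out.println("ignoreMessageAttributes=" + ignoreMessageAttributes(attrs));
    }
}
```

Bits 8~15 stay unused in this sketch, so a reader that masks only the bits it knows about keeps working if more flags are added later.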


Furthermore, the record batch header can also be made smaller. I'd like to shrink it by:

...

3. lastOffsetDelta: change the type from int32 to varint. As above, in most cases the offset delta is small, so changing to varint will save space.

4. Move the above 3 fields to the end of the batch header.
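
To get a feel for the saving, the following rough sketch compares the fixed-width size of the three fields (4 + 8 + 8 = 20 bytes) with their variable-length size. It assumes the fields would use zigzag-encoded varints, as Kafka's existing record-level varint fields do; the proposal text here does not spell out the encoding, so treat that as an assumption. The class name and sample values are hypothetical.

```java
// Hypothetical size estimator; assumes zigzag + base-128 varint encoding.
final class HeaderSizeSketch {
    // Bytes needed for an unsigned varint: each byte carries 7 payload bits.
    static int sizeOfUnsignedVarlong(long value) {
        int bytes = 1;
        while ((value & ~0x7FL) != 0) {
            bytes++;
            value >>>= 7;
        }
        return bytes;
    }

    // Zigzag maps values of small magnitude (positive or negative) to small unsigned values.
    static int sizeOfVarint(int value) {
        int zigzag = (value << 1) ^ (value >> 31);
        return sizeOfUnsignedVarlong(zigzag & 0xFFFFFFFFL);
    }

    static int sizeOfVarlong(long value) {
        return sizeOfUnsignedVarlong((value << 1) ^ (value >> 63));
    }

    // Current layout: lastOffsetDelta int32 + baseTimestamp int64 + maxTimestamp int64.
    static int fixedSize() {
        return 4 + 8 + 8;
    }

    // Proposed layout: varint + varlong + varint (maxTimestamp stored as a delta).
    static int variableSize(int lastOffsetDelta, long baseTimestamp, int maxTimestampDelta) {
        return sizeOfVarint(lastOffsetDelta)
                + sizeOfVarlong(baseTimestamp)
                + sizeOfVarint(maxTimestampDelta);
    }

    public static void main(String[] args) {
        // Sample batch: 100 records, epoch-millis base timestamp, 50 ms timestamp spread.
        int proposed = variableSize(99, 1_700_000_000_000L, 50);
        System.out.println("fixed=" + fixedSize() + " bytes, proposed=" + proposed + " bytes");
    }
}
```

For this sample batch the three fields shrink from 20 bytes to 9. The base timestamp dominates because an epoch-millisecond value still needs 6 bytes as a varlong.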

Note: 

The largest value that fits in a 2-byte unsigned varint is 16383 (2^14 - 1).
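
The 16383 figure follows from each varint byte carrying 7 payload bits. The short sketch below derives the limit for a given byte count. It also shows the range under zigzag encoding, which Kafka's signed varint uses; whether the header fields would be written signed or unsigned is an assumption here, and the class name is hypothetical.

```java
// Hypothetical helper for varint value ranges; valid for 1 to 8 bytes.
final class VarintRange {
    // Largest unsigned value that fits in `bytes` varint bytes (7 payload bits each).
    static long maxUnsigned(int bytes) {
        return (1L << (7 * bytes)) - 1;
    }

    // With zigzag encoding the unsigned range is split between positive and negative values.
    static long maxZigzag(int bytes) {
        return maxUnsigned(bytes) >>> 1;
    }

    static long minZigzag(int bytes) {
        return -maxZigzag(bytes) - 1;
    }

    public static void main(String[] args) {
        for (int bytes = 1; bytes <= 3; bytes++) {
            System.out.println(bytes + " byte(s): unsigned max=" + maxUnsigned(bytes)
                    + ", zigzag range=[" + minZigzag(bytes) + ", " + maxZigzag(bytes) + "]");
        }
    }
}
```

Under the zigzag assumption, a 2-byte varint covers -8192 to 8191, so an offset delta or timestamp delta above 8191 would spill into a third byte.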

...