...

Therefore it is important to understand the read I/O pattern, i.e. how far away from the tail of the log consumers are fetching. This determines how much physical RAM should be allocated to the page cache in order to achieve low-latency, high-throughput reads. To size broker memory, it would be useful to have stats on consumer lag in terms of bytes as well as time.

The read operations are impacted by two kinds of lag:

  • time lag: how far back in time the fetched offset is, based on the event time provided by the producer
  • byte lag: the size of the data that has been produced after the offset the consumer fetch is requesting

Note: in many use cases consumers are expected to fetch at the tail end of the log.

...

  • ConsumerFetchLagTimeInMs: histogram that measures the fetch lag using the timestamps of the messages being fetched
  • ConsumerFetchLagBytes: histogram that measures the fetch lag by calculating the byte lag of the messages being fetched
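A minimal sketch of what the two proposed metrics would record on each fetch. The histogram class and all names below are illustrative stand-ins, not the broker's actual metrics plumbing (the broker would use its existing metrics library):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy stand-in for a metrics histogram, just to show what each proposed
// metric would record per consumer fetch (illustrative, not broker code).
class ToyHistogram {
    private final List<Long> samples = new ArrayList<>();
    void update(long value) { samples.add(value); }
    long max() { return Collections.max(samples); }
    int count() { return samples.size(); }
}

public class FetchLagMetrics {
    static final ToyHistogram consumerFetchLagTimeInMs = new ToyHistogram();
    static final ToyHistogram consumerFetchLagBytes = new ToyHistogram();

    // Called once per consumer fetch with the lags computed at the log layer.
    static void recordFetch(long timeLagMs, long byteLag) {
        consumerFetchLagTimeInMs.update(timeLagMs);
        consumerFetchLagBytes.update(byteLag);
    }

    public static void main(String[] args) {
        recordFetch(0L, 0L);              // fetch at the tail: both lags are zero
        recordFetch(5_000L, 1_048_576L);  // lagging consumer: 5 s / 1 MiB behind
        System.out.println(consumerFetchLagTimeInMs.max()); // 5000
        System.out.println(consumerFetchLagBytes.count());  // 2
    }
}
```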

Byte lag: the relative position of the fetched offset within its segment is determined from the fetch info returned by LogSegment.read(); the sizes of all newer segments are then added to determine the lag.
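The byte lag computation can be sketched as follows. The method and parameter names are hypothetical, and the inputs (the position within the segment and the list of segment sizes) stand in for what the broker would obtain from the fetch info and the segment list:

```java
import java.util.List;

// Illustrative sketch (not the actual broker code): byte lag is the
// remaining bytes of the segment being read plus the sizes of all newer
// segments.
public class ByteLagSketch {
    /**
     * @param segmentSizes      sizes in bytes of the segment containing the
     *                          fetched offset and all newer segments, oldest
     *                          first (index 0 = segment being read)
     * @param positionInSegment byte position of the fetched offset within
     *                          that segment (from the fetch info)
     */
    static long byteLag(List<Long> segmentSizes, long positionInSegment) {
        long lag = segmentSizes.get(0) - positionInSegment; // rest of current segment
        for (int i = 1; i < segmentSizes.size(); i++) {     // O(N) sum over newer segments
            lag += segmentSizes.get(i);
        }
        return lag;
    }

    public static void main(String[] args) {
        // Fetch at byte 800 of a 1000-byte segment, with two newer segments.
        System.out.println(byteLag(List.of(1000L, 1024L, 512L), 800L)); // 1736
    }
}
```

Note the loop over newer segments is the O(N) sum called out in the assumptions below; for a tail fetch the list contains only the active segment and the loop body never runs.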

Time lag: the batch's maxTimestamp is used to determine the fetched offset's timestamp, and activeSegment.maxTimestampSoFar is used as the reference for the tail of the log.
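The time lag computation then reduces to a difference of the two event times. A minimal sketch, with hypothetical names for the two timestamps described above:

```java
// Illustrative sketch (not the actual broker code): time lag is the active
// segment's max timestamp so far minus the max timestamp of the batch being
// fetched. Both are producer-provided event times.
public class TimeLagSketch {
    static long timeLagMs(long activeSegmentMaxTimestampSoFar,
                          long fetchedBatchMaxTimestamp) {
        // Clamp at zero: user-provided timestamps may be out of order.
        return Math.max(0L, activeSegmentMaxTimestampSoFar - fetchedBatchMaxTimestamp);
    }

    public static void main(String[] args) {
        // Fetched batch is 60 s of event time behind the newest batch.
        System.out.println(timeLagMs(1_700_000_060_000L, 1_700_000_000_000L)); // 60000
    }
}
```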

The changes have been made at the Kafka log layer, and the following assumptions have been made:

  • Measuring the byte lag requires summing the sizes of all newer segments, an O(N) operation; the number of such segments is assumed to be small (or zero) in the common case, since the majority of consumer fetches are at the tail end (recent data).

  • Event times are used for the lag measurement, which has the side effect of depending on the accuracy of user-provided timestamp information.

Compatibility, Deprecation, and Migration Plan

...