...

Current state: Under Discussion

Discussion thread: here [Change the link from the KIP proposal email archive to your own email thread]

JIRA: If the idea is approved, I will create the Jira here [Change the link from KAFKA-1 to your own ticket]


Motivation

The main motivation is to have a clear metric (regardless of the OS) to see when the produce requests become "async". In a normal situation, the produce requests are written to disk via the lib -> syscall -> etc. path; as we know, the data first ends up in a memory page (a "dirty page" from now on).

...

I can experiment with the dirty_ratio and dirty_background_ratio values.


Rejected Alternatives

The best alternative IMHO would be to get the information before "the disaster happens", so at the OS level we can check nr_dirty and nr_dirty_threshold.

nr_dirty is the current number of dirty pages, and nr_dirty_threshold is the limit at which the OS blocks writes to pages until some are flushed.

Tracking the relation between the two could give us a hint when we are getting closer to the limit, so we can add more resources or tune the OS settings.

This is possible as an "in house" metric, but not for Kafka itself, as it runs in the JVM and only god knows on which OS :)
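As a rough illustration of such an in-house check, here is a minimal Python sketch, assuming a Linux host whose /proc/vmstat exposes the nr_dirty and nr_dirty_threshold counters. The function names (parse_vmstat, dirty_usage, read_dirty_usage) are hypothetical and not part of Kafka or of this proposal.

```python
from pathlib import Path


def parse_vmstat(text):
    """Parse /proc/vmstat-style "name value" lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def dirty_usage(counters):
    """Return nr_dirty / nr_dirty_threshold, or None if either is unavailable."""
    dirty = counters.get("nr_dirty")
    limit = counters.get("nr_dirty_threshold")
    if dirty is None or not limit:
        return None
    return dirty / limit


def read_dirty_usage(path="/proc/vmstat"):
    """Read the counters from the given file (assumed Linux-only location)."""
    return dirty_usage(parse_vmstat(Path(path).read_text()))


if __name__ == "__main__":
    try:
        usage = read_dirty_usage()
    except OSError:
        usage = None  # not Linux, or /proc is not readable
    if usage is None:
        print("dirty page counters not available on this host")
    else:
        print(f"dirty pages at {usage:.1%} of the blocking threshold")
```

A value close to 1.0 would mean the host is near the point where writers get blocked. The sketch also shows why this does not translate into a Kafka metric: it depends on a Linux-specific file that the JVM does not abstract.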