...

RocksDB has functionality to collect statistics about its operations in order to monitor running RocksDB instances. These statistics enable users to find bottlenecks and to tune RocksDB accordingly. RocksDB's statistics can be accessed programmatically via JNI, or RocksDB can be configured to periodically dump them to disk. Although RocksDB provides this functionality, Kafka Streams does not currently expose RocksDB's statistics in its metrics. Hence, users need to implement Streams' RocksDBConfigSetter to fetch the statistics. This KIP proposes to expose a subset of the most useful RocksDB statistics in the metrics of Kafka Streams.
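
As an illustration of the status quo, the following is a minimal sketch of such a RocksDBConfigSetter, assuming RocksJava's Statistics and Options#setStatistics; the class name and the way the statistics would be reported are hypothetical:

    import java.util.Map;
    import org.apache.kafka.streams.state.RocksDBConfigSetter;
    import org.rocksdb.Options;
    import org.rocksdb.Statistics;

    // Hypothetical example: enable RocksDB statistics for a store via the existing
    // config-setter hook; reading and reporting the statistics is left to the user.
    public class StatisticsConfigSetter implements RocksDBConfigSetter {

        @Override
        public void setConfig(final String storeName, final Options options, final Map<String, Object> configs) {
            final Statistics statistics = new Statistics();
            options.setStatistics(statistics);
            // hand "statistics" over to a user-managed reporter, keyed by storeName
        }
    }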

...

  • bytes-written-rate [bytes/s]
  • bytes-written-total [bytes]
  • bytes-read-rate [bytes/s]
  • bytes-read-total [bytes]
  • bytes-flushed-rate [bytes/s]
  • bytes-flushed-total [bytes]
  • flush-time-(avg|min|max) [ms]
  • memtable-hit-rate
  • block-cache-data-hit-rate
  • bytes-read-compaction-rate [bytes/s]
  • bytes-written-compaction-rate [bytes/s]
  • compaction-time-(avg|min|max) [ms]
  • write-stall-duration-(avg|total) [ms]
  • num-open-files
  • num-file-errors-total
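
For illustration, a sketch of how the metrics listed above could be read from a running application, assuming they are exposed through KafkaStreams#metrics() like existing metrics; the group name "stream-state-metrics" used below is an assumption:

    import java.util.Map;
    import org.apache.kafka.common.Metric;
    import org.apache.kafka.common.MetricName;
    import org.apache.kafka.streams.KafkaStreams;

    // Illustrative only: print bytes-written-rate for all state stores of an instance.
    public static void printBytesWrittenRate(final KafkaStreams streams) {
        for (final Map.Entry<MetricName, ? extends Metric> entry : streams.metrics().entrySet()) {
            final MetricName name = entry.getKey();
            if ("stream-state-metrics".equals(name.group()) && "bytes-written-rate".equals(name.name())) {
                System.out.println(name.tags() + " -> " + entry.getValue().metricValue());
            }
        }
    }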

Proposed Changes

In this section, I will explain the meaning of the metrics listed in the previous section and why I chose them. Generally, I tried to choose metrics that are useful independently of the specific configuration of the RocksDB instances. Furthermore, I tried to keep the number of metrics at a minimum, because adding metrics in the future is easier than deleting them from a backward-compatibility point of view.

bytes-written-(rate|total)

...

When data is read from RocksDB, the memtable is consulted first to find the data. This metric measures the number of hits with respect to the number of all lookups into the memtable. Hence, the formula for this metric is hits/(hits + misses).

A low memtable-hit-rate for a given workload might indicate that the memtable is too small.
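
For illustration, this is how such a hit rate could be computed from RocksDB's tickers (ticker names from RocksJava; this is not necessarily how the KIP will implement it):

    import org.rocksdb.Statistics;
    import org.rocksdb.TickerType;

    // Illustrative only: memtable-hit-rate as hits / (hits + misses),
    // based on RocksDB's MEMTABLE_HIT and MEMTABLE_MISS tickers.
    public static double memtableHitRate(final Statistics statistics) {
        final double hits = statistics.getTickerCount(TickerType.MEMTABLE_HIT);
        final double misses = statistics.getTickerCount(TickerType.MEMTABLE_MISS);
        return hits + misses == 0 ? 0.0 : hits / (hits + misses);
    }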

block-cache-data-hit-rate

If data is not found in the memtable, the block cache is consulted. This metric measures the number of hits for data blocks with respect to the number of all lookups for data blocks into the block cache. The formula for this metric is equivalent to the one for memtable-hit-rate.

A low block-cache-data-hit-rate for a given workload might indicate that the block cache is too small.
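
Analogously to the sketch above (using the same Statistics and TickerType imports), this rate could be derived from the data-block tickers of the block cache; again, this is only an illustration:

    // Illustrative only: same formula, based on BLOCK_CACHE_DATA_HIT and BLOCK_CACHE_DATA_MISS.
    public static double blockCacheDataHitRate(final Statistics statistics) {
        final double hits = statistics.getTickerCount(TickerType.BLOCK_CACHE_DATA_HIT);
        final double misses = statistics.getTickerCount(TickerType.BLOCK_CACHE_DATA_MISS);
        return hits + misses == 0 ? 0.0 : hits / (hits + misses);
    }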

bytes-read-compaction-rate, bytes-written-compaction-rate, and compaction-time-(avg|min|max)

...

The metrics should help to identify compactions as bottlenecks.
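
For reference, a sketch of the raw RocksDB counters these metrics could be derived from (ticker and histogram names from RocksJava; the mapping shown here is an assumption, not the KIP's implementation):

    import org.rocksdb.HistogramData;
    import org.rocksdb.HistogramType;
    import org.rocksdb.Statistics;
    import org.rocksdb.TickerType;

    // Illustrative only: compaction-related counters as exposed by RocksDB.
    public static void logCompactionStats(final Statistics statistics) {
        final long bytesRead = statistics.getTickerCount(TickerType.COMPACT_READ_BYTES);
        final long bytesWritten = statistics.getTickerCount(TickerType.COMPACT_WRITE_BYTES);
        final HistogramData compactionTime = statistics.getHistogramData(HistogramType.COMPACTION_TIME);
        System.out.println("compaction bytes read/written: " + bytesRead + "/" + bytesWritten
            + ", avg compaction time: " + compactionTime.getAverage());
    }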

write-stall-duration-(avg|total)

As explained above, from time to time RocksDB flushes data from the memtable to disk and reorganises data on disk with compactions. Flushes and compactions might stall writes to the database, i.e., writes need to wait until these processes finish. These metrics measure the average and total duration of such write stalls.

If flushes and compactions happen too often and stall writes, this duration will increase and signal a bottleneck.
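
For illustration, the total stall duration could be obtained from RocksDB's STALL_MICROS ticker (using the same Statistics and TickerType imports as the sketches above; not necessarily the KIP's implementation):

    // Illustrative only: total time writes were stalled, converted from microseconds to milliseconds.
    public static double totalWriteStallMillis(final Statistics statistics) {
        return statistics.getTickerCount(TickerType.STALL_MICROS) / 1000.0;
    }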

...

Part of the data in RocksDB is kept in files. These files need to be opened and closed. The metric num-open-files measures the number of currently open files and the metric num-file-errors-total measures the total number of file errors that occurred. Both metrics may help to find issues related to the OS and the file system.
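
As an illustration, these are counters such metrics could be based on (ticker names from RocksJava; deriving the open-file count from opens minus closes is only an approximation and not necessarily the KIP's implementation):

    // Illustrative only: file-related counters exposed by RocksDB.
    public static long numOpenFiles(final Statistics statistics) {
        return statistics.getTickerCount(TickerType.NO_FILE_OPENS)
            - statistics.getTickerCount(TickerType.NO_FILE_CLOSES);
    }

    public static long numFileErrorsTotal(final Statistics statistics) {
        return statistics.getTickerCount(TickerType.NO_FILE_ERRORS);
    }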


Compatibility, Deprecation, and Migration Plan

Since metrics are only added and no other metrics are modified, this KIP should not

...

  • Metrics bytes-read-compaction-total and bytes-written-compaction-total did not seem useful to me since they would measure bytes moved between memory and disk due to compaction. The metric bytes-flushed-total gives at least a feeling about the size of the persisted data in the RocksDB instance.
  • The number of timed-out writes would