StreamsMetrics
Status
Current state: Under Discussion
...
For the above purposes, we want to 1) clean up the Streams Built-in Metrics so that more useful metrics are available out of the box, while trimming the non-useful ones, because the semantics of the current APIs are hard to infer from their names (this proposal includes removing some redundant APIs as well as refactoring the parent-child metrics relationships; details below), and 2) improve the APIs for User Customized Metrics that let users register their own metrics, based on the "operationName / scopeName / entityName" notions; we would simplify this interface to fit users' needs and make sure it functions correctly.
...
| | LEVEL 0: Per-Client | LEVEL 1: Per-Thread | LEVEL 2: Per-Task | LEVEL 3: Per-Processor-Node | LEVEL 3: Per-State-Store | LEVEL 3: Per-Cache |
|---|---|---|---|---|---|---|
| TAGS | type=stream-metrics,client-id=[client-id] | type=stream-thread-metrics,thread-id=[threadId] (! tag name changed) | type=stream-task-metrics,thread-id=[threadId],task-id=[taskId] (! tag name changed) | type=stream-processor-node-metrics,thread-id=[threadId],task-id=[taskId],processor-node-id=[processorNodeId] (! tag name changed) | type=stream-state-metrics,thread-id=[threadId],task-id=[taskId],[storeType]-state-id=[storeName] (! tag name changed) | type=stream-record-cache-metrics,thread-id=[threadId],task-id=[taskId],record-cache-id=[storeName] (! tag name changed) |
| version \| commit-id (static gauge) | INFO ($) | | | | | |
| application-id (static gauge) | INFO ($) | | | | | |
| topology-description (static gauge) | INFO ($) | | | | | |
| state (dynamic gauge) | INFO ($) | | | | | |
| process-latency (avg \| max) | | INFO | DEBUG | (! removed for now) | | |
| process (rate \| total) | | INFO | DEBUG ( → ) on source-nodes only | DEBUG | | |
| punctuate-latency (avg \| max) | | INFO | DEBUG | | | |
| punctuate (rate \| total) | | INFO | DEBUG | | | |
| commit-latency (avg \| max) | | INFO | DEBUG | | | |
| commit (rate \| total) | | INFO | DEBUG | | | |
| poll-latency (avg \| max) | | INFO | | | | |
| poll (rate \| total) | | INFO | | | | |
| task-created \| closed (rate \| total) | | INFO | | | | |
| active-process-ratio (dynamic gauge) | | | INFO ($) (percentage of time the hosting thread is spending with this active task) | | | |
| standby-process-ratio (dynamic gauge) | | | INFO ($) (percentage of time the hosting thread is spending with this standby task) | | | |
| dropped-records (rate \| total) | | | INFO ($) (number of records dropped within this task due to all kinds of scenarios) | | | |
| enforced-processing (rate \| total) | | | DEBUG | | | |
| record-lateness (avg \| max) | | | DEBUG | | | |
| suppression-emit (rate \| total) | | | | DEBUG * (suppress processor only) | | |
| suppression-buffer-size (avg \| max) | | | | | DEBUG * (suppression buffer only) | |
| suppression-buffer-count (avg \| max) | | | | | DEBUG * (suppression buffer only) | |
| (put \| put-if-absent .. \| get)-latency (avg \| max) | | | | | DEBUG * (excluding suppression buffer) (! name changed) | |
| (put \| put-if-absent .. \| get) (rate) | | | | | DEBUG * (excluding suppression buffer) (! name changed) | |
| hit-ratio (avg \| min \| max) | | | | | | DEBUG (! name changed) |
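
To make the TAGS row more concrete, the following is a minimal sketch of how the task-level and processor-node-level tag sets could be assembled and rendered as a JMX object name. It uses only the JDK and is not Kafka code: the class `StreamsMetricTags` and its methods are hypothetical, the `kafka.streams` domain passed in `main` is an assumption about how a JMX reporter would expose these metrics, and the thread, task, and node ids are made-up sample values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper, for illustration only; not part of the Kafka Streams API.
public class StreamsMetricTags {

    // LEVEL 2 (per-task) tags, as listed in the TAGS row.
    public static Map<String, String> taskLevelTags(String threadId, String taskId) {
        Map<String, String> tags = new LinkedHashMap<>(); // keeps insertion order
        tags.put("type", "stream-task-metrics");
        tags.put("thread-id", threadId);
        tags.put("task-id", taskId);
        return tags;
    }

    // LEVEL 3 (per-processor-node) tags: the task tags plus the node id.
    public static Map<String, String> processorNodeTags(String threadId, String taskId, String nodeId) {
        Map<String, String> tags = taskLevelTags(threadId, taskId);
        tags.put("type", "stream-processor-node-metrics"); // overwrites the value, keeps the position
        tags.put("processor-node-id", nodeId);
        return tags;
    }

    // Renders the tags as a JMX ObjectName under the caller-supplied domain.
    // Values are quoted so that characters such as ':' or ',' remain legal.
    public static ObjectName toObjectName(String domain, Map<String, String> tags) {
        StringBuilder name = new StringBuilder(domain).append(':');
        String separator = "";
        for (Map.Entry<String, String> tag : tags.entrySet()) {
            name.append(separator)
                .append(tag.getKey())
                .append('=')
                .append(ObjectName.quote(tag.getValue()));
            separator = ",";
        }
        try {
            return new ObjectName(name.toString());
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid metric tags: " + tags, e);
        }
    }

    public static void main(String[] args) {
        // Sample ids are illustrative, not taken from a real application.
        Map<String, String> tags =
            processorNodeTags("app-StreamThread-1", "0_1", "KSTREAM-SOURCE-0000000000");
        System.out.println(toObjectName("kafka.streams", tags));
    }
}
```

Because each level reuses the tags of its parent and only adds one more id, a monitoring tool could select all metrics of a task by matching on `thread-id` and `task-id` regardless of the `type` value.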
...
When users override it to "2.3" or below, the old metrics names / tags will still be used.
...