...

and likewise for the other operator-level end-to-end latencies. This represents the age of the record at the time it was processed by operator O. The task-level end-to-end (e2e) latency L will be computed based on the sink node, i.e. L = LI, the operator-level e2e latency of the sink node. The source node e2e latency, measured when reading from the user input topics, therefore represents the consumption latency: the time it took for a newly-created event to be read by Streams. This can be especially interesting in cases where some records may be severely delayed: for example by an IoT device with an unstable network connection, or when a user's smartphone reconnects to the internet after a flight and pushes all the latest updates. On the other hand, the sink node e2e latency, which is also the task-level e2e latency, reveals how long it takes for a record to be fully processed through that subtopology. If the task is the final one in the full topology, this is the full end-to-end latency: the time it took for a record to be fully processed through Streams.
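
The following is a minimal sketch of the latency definition above, under the assumption that a record's e2e latency at an operator is simply its age at processing time: the wall-clock time at which the operator handles the record minus the record's event timestamp. The class and method names are illustrative only and are not part of the Streams API.

```java
public final class E2eLatencySketch {

    /** Operator-level e2e latency L_O: the age of the record when operator O processes it. */
    static long operatorE2eLatencyMs(final long recordTimestampMs, final long processedAtMs) {
        return processedAtMs - recordTimestampMs;
    }

    public static void main(final String[] args) {
        final long recordTimestamp = 1_000L;  // event time assigned when the record was created
        final long consumedAt      = 1_250L;  // source node: consumption latency = 250 ms
        final long sinkProcessedAt = 1_400L;  // sink node: task-level e2e latency = 400 ms

        System.out.println(operatorE2eLatencyMs(recordTimestamp, consumedAt));      // 250
        System.out.println(operatorE2eLatencyMs(recordTimestamp, sinkProcessedAt)); // 400
    }
}
```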

Note that for a given record, LO <= LA <= L whenever operator A is downstream of operator O. This holds true within and across subtopologies: for a given record, a downstream subtopology will always have a task-level end-to-end latency greater than or equal to that of an upstream subtopology (which in turn implies the same holds true for the statistical measures exposed via the new metrics). Comparing the e2e latency across tasks (or across operators) will also be of interest, as this difference represents the processing delay: the amount of time it took for Streams to actually process the record from point A to point B within the topology, as the sketch below illustrates.
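
Below is a hedged sketch of how such a comparison could be done from the new metrics. KafkaStreams#metrics() is an existing API; however, the metric name "record-e2e-latency-max" and the "task-id" tag key used for filtering are assumptions made for illustration and may not match the final metric names in this proposal.

```java
import java.util.Map;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.streams.KafkaStreams;

public final class ProcessingDelayExample {

    /** Returns the value of the first matching e2e latency metric for the given task, or NaN. */
    static double e2eLatencyMax(final KafkaStreams streams, final String taskId) {
        final Map<MetricName, ? extends Metric> metrics = streams.metrics();
        for (final Map.Entry<MetricName, ? extends Metric> entry : metrics.entrySet()) {
            final MetricName name = entry.getKey();
            if ("record-e2e-latency-max".equals(name.name())        // assumed metric name
                    && taskId.equals(name.tags().get("task-id"))) { // assumed tag key
                return ((Number) entry.getValue().metricValue()).doubleValue();
            }
        }
        return Double.NaN;
    }

    /** Processing delay between an upstream and a downstream task; non-negative by the inequality above. */
    static double processingDelayMs(final KafkaStreams streams,
                                    final String upstreamTaskId,
                                    final String downstreamTaskId) {
        return e2eLatencyMax(streams, downstreamTaskId) - e2eLatencyMax(streams, upstreamTaskId);
    }
}
```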

...

  1. Averages do not seem to convey any particularly useful information
  2. We couldn't agree on which one to use
  3. https://youtu.be/lJ8ydIuPFeU?t=786