Status

...

Page properties


Discussion thread

...

...

td49097.html
Vote thread
JIRA

...

FLINK-13550

...

Release 1.13


Motivation

It is desirable to provide better visibility into the distribution of CPU resources while executing user code. One of the most visually effective ways to do that is a Flame Graph. Flame Graphs make it easy to answer questions such as:

  • Which methods are currently consuming CPU resources?

  • How does the consumption of one method compare to that of the others?

  • Which series of calls on the stack led to executing a particular method?



Flame Graphs are constructed by sampling stack traces a number of times. Every method call is represented by a bar, and the length of the bar is proportional to the number of samples in which that call appears.
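The aggregation described above can be sketched as follows. This is a minimal illustration, not the implementation proposed here: the class FlameGraphSketch, its Node type, and the build and width methods are hypothetical names, and each sample is assumed to list its frames from the outermost caller down to the executing method.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Flink): folds sampled stack traces into a call tree. */
final class FlameGraphSketch {

    /** One bar of the graph: a method at a given stack position and its sample count. */
    static final class Node {
        final String method;
        int hits;
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String method) {
            this.method = method;
        }
    }

    /**
     * Builds the call tree. Assumption: every sample lists frames from the
     * outermost caller down to the method that was executing when sampled.
     */
    static Node build(List<List<String>> samples) {
        Node root = new Node("root");
        for (List<String> sample : samples) {
            root.hits++; // the root counts every sample
            Node current = root;
            for (String frame : sample) {
                // Reuse the child for this frame if it exists, otherwise create it.
                current = current.children.computeIfAbsent(frame, Node::new);
                current.hits++;
            }
        }
        return root;
    }

    /** Relative bar length: the fraction of all samples that contain this node. */
    static double width(Node root, Node node) {
        return root.hits == 0 ? 0.0 : (double) node.hits / root.hits;
    }
}
```

With four samples of which three pass through the same method, that method's bar would span three quarters of the graph width.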

...

in order to retrieve a “live” ExecutionGraph, the proposed implementation can instead use an ArchivedExecutionGraph. It is already available in the web monitor endpoint and can be used directly to locate an operator’s Tasks and their corresponding TaskExecutors. The ThreadInfoRequestCoordinator can therefore be initialized and executed as part of the WebMonitorEndpoint instead of adding non-core functionality to the JobManagerSharedServices.

Call flow is illustrated by the following sequence diagram:

(Image: sequence diagram of the call flow)

A new method is added to the TaskExecutorGateway interface:

public interface TaskExecutorGateway extends RpcGateway, TaskExecutorOperatorEventGateway {
    /**
     * Request a thread info sample from the given task.
     *
     * @param taskExecutionAttemptId identifying the task to sample
     * @param requestId of the sample
     * @param numSubSamples to take from the given task
     * @param delayBetweenSamples to wait for
     * @param requestParams parameters of the request
     * @param timeout of the request
     * @return Future of stack trace sample response
     */
    CompletableFuture<TaskThreadInfoResponse> requestThreadInfoSamples(
            ExecutionAttemptID taskExecutionAttemptId,
            int requestId,
            int numSubSamples,
            Time delayBetweenSamples,
            ThreadInfoSamplesRequest requestParams,
            Time timeout);
}
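
To illustrate what numSubSamples and delayBetweenSamples could mean on the receiving side, the sketch below takes several samples of one thread with a fixed delay between them and completes a future once all of them are collected. It is an assumption-laden illustration, not the TaskExecutor code: SubSampleSketch and its methods are hypothetical, and only the JDK's ThreadMXBean and ScheduledExecutorService are used.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Flink): samples one thread several times with a delay. */
final class SubSampleSketch {

    /** The first sample is taken on the calling thread, later ones on the executor. */
    static CompletableFuture<List<ThreadInfo>> sample(
            Thread thread,
            int numSubSamples,
            long delayMillis,
            int maxDepth,
            ScheduledExecutorService executor) {
        if (numSubSamples <= 0) {
            throw new IllegalArgumentException("numSubSamples must be positive");
        }
        CompletableFuture<List<ThreadInfo>> result = new CompletableFuture<>();
        takeOne(thread, numSubSamples, delayMillis, maxDepth, executor, new ArrayList<>(), result);
        return result;
    }

    private static void takeOne(
            Thread thread,
            int remaining,
            long delayMillis,
            int maxDepth,
            ScheduledExecutorService executor,
            List<ThreadInfo> collected,
            CompletableFuture<List<ThreadInfo>> result) {
        try {
            // getThreadInfo returns null when the thread is no longer alive.
            ThreadInfo info =
                    ManagementFactory.getThreadMXBean().getThreadInfo(thread.getId(), maxDepth);
            if (info != null) {
                collected.add(info);
            }
            if (info == null || remaining <= 1) {
                result.complete(collected);
                return;
            }
            // Schedule the next sub-sample instead of blocking a thread with sleep().
            executor.schedule(
                    () -> takeOne(thread, remaining - 1, delayMillis, maxDepth,
                            executor, collected, result),
                    delayMillis,
                    TimeUnit.MILLISECONDS);
        } catch (Throwable t) {
            result.completeExceptionally(t);
        }
    }
}
```

Scheduling each follow-up sample rather than sleeping keeps the sampling thread free between sub-samples, which matters if many tasks are sampled at once.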

Stack traces are collected and transferred as part of ThreadInfo objects, which contain additional information, such as ThreadState. This makes it possible to implement off-CPU Flame Graphs in addition to on-CPU Flame Graphs.
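
As a rough sketch of how the thread state could separate the two kinds of graph, the example below obtains a java.lang.management.ThreadInfo through the JDK's ThreadMXBean and classifies its state. The class ThreadStateSketch and the mapping of states to on-CPU and off-CPU are assumptions made for illustration: RUNNABLE is treated as on-CPU, which is a simplification because a RUNNABLE thread may still be blocked in native I/O.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;

/** Hypothetical helper (not part of Flink): classifies samples by thread state. */
final class ThreadStateSketch {

    /** Assumption: a RUNNABLE thread is counted as consuming CPU. */
    static boolean isOnCpu(Thread.State state) {
        return state == Thread.State.RUNNABLE;
    }

    /** Assumption: blocked or waiting threads are counted as off-CPU. */
    static boolean isOffCpu(Thread.State state) {
        return state == Thread.State.BLOCKED
                || state == Thread.State.WAITING
                || state == Thread.State.TIMED_WAITING;
    }

    /** Takes one sample of the given thread; returns null if the thread is not alive. */
    static ThreadInfo sample(long threadId, int maxDepth) {
        return ManagementFactory.getThreadMXBean().getThreadInfo(threadId, maxDepth);
    }
}
```

A sample whose state satisfies isOnCpu would feed the on-CPU graph, and one satisfying isOffCpu the off-CPU graph; NEW and TERMINATED threads fall into neither group.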

...