How to use it?
(1) Start the compose/ozone cluster.
(2) Open the /prof endpoint on any of the web UIs. For example, use http://localhost:9876/prof for profiling Ozone.
(3) Set the kernel parameters as described on the page (and reopen the /prof page):
sudo su -
echo 1 > /proc/sys/kernel/perf_event_paranoid
echo 0 > /proc/sys/kernel/kptr_restrict
(4) Wait for 10 seconds and check the results in the browser.
(5) What does it show?
The bottom line represents 100% of the execution time. You can see how this 100% of the time is shared between the different method...
You can click on any of the segments to show the details of that specific method execution.
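Before reopening the /prof page, it can help to confirm that the kernel settings from step (3) are in place. The following is a minimal sketch; kernel_params_ok is a hypothetical helper name, and it only checks for exactly the two values shown in step (3), so other values are reported as not matching.

```shell
# Hypothetical helper: succeeds only if both settings have exactly the
# values written in step (3): perf_event_paranoid=1 and kptr_restrict=0.
kernel_params_ok() {
  paranoid="$1"   # content of /proc/sys/kernel/perf_event_paranoid
  kptr="$2"       # content of /proc/sys/kernel/kptr_restrict
  [ "$paranoid" = "1" ] && [ "$kptr" = "0" ]
}

# Usage on a Linux host:
#   if kernel_params_ok "$(cat /proc/sys/kernel/perf_event_paranoid)" \
#                       "$(cat /proc/sys/kernel/kptr_restrict)"; then
#     echo "kernel parameters already match step (3)"
#   else
#     echo "run the commands from step (3) as root"
#   fi
```

The helper takes the values as arguments instead of reading /proc itself, so it can be tried on any machine.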
URL parameters
- To collect 1 minute CPU profile of current process and output in tree format (html): http://localhost:9876/prof?output=tree&duration=60
- To collect heap allocation profile of current process (svg): http://localhost:9876/prof?event=alloc
- To collect lock contention profile of current process: http://localhost:9876/prof?event=lock
- The following event types are supported (default is 'cpu'). NOTE: not all operating systems support all events.
  - cpu
  - page-faults
  - context-switches
  - cycles
  - instructions
  - cache-references
  - cache-misses
  - branches
  - branch-misses
  - bus-cycles
  - L1-dcache-load-misses
  - LLC-load-misses
  - dTLB-load-misses
  - mem:breakpoint
  - trace:tracepoint
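The URL parameters above can be combined freely, so a small helper may be convenient when profiling repeatedly. This is only a sketch: prof_url is a hypothetical function name, and it simply joins whatever key=value pairs it is given without checking that the servlet supports them.

```shell
# Hypothetical helper: builds a /prof URL from a host, a port and any number
# of key=value query parameters (event, duration and output are the ones
# shown in the examples above).
prof_url() {
  host="$1"; port="$2"; shift 2
  url="http://${host}:${port}/prof"
  sep="?"
  for param in "$@"; do
    url="${url}${sep}${param}"
    sep="&"
  done
  printf '%s\n' "$url"
}

# Prints: http://localhost:9876/prof?output=tree&duration=60
prof_url localhost 9876 output=tree duration=60
```

The printed URL can then be opened in the browser like the examples above.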
How to use it with my own cluster?
(1) Download async-profiler.
(2) Define the path of the async profiler with an environment variable
ASYNC_PROFILER_HOME=/opt/profiler
(3) Enable the /prof servlet:
<property><name>hdds.profiler.endpoint.enabled</name><value>true</value></property>
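A quick sanity check of step (2) might look like the sketch below. profiler_home_ok is a hypothetical helper; it only verifies that the path is an existing, readable directory and makes no assumption about which files async-profiler ships inside it.

```shell
# Hypothetical pre-flight check: succeeds only if the given path is a
# non-empty string that names an existing, readable directory.
profiler_home_ok() {
  dir="$1"
  [ -n "$dir" ] && [ -d "$dir" ] && [ -r "$dir" ]
}

# Usage, with the path from step (2):
#   export ASYNC_PROFILER_HOME=/opt/profiler
#   profiler_home_ok "$ASYNC_PROFILER_HOME" \
#     || echo "ASYNC_PROFILER_HOME does not point to a directory" >&2
```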
How does it work?
It's based on a command-line profiler: async-profiler. This is a sampling profiler: it collects many thread dumps over time and estimates the CPU time of each method statistically from those samples.
As a result:
- Running it for longer gives more accurate timings
- The overhead is negligible (no instrumentation)
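The statistical idea can be illustrated with a toy example. The sketch below is not how async-profiler is implemented; it only shows how per-method shares fall out of counting samples. sample_share is a hypothetical helper, and the method names in the example are made up.

```shell
# Toy illustration of sampling: reads one sampled method name per line on
# stdin and prints "<percent of samples> <method>", largest share first.
sample_share() {
  LC_ALL=C awk 'NF { count[$0]++; total++ }
       END { for (m in count) printf "%.1f %s\n", 100 * count[m] / total, m }' \
    | LC_ALL=C sort -rn
}

# Four samples, three of which landed in the (made-up) method readChunk.
# Prints:
#   75.0 readChunk
#   25.0 writeChunk
printf 'readChunk\nreadChunk\nreadChunk\nwriteChunk\n' | sample_share
```

With only four samples the shares are coarse; collecting more samples narrows the estimate, which is why longer runs are more accurate.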
The profiler itself can be used from the command line without any magic, but we adopted a helper servlet from Hive to make it easier to run. The /prof servlet executes the profiler and displays the result when it has finished.
How to troubleshoot?
(1) The executed command is displayed by the /prof servlet. Try to execute it manually.
(2) Check the output directory (/tmp/prof-output/).
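For the second check, it may be enough to see whether anything recent was written to the output directory. A minimal sketch follows; latest_profile is a hypothetical helper, and it makes no assumption about how the servlet names its output files.

```shell
# Hypothetical helper: prints the name of the most recently modified entry
# in the given directory (default: /tmp/prof-output), or nothing if the
# directory is empty or missing.
latest_profile() {
  dir="${1:-/tmp/prof-output}"
  ls -t "$dir" 2>/dev/null | head -n 1
}

# Usage:
#   latest_profile                   # newest entry in /tmp/prof-output
#   latest_profile /some/other/dir   # newest entry in another directory
```

No output suggests that the profiler command never produced a result, which points back to running the displayed command manually.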
Is it safe?
Only for development and test environments, not for production.