Troubleshooting CPU
Description
When an application becomes CPU-bound, it no longer processes interrupts efficiently.
Determination
One way to determine whether a machine or process is CPU-bound is to use an operating system command such as vmstat or top while the application is running.
Operating System Command
vmstat
The vmstat output below shows that the CPUs are 99% idle.
...
top
The top output shows, among other things, the CPU usage percentage. The output below shows that the CPUs are mostly in use (idle=3.0%) and that Java processes are using most of that CPU.
    top
    12:49:24 up 113 days, 23:36, 35 users,  load average: 10.40, 5.20, 2.30
    615 processes: 587 sleeping, 27 running, 1 zombie, 0 stopped
    CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
               total   61.7%    0.0%   31.4%   0.5%     2.5%    0.4%    3.0%
      PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
    22523 user1     15   0 1102M 1.1G 18068 R     3.6 14.1   0:24   1 java
    22778 user1     15   0 1102M 1.1G 18068 R     2.1 14.1   0:02   1 java
    22682 user1     15   0 1102M 1.1G 18068 R     1.4 14.1   0:07   1 java
    22698 user1     15   0 1102M 1.1G 18068 R     1.4 14.1   0:10   0 java
    19286 user1     15   0 1100M 1.1G 18080 R     0.5 14.1   0:25   0 java
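As a rough illustration of how to read the CPU states rows in the top output, the following Python sketch pairs each column name with its percentage and adds up the non-idle, non-iowait columns. The helper names are hypothetical, and the two-row layout is assumed to match the older top format shown on this page; newer versions of top print the summary differently.

```python
# Sketch: derive CPU utilization from the "CPU states" rows of the top
# output on this page. Helper names are hypothetical, and the layout is
# assumed to be the older two-row top format.

HEADER = "cpu user nice system irq softirq iowait idle"
TOTALS = "total 61.7% 0.0% 31.4% 0.5% 2.5% 0.4% 3.0%"

# Columns that represent the CPU actually executing something.
BUSY_COLUMNS = ("user", "nice", "system", "irq", "softirq")


def parse_cpu_states(header, totals):
    """Pair each column name with its percentage, skipping the row labels."""
    names = header.split()[1:]    # drop the leading "cpu" label
    values = totals.split()[1:]   # drop the leading "total" label
    return {name: float(value.rstrip("%")) for name, value in zip(names, values)}


def cpu_busy_percent(states):
    """Sum the busy columns; iowait and idle are left out on purpose."""
    return round(sum(states[name] for name in BUSY_COLUMNS), 1)


if __name__ == "__main__":
    states = parse_cpu_states(HEADER, TOTALS)
    print("busy:   %.1f%%" % cpu_busy_percent(states))
    print("iowait: %.1f%%" % states["iowait"])
    print("idle:   %.1f%%" % states["idle"])
```

For the values shown, the busy share comes to roughly 96%, which matches the reading that the CPUs are mostly in use. Keeping iowait separate is a design choice: time spent waiting on I/O points to a disk or network bottleneck rather than a CPU-bound application.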
vsd
Another way to determine whether a machine or process is CPU-bound is to use vsd to display the active CPU and process CPU time values contained in a given Geode statistics archive.
LinuxSystemStats
The chart below shows LinuxSystemStats cpuActive values. This machine is CPU-bound.
VMStats
The chart below shows VMStats processCpuTime values.
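Because processCpuTime is a cumulative counter, its slope is more informative than its absolute value. The Python sketch below shows one way to turn two samples into a utilization percentage. It assumes the statistic is reported in nanoseconds and that the wall-clock interval between samples is known; the function name, the sample values and the CPU count are made up for illustration.

```python
# Sketch: turn two samples of a cumulative CPU-time counter into a
# utilization percentage. Assumes the counter is in nanoseconds; the
# function name and the sample numbers below are illustrative only.

NANOS_PER_SECOND = 1_000_000_000


def cpu_utilization_percent(cpu_ns_start, cpu_ns_end, wall_seconds, cpu_count=1):
    """Percentage of the available CPU capacity used between two samples."""
    if wall_seconds <= 0 or cpu_count <= 0:
        raise ValueError("wall_seconds and cpu_count must be positive")
    cpu_seconds = (cpu_ns_end - cpu_ns_start) / NANOS_PER_SECOND
    return 100.0 * cpu_seconds / (wall_seconds * cpu_count)


if __name__ == "__main__":
    # Hypothetical samples taken 10 seconds apart on a 4-CPU machine:
    # the process consumed 38 CPU-seconds during that interval.
    pct = cpu_utilization_percent(120 * NANOS_PER_SECOND,
                                  158 * NANOS_PER_SECOND,
                                  wall_seconds=10, cpu_count=4)
    print("process CPU utilization: %.1f%%" % pct)
```

A value that stays close to 100% across consecutive intervals suggests the process is using nearly all of the CPU capacity on the machine.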
gfsh
The gfsh show metrics command can be used to show the active CPU (cpuUsage) of a member. An example is:
...
    show metrics --member=server1 --categories=member

    Member Metrics

    Category | Metric          | Value
    -------- | --------------- | -----------------
    member   | upTime          | 14
             | cpuUsage        | 84.13760375976562
             | currentHeapSize | 2200
             | maximumHeapSize | 81250
Action
Determining that there is a CPU issue is one thing; finding its source is another. One approach is to dump the JVM threads using the operating system kill -3 command, as shown below. The command sends SIGQUIT to the process, and a HotSpot JVM responds by writing a thread dump to its standard output. These dumps show how many threads there are and what each thread is doing. Application issues can often be found by examining them.
...
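Once a dump has been captured, a quick summary of thread counts and states can help decide where to look first. The following Python sketch counts the threads in a dump and groups them by Java thread state. It assumes the HotSpot dump format; SAMPLE_DUMP is a made-up, shortened dump, and summarize_dump is a hypothetical helper name.

```python
# Sketch: summarize a HotSpot-style thread dump (as produced by kill -3).
# The dump format is assumed to be HotSpot's; SAMPLE_DUMP is a made-up,
# shortened example and summarize_dump is a hypothetical helper name.
import re
from collections import Counter

SAMPLE_DUMP = '''\
"main" #1 prio=5 os_prio=0 tid=0x00007f0c0800a000 nid=0x5801 runnable [0x00007f0c0f5fe000]
   java.lang.Thread.State: RUNNABLE
        at com.example.Worker.spin(Worker.java:42)

"Pooled Waiting Thread" #23 daemon prio=5 os_prio=0 tid=0x00007f0c08123000 nid=0x5820 waiting on condition [0x00007f0bd01fe000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)

"VM Thread" os_prio=0 tid=0x00007f0c08071000 nid=0x5805 runnable
'''

# Every thread entry starts with its quoted name at the beginning of a line.
THREAD_HEADER = re.compile(r'^"([^"]+)"', re.MULTILINE)
# Java threads carry a state line; JVM-internal threads usually do not.
THREAD_STATE = re.compile(r'java\.lang\.Thread\.State: ([A-Z_]+)')


def summarize_dump(dump_text):
    """Return (thread_count, Counter of Java thread states)."""
    thread_count = len(THREAD_HEADER.findall(dump_text))
    states = Counter(THREAD_STATE.findall(dump_text))
    return thread_count, states


if __name__ == "__main__":
    count, states = summarize_dump(SAMPLE_DUMP)
    print("threads:", count)
    for state, n in states.most_common():
        print("  %-15s %d" % (state, n))
```

For a real dump, pass the text captured from the JVM's standard output instead of SAMPLE_DUMP. A large number of RUNNABLE threads sharing the same stack frames is often a good starting point when looking for the code that is consuming CPU.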