Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

So as a starter lets enable verbose:gc on the tests runs and monitor the CPU usage additional monitoring can be added to the suite but the first step is to have the ability to gather the data.

Table of Contents
minlevel3

Data collection

Current goals are to focus on the Java broker by collecting CPU usage (measured via top -p), and GC data as written by the JVM with verbose gc logging. This same information could be gathered for the client JVM however it would require modification to the existing scripts to log the data.

...

The GC log file has a number of types of entries:

No Format
title"Successful Full GC"
0.503: [Full GC (System) 0.503: [CMS: 0K->1454K(63872K), 0.0617910 secs] 8321K->1454K(83008K), [CMS Perm : 10933K->10925K(21248K)], 0.0619320 secs]
No Format
title"ParNew run"
9.351: [GC 9.351: [ParNew: 19119K->2112K(19136K), 0.0141410 secs] 50757K->42135K(83008K), 0.0142560 secs]
No Format
title"The CMS phases"
9.366: [GC [1 CMS-initial-mark: 40023K(63872K)] 42272K(83008K), 0.0016310 secs]
9.367: [CMS-concurrent-mark-start]
9.407: [CMS-concurrent-mark: 0.040/0.040 secs]
9.407: [CMS-concurrent-preclean-start]
9.408: [CMS-concurrent-preclean: 0.001/0.001 secs]
9.408: [CMS-concurrent-abortable-preclean-start]
10.495: [CMS-concurrent-abortable-preclean: 0.113/1.088 secs]
10.498: [CMS-concurrent-sweep-start]
10.508: [CMS-concurrent-sweep: 0.010/0.010 secs]
10.509: [CMS-concurrent-reset-start]
10.517: [CMS-concurrent-reset: 0.008/0.008 secs]
No Format
title"Failed Full GC"
357.779: [Full GC 357.779: [CMS357.885: [CMS-concurrent-abortable-preclean: 0.199/0.990 secs]
 (concurrent mode failure): 961425K->13192K(962444K), 0.2641230 secs] 963545K->13192K(981580K), [CMS Perm : 17808K->17777K(29804K)], 0.2649910 secs]
No Format
title"YG Rescan"
10.496: [GC[YG occupancy: 2151 K (19136 K)]10.496: [Rescan (parallel) , 0.0016840 secs]10.497: [weak refs processing, 0.0007330 secs] [1 CMS-remark: 259108K(260864K)] 261259K(280000K), 0.0025130 secs]

...

There is a risk here that top does not accurately log at the specified rate. However if logging is performed at sub second intervals the effort in extracting the time data from top and calculating the millisecond value for the log entry is not believed to be worth the effort.

Result Presentation

The results of the GC and CPU can be put through gnuplot and graphed here is an example of what it might look like:

Image Added
Image Added
Image Added