Enable Metrics
To publish the metrics as an HTTP endpoint, uncomment the metrics servlet in web.xml.
See: /openmeetings-web/src/main/webapp/WEB-INF/web.xml#L110
View your metrics in Prometheus
The step above enables an HTTP endpoint that publishes the metrics. You can then use Prometheus (or other tools) to read them in and graph them.
The easiest way is to point a Prometheus instance running in a Docker container at the endpoint. To start the container:
docker run --rm -it -p 9090:9090 -v /Users/Sebastian.wagner/Documents/mywork/openmeetings/_REPO/copy-files/ :/etc/prometheus/prometheus.yml prom/prometheus
Example local-prometheus.yml file to reference in the docker command above (it points to a locally running OpenMeetings instance)
Change 192.168.1.66:5080 to the address of your locally running OpenMeetings instance. The graphs and queries below were produced using this Docker container.
What kind of metrics are we collecting
The majority of the metrics collected (especially the ones below) are of the Prometheus type "histogram": https://prometheus.io/docs/practices/histograms/
A histogram automatically tracks both:
- Duration (called $metricName_sum)
- Count (called $metricName_count)
At first it may not seem trivial to calculate the duration of individual calls, but it makes sense once you review the examples below. What we are interested in is the "rate per min" and similar statistics: not an individual call, but a rate or average over a certain period of time.
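To make the two series concrete, here is a minimal sketch in plain Java of what a histogram keeps track of. It is not the Prometheus client library: the class name MiniHistogram and the bucket bounds are made up for illustration, and it only assumes the usual histogram layout of cumulative buckets plus a running sum and count.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a Prometheus histogram; NOT the client library. */
public class MiniHistogram {
    private final double[] upperBounds; // bucket upper bounds in seconds, ascending
    private final long[] bucketCounts;  // cumulative: observations <= upperBounds[i]
    private double sum;                 // would be exposed as $metricName_sum
    private long count;                 // would be exposed as $metricName_count

    public MiniHistogram(double... upperBounds) {
        this.upperBounds = upperBounds.clone();
        Arrays.sort(this.upperBounds);
        this.bucketCounts = new long[this.upperBounds.length];
    }

    /** Record one call that took the given number of seconds. */
    public void observe(double seconds) {
        for (int i = 0; i < upperBounds.length; i++) {
            if (seconds <= upperBounds[i]) {
                bucketCounts[i]++; // every bucket that covers the value is incremented
            }
        }
        sum += seconds;
        count++;
    }

    public double getSum() { return sum; }
    public long getCount() { return count; }
    public long getBucketCount(int i) { return bucketCounts[i]; }

    public static void main(String[] args) {
        MiniHistogram h = new MiniHistogram(0.1, 0.5, 1.0);
        h.observe(0.05);
        h.observe(0.3);
        h.observe(2.0);
        // Individual durations are gone; only the aggregate survives.
        System.out.println("count   = " + h.getCount());
        System.out.println("sum     = " + h.getSum());
        System.out.println("average = " + h.getSum() / h.getCount());
    }
}
```

The individual call durations are not stored, which is why the queries further down always work with sum and count over a time window.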
How and what to measure in metrics in OpenMeetings
There are roughly 4 categories of Metrics enabled in OpenMeetings
Tomcat statistics
Basic generic Tomcat metrics, see: /openmeetings-web/src/main/java/org/apache/openmeetings/web/util/logging/TomcatGenericExports.java
Example active_threads
HTTP request metrics -including Web Service calls- via ServletFilter
All Servlet calls, including the Web Service calls, are available automatically in Prometheus. They are collected by a ServletFilter and published as metrics.
Once they are collected in Prometheus you can filter all statistics and run queries, for example on long-running calls or on size.
Example metrics count web service calls
The query below returns the total count of all web service calls. It uses a regular-expression match on the path label to select the web service calls; you could filter further if required.
webapp_metrics_filter_count{path=~"/openmeetings/services/.+"}
Example metrics web service calls duration over 1 min time window and graphed
To get the duration of those calls, take the rate of the sum over a certain time window and divide it by the rate of the count over the same time window. The result is the average duration within a 1 min time window for each of the calls. You can graph it, and you can also adjust the time window.
rate(webapp_metrics_filter_sum{path=~"/openmeetings/services/.+"}[1m]) / rate(webapp_metrics_filter_count{path=~"/openmeetings/services/.+"}[1m])
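The division above can be checked by hand. The following plain Java sketch uses made-up sample values for two scrapes that are 60 seconds apart. The helper names are hypothetical, and the rate here is a simplified "increase divided by elapsed seconds"; the real rate() in Prometheus also handles counter resets and extrapolation.

```java
public class AverageDuration {
    /** Simplified per-second rate: increase of a counter between two scrapes / elapsed seconds. */
    static double rate(double earlier, double later, double windowSeconds) {
        return (later - earlier) / windowSeconds;
    }

    /** Average duration per call inside the window: rate(sum) / rate(count). */
    static double averageDuration(double sumEarlier, double sumLater,
                                  double countEarlier, double countLater,
                                  double windowSeconds) {
        double countRate = rate(countEarlier, countLater, windowSeconds);
        if (countRate == 0.0) {
            return Double.NaN; // no calls in the window, so there is no average
        }
        return rate(sumEarlier, sumLater, windowSeconds) / countRate;
    }

    public static void main(String[] args) {
        // Made-up values: _sum grew from 12 s to 15 s, _count from 100 to 120 calls.
        double avg = averageDuration(12.0, 15.0, 100, 120, 60);
        System.out.println("average seconds per call in the window: " + avg);
    }
}
```

With these numbers 20 calls took 3 seconds in total, so the average is about 0.15 seconds per call. The window length cancels out in the division, but it still decides which calls are included.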
Application metrics based on annotations in Spring Beans
For Spring Beans I've added an annotation that uses spring-aop to intercept the start and end of the method (see: /openmeetings-util/src/main/java/org/apache/openmeetings/util/logging/PrometheusAspect.java).
So for any Spring Bean, you can just annotate the method with one of the 2 annotations I created:
- @TimedDatabase
- @TimedApplication
I created two because that way you can filter them and measure database queries separately. (There would be a way to use JDBC interceptors, but it's quite complicated with OpenJPA.)
See for example some annotated methods in UserDao: /openmeetings-db/src/main/java/org/apache/openmeetings/db/dao/user/UserDao.java#L626
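To illustrate the idea of intercepting the start and end of an annotated method, here is a sketch in plain Java. It is not the code in PrometheusAspect.java: it swaps spring-aop for a JDK dynamic proxy so that it runs without any dependencies. The annotation @Timed, the interface UserLookup and the maps SUM_SECONDS and COUNT are all hypothetical stand-ins.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

public class TimedProxyDemo {
    /** Hypothetical marker annotation, standing in for @TimedDatabase / @TimedApplication. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Timed {
        String type();
    }

    /** Hypothetical DAO-like interface, used only for this illustration. */
    interface UserLookup {
        @Timed(type = "database")
        String findLogin(long id);
    }

    // Keyed by "type/class/method"; these mirror the _sum and _count series.
    static final Map<String, Double> SUM_SECONDS = new HashMap<>();
    static final Map<String, Long> COUNT = new HashMap<>();

    /** Wrap target so that every @Timed interface method is measured. */
    static <T> T timed(Class<T> iface, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            Timed annotation = method.getAnnotation(Timed.class);
            if (annotation == null) {
                return callTarget(target, method, args); // not annotated: no timing
            }
            String key = annotation.type() + "/" + iface.getSimpleName() + "/" + method.getName();
            long start = System.nanoTime();
            try {
                return callTarget(target, method, args);
            } finally {
                // Runs on normal return and on exceptions alike.
                double seconds = (System.nanoTime() - start) / 1e9;
                SUM_SECONDS.merge(key, seconds, Double::sum);
                COUNT.merge(key, 1L, Long::sum);
            }
        };
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler));
    }

    private static Object callTarget(Object target, Method method, Object[] args) throws Throwable {
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the original exception of the target method
        }
    }

    public static void main(String[] args) {
        UserLookup dao = timed(UserLookup.class, id -> "user" + id);
        System.out.println(dao.findLogin(42));
        System.out.println(COUNT);
    }
}
```

The type value of the annotation ends up as part of the key, which is the same idea as the type label used to separate database and application metrics in the queries.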
Example metrics type database duration over 1 min rate
rate(org_openmeetings_metrics_sum{type="database"}[1m]) / rate(org_openmeetings_metrics_count{type="database"}[1m])
You can select different series via the legend at the bottom.
Example metrics type application durations 1 min rate
rate(org_openmeetings_metrics_sum{type="application"}[1m]) / rate(org_openmeetings_metrics_count{type="application"}[1m])
Application metrics based on manual metric
Unfortunately, not all places in the code are Spring beans. Sometimes we also have inner classes or similar issues.
For those cases I've added a Util that you can call at the start and end of the method invocation to create another metric.
At the start
Histogram.Timer timer = PrometheusUtil.getHistogram() //
		.labels("RoomPanel", "onInitialize", "application").startTimer();
try {
At the end
} finally {
	timer.observeDuration();
}
Example - narrow down the filter of the metric in order to just plot this method
rate(org_openmeetings_metrics_sum{type="application",class="RoomPanel",method="onInitialize"}[1m]) / rate(org_openmeetings_metrics_count{type="application",class="RoomPanel",method="onInitialize"}[1m])