
Broker Statistics

The following statistics are provided as a basic useful minimum.

  1. Peak bytes per second
  2. Current bytes per second
  3. Total bytes
  4. Peak messages per second
  5. Current messages per second
  6. Total messages

These statistics are generated for both delivered and received messages, per connection, aggregated per virtualhost, and for the entire broker. Delivered messages are those that have been sent to a client by the broker, and may be counted more than once due to rollbacks and re-delivery. Received messages are those that are published by a client and received by the broker.

Mechanism

The statistics are generated using a sample counter, which is thread-safe and is triggered on incoming message events. The totals for messages and data (bytes) are recorded as type long and the rates are recorded as double since they may be fractional.

The default sample period is one second, during which events are recorded cumulatively. This can be changed for all counters in the broker by setting the system property qpid.statistics.samplePeriod. Rates are calculated as an average over this sample period, so if the traffic is very bursty, a small sample period will fail to capture events most of the time, and the peak rates will be inflated. Since rates are calculated per second, the sample period defaults to 1000 ms, but if it is increased to, say, 10000 ms then messages will be counted over ten-second periods and the resulting totals divided by ten to give the rate.

Since qpid.statistics.samplePeriod is the cut-off point at which the rates are calculated, stored, and then reset, changing this property should affect performance: increasing the value should improve throughput, and decreasing it should reduce throughput.
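
The averaging described above can be sketched as follows. This is a simplified illustration of the mechanism only, not the actual StatisticsCounter implementation; all names here are hypothetical.

```java
// Simplified sketch of per-sample-period rate calculation: events are
// accumulated during a sample period, then converted to a per-second
// rate (and compared against the peak) when the period rolls over.
public class RateSketch {
    private final long samplePeriod; // milliseconds
    private long periodStart;        // start of the current sample period
    private long periodCount;        // events accumulated in this period
    private double rate;             // rate for the last complete period
    private double peak;             // highest rate seen so far

    public RateSketch(long samplePeriodMs) {
        this.samplePeriod = samplePeriodMs;
    }

    // Register 'value' events (or bytes) at the given timestamp.
    public void registerEvent(long value, long timestamp) {
        if (timestamp - periodStart >= samplePeriod) {
            // Period rollover: normalise the accumulated count to per-second.
            rate = (periodCount * 1000.0) / samplePeriod;
            peak = Math.max(peak, rate);
            periodStart = timestamp;
            periodCount = 0;
        }
        periodCount += value;
    }

    public double getRate() { return rate; }
    public double getPeak() { return peak; }
}
```

With a 10000 ms period, 100 events counted over the period yield a rate of 10 per second, matching the ten-second averaging described above.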

When all statistics generation is enabled, 12 counters are used, since statistics are calculated for the product of the following sets:

{ Broker, Virtualhost, Connection } x { Delivery, Receipt } x { Messages, Data }
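
Enumerating this product directly shows where the twelve counters come from; a small illustrative sketch, not broker code:

```java
import java.util.ArrayList;
import java.util.List;

// Enumerate the product { scope } x { direction } x { type }, which
// yields the twelve statistics counters described above.
public class CounterCombinations {
    public static List<String> combinations() {
        String[] scopes = {"Broker", "Virtualhost", "Connection"};
        String[] directions = {"Delivery", "Receipt"};
        String[] types = {"Messages", "Data"};
        List<String> result = new ArrayList<>();
        for (String scope : scopes)
            for (String direction : directions)
                for (String type : types)
                    result.add(scope + "/" + direction + "/" + type);
        return result; // 3 x 2 x 2 = 12 entries
    }
}
```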

Reporting

If statistics are being generated for the broker or for virtualhosts then periodic reporting can be enabled. This is generated at specified intervals, with the option to reset the statistics after outputting the current data.

[Statistics-Reporting] BRK-1008 : received : 4.395 kB/s peak : 4500 bytes total
[Statistics-Reporting] BRK-1008 : delivered : 4.395 kB/s peak : 4500 bytes total
[Statistics-Reporting] BRK-1009 : received : 45 msg/s peak : 45 msgs total
[Statistics-Reporting] BRK-1009 : delivered : 45 msg/s peak : 45 msgs total
[Statistics-Reporting] VHT-1003 : localhost : received : 1.465 kB/s peak : 1500 bytes total
[Statistics-Reporting] VHT-1003 : localhost : delivered : 1.465 kB/s peak : 1500 bytes total
[Statistics-Reporting] VHT-1004 : localhost : received : 15 msg/s peak : 15 msgs total
[Statistics-Reporting] VHT-1004 : localhost : delivered : 15 msg/s peak : 15 msgs total
[Statistics-Reporting] VHT-1003 : test : received : 0.977 kB/s peak : 1000 bytes total
[Statistics-Reporting] VHT-1003 : test : delivered : 0.977 kB/s peak : 1000 bytes total
[Statistics-Reporting] VHT-1004 : test : received : 10 msg/s peak : 10 msgs total
[Statistics-Reporting] VHT-1004 : test : delivered : 10 msg/s peak : 10 msgs total
[Statistics-Reporting] VHT-1003 : development : received : 1.953 kB/s peak : 2000 bytes total
[Statistics-Reporting] VHT-1003 : development : delivered : 1.953 kB/s peak : 2000 bytes total
[Statistics-Reporting] VHT-1004 : development : received : 20 msg/s peak : 20 msgs total
[Statistics-Reporting] VHT-1004 : development : delivered : 20 msg/s peak : 20 msgs total

Sample statistics report log messages

The reported statistics are output using the operational logging mechanism, and can therefore be extracted from the log files using the same mechanisms as are currently used for other such messages. The new message types are as follows:

  • BRK-1008 - Broker data statistics (peak bytes per second and total bytes)
  • BRK-1009 - Broker message statistics (peak messages per second and total messages)
  • VHT-1003 - Virtualhost data statistics (peak bytes per second and total bytes)
  • VHT-1004 - Virtualhost message statistics (peak messages per second and total messages)
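
The totals in these messages can be pulled out of a log file with a simple pattern. The regular expression below is inferred from the sample report output above and is only an illustration, not an official parser:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Extract the total bytes figure from a BRK-1008 data statistics line.
public class StatsLogParser {
    private static final Pattern BRK_DATA = Pattern.compile(
        "BRK-1008 : (received|delivered) : ([\\d.]+) kB/s peak : (\\d+) bytes total");

    // Returns the total bytes from a matching line, or -1 if no match.
    public static long totalBytes(String line) {
        Matcher m = BRK_DATA.matcher(line);
        return m.find() ? Long.parseLong(m.group(3)) : -1L;
    }
}
```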

Configuration

The statistics generation will be configured via the config.xml broker configuration mechanism.

config.xml
<statistics>
    <generation>
        <!-- default false -->
        <broker>true</broker>
        <virtualhosts>true</virtualhosts>
        <connections>true</connections>
    </generation>
    <reporting>
        <period>3600</period><!-- seconds -->
        <reset>true</reset>
    </reporting>
</statistics>

It is not possible to enable statistics generation for a single virtualhost - all virtualhosts will generate statistics if the //statistics/generation/virtualhosts element is set to true.

It only makes sense to enable reporting if statistics generation is also enabled for the broker or for virtualhosts; the broker will simply ignore the reporting configuration if no statistics are being generated. If the //statistics/reporting/reset element is set to true then, after reporting the statistics in the log, the statistics will be reset to zero for the entire broker.

Even if statistics generation is completely disabled, it is still possible to activate statistics on an individual connection while the broker is running. Setting the JMX attribute statisticsEnabled on a connection MBean to true will start statistics generation, and totals and rates will be calculated from that point onward, until it is set to false again.

Additionally, the following two system properties can be set to configure the statistics counter:

  • qpid.statistics.samplePeriod - sample period in ms - default 1000
  • qpid.statistics.disable - override configuration and disable statistics collection - default false (not set)

These two properties are exposed on the StatisticsCounter class as static fields DEFAULT_SAMPLE_PERIOD and DISABLE_STATISTICS.

Design

StatisticsCounter Class

This implements the counting of event data and generates the total and rate statistics. There should be one instance of this class per type of statistic, such as messages or bytes. The instance methods that are called to add an event are:

  • void registerEvent()
    Registers a single event at the current system time, such as the arrival of a message.
  • void registerEvent(long value)
    Registers a cumulative value, such as the size of a message, at the current system time.
  • void registerEvent(long value, long timestamp)
    Registers a cumulative value at the specified time. The value could be 1, which allows timestamped single events to be registered when the no-argument version is not sufficient.

There are three constructors:

  • StatisticsCounter()
  • StatisticsCounter(String name)
  • StatisticsCounter(String name, long samplePeriod)

These are chained in that order, using a default name of counter and a default sample period of 1000 ms, which can be overridden via the qpid.statistics.samplePeriod property.
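
The chaining might look like the following minimal sketch; the field and accessor names are taken from the description above, but this is not the actual implementation:

```java
// Sketch of the chained StatisticsCounter constructors: default name
// "counter", default sample period taken from the
// qpid.statistics.samplePeriod system property (falling back to 1000 ms).
public class CounterSketch {
    public static final long DEFAULT_SAMPLE_PERIOD =
            Long.getLong("qpid.statistics.samplePeriod", 1000L);

    private final String name;
    private final long samplePeriod;

    public CounterSketch() {
        this("counter");
    }

    public CounterSketch(String name) {
        this(name, DEFAULT_SAMPLE_PERIOD);
    }

    public CounterSketch(String name, long samplePeriod) {
        this.name = name;
        this.samplePeriod = samplePeriod;
    }

    public String getName() { return name; }
    public long getPeriod() { return samplePeriod; }
}
```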

To retrieve the data, there are methods to return the current rate, peak rate and total, as well as the start time, sample period and name of the counter, and also a method to reset the counter.

StatisticsGatherer Interface

This is implemented by the broker business object that is generating statistics. It provides the following methods:

  • void initialiseStatistics()
  • void registerMessageReceived(long messageSize, long timestamp)
  • void registerMessageDelivered(long messageSize)
  • StatisticsCounter getMessageDeliveryStatistics()
  • StatisticsCounter getDataDeliveryStatistics()
  • StatisticsCounter getMessageReceiptStatistics()
  • StatisticsCounter getDataReceiptStatistics()
  • void resetStatistics()
  • void setStatisticsEnabled(boolean enabled)
  • boolean isStatisticsEnabled()

These statistics are exposed using the separate JMX MBean interfaces detailed below, which call these methods to retrieve the underlying StatisticsCounter objects and return their attributes. This interface gives a standard way for parts of the broker to set up and configure statistics generation.

When creating these objects, there should be a parent/child relationship between them, normally from broker to virtualhost to connection. This means that the lowest level gatherer can record statistics (if enabled) and also pass the notification on to its parent object, allowing higher level aggregation of statistics. When resetting statistics, this works in the opposite direction, with higher level gatherers also resetting all of their child objects. Note that this parent/child relationship is never explicitly specified, and is dependent on the implementation of registerMessageDelivered and resetStatistics, in order to allow more flexibility.
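
This propagation can be sketched as follows; the class is hypothetical and only loosely based on the StatisticsGatherer methods listed above:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of parent/child statistics aggregation: events recorded at the
// lowest level (connection) are passed up to the parent (virtualhost,
// then broker), while resets propagate down to all children.
public class GathererSketch {
    private final GathererSketch parent;
    private final List<GathererSketch> children = new ArrayList<>();
    private long messagesDelivered;

    public GathererSketch(GathererSketch parent) {
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    // Record a delivery here, then notify the parent for aggregation.
    public void registerMessageDelivered(long messageSize) {
        messagesDelivered++;
        if (parent != null) {
            parent.registerMessageDelivered(messageSize);
        }
    }

    // Reset this gatherer and, recursively, all of its children.
    public void resetStatistics() {
        messagesDelivered = 0;
        for (GathererSketch child : children) {
            child.resetStatistics();
        }
    }

    public long getMessagesDelivered() { return messagesDelivered; }
}
```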

JMX Interface

The Qpid JMX interface level that supports statistics is 1.9. Each object (MBean) that can generate statistics will add the following attributes and operations:

Operation       | Description
resetStatistics | Reset the statistics counters for this object, and all children.

Attribute                | Permissions            | Description
peakMessageDeliveryRate  | R                      | Peak number of messages delivered per second
peakDataDeliveryRate     | R                      | Peak number of bytes delivered per second
messageDeliveryRate      | R                      | Current number of messages delivered per second
dataDeliveryRate         | R                      | Current number of bytes delivered per second
totalMessagesDelivered   | R                      | Total number of messages delivered since start or reset
totalDataDelivered       | R                      | Total number of bytes delivered since start or reset
peakMessageReceiptRate   | R                      | Peak number of messages received per second
peakDataReceiptRate      | R                      | Peak number of bytes received per second
messageReceiptRate       | R                      | Current number of messages received per second
dataReceiptRate          | R                      | Current number of bytes received per second
totalMessagesReceived    | R                      | Total number of messages received since start or reset
totalDataReceived        | R                      | Total number of bytes received since start or reset
statisticsEnabled        | R (W Connections only) | Whether statistics generation is enabled

The following MBeans have had statistics attributes added:

  • ServerInformation
    The entire broker, statistics for all virtualhosts. The totals for the broker will be equal to the sum of the totals for all virtualhosts. Rates for the broker include messages from all virtualhosts, so the broker peak can be higher than individual virtualhosts, and possibly also higher than the sum, due to slight timing differences. This is to be expected.
    org.apache.qpid:type=ServerInformation,name=ServerInformation,*
  • ManagedBroker
    This is just a misleadingly named virtualhost. Note that the sum of all connections statistics will not always equal the statistics for the virtualhost, as connection MBeans are discarded when the connection is closed.
    org.apache.qpid:type=VirtualHost.VirtualHostManager,VirtualHost=*,*
  • ManagedConnection
    Individual connections, which can be selected on a per-virtualhost basis. The name of a connection is usually the Java toString() representation of the address of the remote internet socket making the connection, so it is usually not possible to determine these in advance. These allow enabling and disabling of statistics generation.
    org.apache.qpid:type=VirtualHost.Connection,VirtualHost=*,name=*

The JMX attributes that record statistics are always present, and will have a value of 0/0.0 if generation is not enabled, and the statisticsEnabled attribute will be set to false.

The qpid.statistics.disable system property will override all configuration on the broker. All attributes will be reported as 0/0.0 and JMX will show statistics as being disabled via the statisticsEnabled property, although it will be possible to selectively enable generation on a per-connection basis.

Usage

The JMX JConsole application can be used to view the attributes, both as discrete values or as graphs. Sample output is shown below, illustrating the virtualhost statistics for a broker with a producer and a consumer attached and sending messages.

[Image: jconsole-statistics-zero.png] Virtualhost statistics attributes at broker startup

[Image: jconsole-statistics-graphs.png] Virtualhost statistics graphs and attributes after sending and receiving messages through the broker

Testing

One caveat related to testing is that these statistics, apart from totals, are by their nature time-dependent and subject to race conditions: messages sent as part of a test can overlap two sample periods instead of one, causing seemingly incorrect results.

Unit

The StatisticsCounter class is suitable for unit testing. Tests cover class creation and operation, checking the total, rate and peak data as well as other class properties. It is difficult to write a test that provably shows thread safety when updating the counter with new events, but there are tests to check that out-of-order events will not cause problems with the data generated.

System

System testing covers the generation of statistics using a running broker, as well as the configuration mechanisms and the operation of the JMX interface to the data and the operational logging output. The following system tests have been written, all using a shared parent test case class, MessageStatisticsTestCase, which sets up the JMX connection to the broker.

In order for these JMX based tests to work correctly with the default test profile, a fix had to be back-ported from trunk which disables the use of a custom ServerSocketFactory in the JMX registry. The property that disables this is set only in the JMXUtils helper class, and should not be used in normal broker operation.

  • MessageStatisticsTest
  • MessageStatisticsDeliveryTest
  • MessageStatisticsReportingTest
  • MessageConnectionStatisticsTest
  • MessageStatisticsConfigurationTest

These tests can be run with the following command, from the systests module directory:
$ ant test -Dtest=Message*Statistics*Test

Performance

To measure performance, a release of 2.6.0.7 was generated, using the standard release mechanism. The data should be compared with the previous [results] for the 2.6.0.6 release. The performance test suite was run on noc-qpiddev02.jpmorganchase.com with the broker running on noc-qpiddev01.jpmorganchase.com. Statistics were enabled on the broker by editing the etc/persistent_config.xml file to set statistics generation to true for the broker, virtualhosts and connections, as well as to report every hour (without resetting the statistics). The Disabled configuration used the provided default configuration file, and the Broker Only configuration used the same as the first set of tests, but with the //statistics/generation/virtualhosts and //statistics/generation/connections elements set to false.

The following command was executed for each of the configurations, and the results are aggregated into the tables below:
$ ./Throughput.sh

Accuracy

The measured standard deviation of these results is 2.2% (about 500) of the mean (about 23000), based on ten sample runs of the TTBT-TX test with no statistics generation enabled. For the TTBT-NA test, a set of 102 test runs (throughput.csv) indicated that the standard deviation was around 0.8% of the mean, with a range of 3.8% of the mean between the minimum and maximum result.

Initial Results

These results use the code from SVN revision r1033077, and were generated on 9 and 10 November 2010.

Throughput Numbers

Test            | Throughput (All) | Throughput (Broker Only) | Throughput (Disabled)
PQBT-AA-Qpid-01 | 2937.3007        | 0                        | 3239.986
PQBT-TX-Qpid-01 | 3142.3297        | 0                        | 3240.4037
PTBT-AA-Qpid-01 | 3817.1709        | 3897.9127                | 3941.363
PTBT-TX-Qpid-01 | 3971.9589        | 0                        | 4087.7404
TQBT-AA-Qpid-01 | 52299.27         | 0                        | 53195.957
TQBT-NA-Qpid-01 | 50979.767        | 48090.19                 | 51696.26
TQBT-TX-Qpid-01 | 7321.2233        | 0                        | 7438.8633
TTBT-AA-Qpid-01 | 86934.654        | 0                        | 92742.66
TTBT-NA-Qpid-01 | 103829.22        | 0                        | 128478.24
TTBT-TX-Qpid-01 | 21685.606        | 0                        | 22985.97

There was no obvious difference between broker only and all statistics, so that testing was discontinued.

Comparison with previous release:

Test            | 2.6.0.6   | 2.6.0.7   | Change  | 2.6.0.7 Disabled | Change (Disabled)
PQBT-AA-Qpid-01 | 3214.728  | 2937.3007 | -8.63%  | 3239.986         | 0.79%
PQBT-TX-Qpid-01 | 3224.1871 | 3142.3297 | -2.54%  | 3240.4037        | 0.50%
PTBT-AA-Qpid-01 | 4069.4785 | 3817.1709 | -6.20%  | 3941.363         | -3.15%
PTBT-TX-Qpid-01 | 4088.1968 | 3971.9589 | -2.84%  | 4087.7404        | -0.1%
TQBT-AA-Qpid-01 | 51871.49  | 52299.27  | 0.82%   | 53195.957        | 2.55%
TQBT-NA-Qpid-01 | 54278.095 | 50979.767 | -6.08%  | 51696.26         | -4.76%
TQBT-TX-Qpid-01 | 7496.785  | 7321.2233 | -2.34%  | 7438.8633        | -0.77%
TTBT-AA-Qpid-01 | 94392.1   | 86934.654 | -7.90%  | 92742.66         | -1.75%
TTBT-NA-Qpid-01 | 129089.54 | 103829.22 | -19.57% | 128478.24        | -0.47%
TTBT-TX-Qpid-01 | 23134.289 | 21685.606 | -6.26%  | 22985.97         | -0.64%
Average         |           |           | -6.84%  |                  | -0.77%

Analysis

As can be seen, there is an average reduction in throughput of around 7% with all statistics generation enabled, and less than 1% with it disabled. One anomaly that stands out is the TTBT-NA test (transient, non-acknowledged messages), which shows a nearly 20% decrease in throughput and warrants further investigation. This is obviously too great a penalty, and will require code changes and further investigation and testing.

Further testing with different combinations of broker, virtualhost and connection statistics enabled will need to be carried out to determine where the greatest penalty in performance lies. Based on this, there may be changes required to the StatisticsCounter class, or the propagation mechanism that links child and parent counter objects.

It is also important to determine what the maximum acceptable loss of throughput is to a customer. It would seem desirable to have less than a 5% decrease in throughput due to statistics generation; however, it should be noted that a performance penalty is inevitable when adding code to the message delivery and receipt paths in the broker, and the aim is only to minimise this, not to eliminate it.

Further Testing

The 7% performance loss was felt to be unacceptable, so modifications were made to the StatisticsCounter code to reduce the time spent carrying out computations. The following changes were made:

  1. The system property qpid.statistics.samplePeriod was changed from 1000 to 2000 and 5000.
  2. The code in StatisticsCounter was modified to defer more calculation to the end of a sample period.

Throughput Numbers

The same 2.6.0.7 broker was used, with QPID_OPTS set to -Dqpid.statistics.samplePeriod=N, where N is 2000 or 5000.

Test            | 2.6.0.6   | 2.6.0.7 (2000) | Change (2000) | 2.6.0.7 (5000) | Change (5000)
PQBT-AA-Qpid-01 | 3214.728  | 3198.1702      | -0.52%        | 2970.78        | -7.59%
PQBT-TX-Qpid-01 | 3224.1871 | 3214.087       | -0.31%        | 2875.1144      | -10.82%
PTBT-AA-Qpid-01 | 4069.4785 | 4030.176       | -0.97%        | 3872.7224      | -4.83%
PTBT-TX-Qpid-01 | 4088.1968 | 4175.8924      | 2.15%         | 3962.1336      | -3.08%
TQBT-NA-Qpid-01 | 54278.095 | 52546.314      | -3.19%        | 48368.572      | -10.89%
TTBT-TX-Qpid-01 | 23134.289 | 22086.33       | -4.53%        | 18947.935      | -18.10%
Average         |           |                | -1.23%        |                | -9.22%

Old is an unmodified 2.6.0.7 broker; New has QPID_OPTS set to -Dqpid.statistics.samplePeriod=5000 and includes changes to the StatisticsCounter class that move computation to the end of the sample period. All three have QPID_JAVA_MEM set to -Xmx2g. Results are the average of three runs.

TTBT-NA-Qpid-01 | Result | Change
2.6.0.6         | 113206 | 0.0%
2.6.0.7 Old     | 103077 | -8.9%
2.6.0.7 New 5s  | 106202 | -6.1%
2.6.0.7 New 10s | 104196 | -8.0%

More Tests

Two 5-minute runs of a simple test, producing and consuming messages as fast as possible in a single thread. This was run on an HP DL585 (quad dual-core 2.2 GHz Opterons, 8 GB), hence the difference in reported rates.

  • Statistics Configuration Enabled
    $ QPID_OPTS=-Dqpid.statistics.samplePeriod=2500 QPID_JAVA_MEM=-Xmx4g ./qpid-server
    3792.9 msgs/s
  • Statistics Configuration Enabled
    $ QPID_OPTS=-Dqpid.statistics.samplePeriod=500 QPID_JAVA_MEM=-Xmx4g ./qpid-server
    3961.8 msgs/s
  • Statistics Configuration Enabled
    $ QPID_OPTS=-Dqpid.statistics.samplePeriod=30000 QPID_JAVA_MEM=-Xmx4g ./qpid-server
    3999.3 msgs/s
  • Statistics Configuration Enabled (overridden by qpid.statistics.disable)
    $ QPID_OPTS=-Dqpid.statistics.disable=true QPID_JAVA_MEM=-Xmx4g ./qpid-server
    4156.3 msgs/s
  • Stock Apache 0.5 Download - http://apache.mirror.anlx.net//qpid/0.5/qpid-java-broker-0.5.tar.gz
    $ QPID_JAVA_MEM=-Xmx4g ./qpid-server
    4068.3 msgs/s

These results would seem to indicate that the act of generating statistics is causing the slowdown, rather than updating the peak value at the end of a sample period. Additionally, the results for the single-threaded testing are much closer, suggesting that lock contention with many connections is an issue.

Profiling

To properly determine hot-spots in the code, a profiling run was made using the YourKit profiler. The simple test used in the previous section was run with the profiler attached, and CPU statistics were collated. The profiler detected the following hotspots:

  • StatisticsCounter.registerEvent - 4%
  • AtomicLong.addAndGet - 3%
  • AtomicLong.compareAndSet - 2%

Since the AtomicLong calls are made inside registerEvent, it seems that this is where the CPU is spending the extra time that causes the slowdown. To mitigate this, the registerEvent method was changed to be synchronized and the internal logic changed to use primitive long types and standard arithmetic operations. After this change, the profiler no longer showed registerEvent as a hotspot in the code when another test run was made.
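
The shape of this change can be sketched as below; this illustrates the synchronized primitive-long approach rather than reproducing the actual patch:

```java
// Sketch of the mitigation: a synchronized method updating primitive
// long fields, replacing per-field AtomicLong operations.
public class SynchronizedCounter {
    private long total;       // running total of events/bytes
    private long periodCount; // events accumulated in the current period

    public synchronized void registerEvent(long value) {
        total += value;
        periodCount += value;
    }

    public synchronized long getTotal() { return total; }
}
```

A single monitor acquisition per event replaces two or more compare-and-set loops, trading lock-free updates for simpler, cheaper arithmetic under the lock.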

The updated code was copied to the broker being used for throughput testing, and the TTBT-NA-Qpid-01 test was re-run, with results shown below (again, an average of three runs was taken).

TTBT-NA-Qpid-01 | Result | Change
2.6.0.6         | 113206 | 0.0%
2.6.0.7 Latest  | 103698 | -8.3%

Finally, the single-threaded test was also run against 2.6.0.6 and the latest 2.6.0.7 build, with the following results:

TTBT-NA-Qpid-01 | Msgs/Second | Change
2.6.0.6         | 1714.5      | 0.0%
2.6.0.7 Latest  | 1660        | -3.2%

In order to test this more fully, the script which starts the TTBT-NA test and the Throughput.sh script itself were copied and modified. A series of scripts, STAT-XX.sh, was created, consisting of identical calls to the TTBT-NA test apart from the number of consumers, which varies from 01 to 16 and is represented by XX, and a Statistics.sh script was created which calls each of these in turn.

The STAT throughput tests were run against a standard 2.6.0.6 broker and the 2.6.0.7 build with all statistics generation enabled. It is expected that the difference in throughput should increase as the number of consumers increases. Note that these tests were only run once so are not as accurate as the averaged, multi-run datasets.

2.6.0.6 Throughput Numbers

$ QPID_JAVA_MEM=-Xmx2g ./qpid-server -c ../etc/persistent_config.xml

Test    | Throughput
STAT-01 | 57415.543
STAT-02 | 92730.82
STAT-03 | 105619.415
STAT-04 | 116192.78
STAT-05 | 119669.87
STAT-06 | 122700.195
STAT-07 | 111541.725
STAT-08 | 126887.405
STAT-09 | 110207.38
STAT-10 | 122272.66
STAT-11 | 123028.33
STAT-12 | 127724.23
STAT-13 | 114467.72
STAT-14 | 128505.86
STAT-15 | 116783.16
STAT-16 | 127264.14

2.6.0.7 Throughput Numbers

$ QPID_OPTS=-Dqpid.statistics.samplePeriod=5000 QPID_JAVA_MEM=-Xmx2g ./qpid-server -c ../etc/persistent_config.xml

Test    | Throughput | Change
STAT-01 | 57101.006  | -0.5%
STAT-02 | 83016.304  | -10.5%
STAT-03 | 92533.17   | -12.4%
STAT-04 | 99060.04   | -14.8%
STAT-05 | 103271.49  | -13.7%
STAT-06 | 103339.02  | -15.8%
STAT-07 | 103619.804 | -7.1%
STAT-08 | 105613.2   | -16.8%
STAT-09 | 104110.99  | -5.5%
STAT-10 | 107108.84  | -12.4%
STAT-11 | 106436.25  | -13.5%
STAT-12 | 108537.865 | -15.0%
STAT-13 | 104056.22  | -9.1%
STAT-14 | 108337.49  | -15.7%
STAT-15 | 106900.96  | -8.5%
STAT-16 | 109222.84  | -14.2%

Although it is difficult to draw any firm conclusions from these results, note that the single-consumer performance degradation is negligible.

A useful further test would be to run the 2.6.0.7 broker with only connection-level statistics being generated. This ought to remove the contention for the lock on the virtualhost and broker counters when messages are being delivered to many consumers.

2.6.0.7 Throughput Numbers

$ QPID_OPTS=-Dqpid.statistics.samplePeriod=5000 QPID_JAVA_MEM=-Xmx2g ./qpid-server -c ../etc/connection_config.xml

||Test||Throughput||Change||
|STAT-01|55884.613|-2.7%|
|STAT-02|91857.41|-0.9%|
|STAT-07|114754.22|2.9%|
|STAT-08|117917.595|-7.1%|
|STAT-09|114506.71|3.9%|
|STAT-10|122330.33|0.0%|
|STAT-11|118661.09|-3.5%|
|STAT-13|116632.744|1.9%|
|STAT-15|118917.79|1.8%|
|STAT-16|120704.68|-5.2%|

Conclusion?

It would appear that there is a penalty of a few percent (up to 4%) for a single thread, rising to over 10% when there are many threads. Further testing will be required to determine how performance varies with the number of connections (i.e. threads). This is, of course, hampered by the fact that the throughput tests exhibit up to 4% variability between the minimum and maximum results for a given test.

Additionally, it is unclear which of the two mechanisms used in the counter (AtomicLong versus a synchronized registerEvent method) is preferable. For the many-threaded, multi-consumer/producer case the AtomicLong seems the better choice; for a single producer/consumer the synchronized registerEvent method is better.
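The two mechanisms compared above can be sketched side by side. This is a hypothetical illustration of the trade-off, not the actual Qpid counter classes: the lock-free AtomicLong avoids blocking when many connection threads hit the shared virtualhost and broker counters, while an uncontended synchronized method can be cheaper for a single producer/consumer.

```java
import java.util.concurrent.atomic.AtomicLong;

// Lock-free variant: scales better when many threads update the shared
// counters concurrently, since writers never block each other.
class AtomicTotal {
    private final AtomicLong total = new AtomicLong();
    void registerEvent(long count) { total.addAndGet(count); }
    long getTotal() { return total.get(); }
}

// Coarse-lock variant: an uncontended monitor can be cheaper for a single
// producer/consumer thread, but serialises all writers under contention.
class SynchronizedTotal {
    private long total;
    synchronized void registerEvent(long count) { total += count; }
    synchronized long getTotal() { return total; }
}
```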

Another point to note is that the actual per-message latency increase is quite small; the degradation only comes into play when sending large numbers of messages over many connections, and it is driven purely by the number of messages rather than the data volume, i.e. data throughput is not the concern. This means that for most applications there should not be a noticeable difference in performance, with any small change in per-message latency lost in the noise.

Another point to note is that in these tests garbage collection

Further Modifications

This data was generated after modifying the broker code to place statistics recording for message delivery after the message is enqueued at the broker (or at least, after an attempt has been made to do so).

||Test||2.6.0.7||2.6.0.6||2.6.0.7 (repeat 1)||2.6.0.7 (repeat 2)||2.6.0.7 (stats disabled)||
|PQBT-AA-Qpid-01|2777.4448|2696.0464|2702.624|2815.1379|2730.4008|
|PQBT-TX-Qpid-01|3206.5027|3147.2404|3179.9607|2701.3173|3148.375|
|PTBT-AA-Qpid-01|4014.191|3977.7153|3902.3087|3983.5536|3996.9091|
|PTBT-TX-Qpid-01|4023.8338|3996.622|4079.2136|3694.2487|4006.0062|
|TQBT-AA-Qpid-01|47845.87|51156.82|52660.16|45339.924|51316.2|
|TQBT-NA-Qpid-01|48345.562|47846.085|48281.647|49123.188|47374.016|
|TQBT-TX-Qpid-01|6301.3253|7262.565|7209.0416|6326.9844|7234.8924|
|TTBT-AA-Qpid-01|78419.44|89033.62|79150.925|80203.896|90342.84|
|TTBT-NA-Qpid-01|97468.78|110630.585|97623.314|98914.57|120317.604|
|TTBT-TX-Qpid-01|18498.884|22650.204|21379.168|18915.474|22702.417|

As a cross-check, the tests were repeated using a stock 2.6.0.6 broker, then twice more with the 2.6.0.7 broker configured as in the first run, in case any environmental factors were skewing the results, and finally with the 2.6.0.7 broker with statistics generation disabled.

These results show that the persistent tests produce much the same numbers for both versions, whether statistics are enabled or disabled. This is due to the naturally lower throughput causing less contention and allowing the statistics generation to run unimpeded. The transient tests get worse with statistics generation enabled, due to the much higher message throughput and the fact that many message events are trying to update the counters at the same time.

Latency

As a general recommendation, I would suggest that statistics generation only be enabled in cases of low message traffic (count, not size), where the broker is not utilising 100% of the CPU. Persistent messaging is necessarily I/O bound in terms of disk access, and this is usually the determining factor in message latency, so adding the statistics counters places no large additional burden and will not add much to the latency, certainly less than 5% degradation. It is obvious that any additional processing that takes place during the delivery or receipt of a message will inevitably add latency and decrease throughput on the broker. If a broker is already operating at peak performance, something that is only possible with transient messages, whose throughput depends only on the amount of CPU and RAM available, then enabling the statistics counters will have a greater effect, possibly degrading performance by over 10%.

Tasks

  • Create generic data capture utility for gathering stats on received/delivered messages inc current/peak messages/sec and rolling total messages inc message count and bytes (payload only) stats, inc unit tests. Requires further discussion.
  • Configure stats capture (on/off) for broker (config.xml - new section) and per connection (jmx) and data extract dump interval, and code to dump stats data
  • Expose stats via JMX MBean attributes for query by users (notification - reqs to be confirmed with client), add reset feature via JMX and tests for these features. JMX changes to expose new stats attributes and reset function, inc tests
  • Performance test to measure overhead of stats calculation (see RG email/example class) - separate from perf test harness, small class to measure calc cost only
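The JMX exposure and reset tasks above might look something like the following sketch. All interface, class, and attribute names here are hypothetical illustrations of the planned feature, not the actual Qpid MBean API.

```java
// Hypothetical JMX-style statistics attributes with a reset operation
// (names are illustrative, not the actual Qpid MBean interface).
interface BrokerStatisticsMBean {
    long getTotalMessagesDelivered();
    long getTotalDataDelivered();
    double getPeakMessageDeliveryRate();
    void resetStatistics();
}

class BrokerStatistics implements BrokerStatisticsMBean {
    private long totalMessages;
    private long totalBytes;
    private double peakRate;

    // Called by the broker on each delivery event.
    synchronized void registerMessageDelivered(long bytes, double currentRate) {
        totalMessages++;
        totalBytes += bytes;
        peakRate = Math.max(peakRate, currentRate);
    }

    public synchronized long getTotalMessagesDelivered() { return totalMessages; }
    public synchronized long getTotalDataDelivered() { return totalBytes; }
    public synchronized double getPeakMessageDeliveryRate() { return peakRate; }

    // Reset feature exposed via JMX, as per the task list.
    public synchronized void resetStatistics() {
        totalMessages = 0;
        totalBytes = 0;
        peakRate = 0.0d;
    }
}
```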

QPID-2932 Add statistics generation for broker message delivery
QPID-???? Port to trunk
