  • Every stat needs a stat level matching the stat levels in VSD (Common, Expert, Wizard). Here is a list of [GemFire 6.6 Statistics]
    • Need to add a new stat-level property (modeled after log-level; the order of stat-levels would be common, expert, wizard)
  • Need to make the stats properties modifiable at runtime: statistic-sampling-enabled, enable-time-statistics; maybe stat-level, statistic-sample-rate
  • The GemFire Statistics API needs to be more user-friendly and make standard GemFire stats easier to access (right now the primary purpose of the user API is to let users create custom stat resources)
    • Define an API (with constants?) that gives users programmatic, read-only access to statistic resources and stat values
  • Define a Statistics listener to provide monitoring of named stats (allows stats to push to local MBeans and customer listeners)
    • Stat sampler thread should check the registered monitors and fire the notifications immediately after completing its other responsibilities
    • Rate of monitor notifications should be no more frequent than the statistic-sample-rate (possibly introduce statistic-monitor-rate or statistic-monitor-multiple)
      • Option 1: specified in milliseconds (or even seconds) but must be a multiple of statistic-sample-rate; i.e., if statistic-sample-rate is 1000 then statistic-monitor-rate might be 5000 or 10000 or another multiple of 1000 (a default of 10000 might be reasonable)
      • Option 2: specified as a multiple of statistic-sample-rate; i.e., a value of 1 means every sample checks monitors and fires notifications; a value of 2 means every other sample; a value of 10 means every tenth sample (a default of 10 might be reasonable)
    • Provide a way to specify a set of stats by name and resource (allowing stats from different resources to be grouped together)
    • Each set is paired with a monitor type (thus you group 1 or more stats and specify a monitor for the group)
    • Each notification provides a map of stat name to value for every stat in the group that meets the specified monitor criteria (this might be one, some, or all of the stats in the group)
    • Support different monitor types (a couple are similar to JMX but remember this is NOT JMX):
      • CounterMonitor: notify when stat value exceeds threshold (allow less-than or greater-than options)
      • GaugeMonitor: notify when stat value is outside specified range (specify both low-threshold and high-threshold)
      • ValueMonitor: notify when the stat value matches or differs from a specified value or, if the specified value is null, whenever the stat value changes (the implementation and usage of the change-detection case must protect against excess network traffic and burden on the member)
    • Users can use this Statistics API to implement a custom MBean which exposes one or more stats that are not already exposed as Attributes on the 7.0 GemFire product MBeans
  • Consider changing statistic-sample-rate (and statistic-monitor-rate) to use units in seconds instead of milliseconds
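The listener requirements above (stat groups paired with a monitor type, notifications carrying only the matching stats, and a notification rate tied to the sample rate) could be sketched roughly as follows. This is only an illustration under assumptions: `StatMonitorSketch`, `StatListener`, `StatMonitor`, `Sampler` and all method names are hypothetical and do not exist in GemFire; stat values are simplified to `double`; the rate uses Option 2 (a sample multiple); ValueMonitor is left out for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.DoublePredicate;

// Hypothetical sketch only: none of these types exist in GemFire.
final class StatMonitorSketch {

    /** Receives a map of stat name to value for the stats that met the criterion. */
    interface StatListener {
        void handleNotification(Map<String, Double> matchedStats);
    }

    /** One group of named stats paired with one monitor criterion. */
    static final class StatMonitor {
        private final List<String> statNames;
        private final DoublePredicate criterion;
        private final StatListener listener;

        StatMonitor(List<String> statNames, DoublePredicate criterion, StatListener listener) {
            this.statNames = new ArrayList<>(statNames);
            this.criterion = criterion;
            this.listener = listener;
        }

        // CounterMonitor-style criteria: value beyond a single threshold.
        static DoublePredicate greaterThan(double threshold) { return v -> v > threshold; }
        static DoublePredicate lessThan(double threshold) { return v -> v < threshold; }

        // GaugeMonitor-style criterion: value outside [low, high].
        static DoublePredicate outsideRange(double low, double high) {
            return v -> v < low || v > high;
        }

        /** Fires one notification holding only the stats that match, if any do. */
        void check(Map<String, Double> sample) {
            Map<String, Double> matched = new LinkedHashMap<>();
            for (String name : statNames) {
                Double value = sample.get(name);
                if (value != null && criterion.test(value)) {
                    matched.put(name, value);
                }
            }
            if (!matched.isEmpty()) {
                listener.handleNotification(matched);
            }
        }
    }

    /** Stand-in for the stat sampler thread, using Option 2 (a sample multiple). */
    static final class Sampler {
        private final int monitorMultiple;
        private final List<StatMonitor> monitors = new ArrayList<>();
        private long sampleCount;

        Sampler(int monitorMultiple) {
            if (monitorMultiple < 1) {
                throw new IllegalArgumentException("monitorMultiple must be >= 1");
            }
            this.monitorMultiple = monitorMultiple;
        }

        void addMonitor(StatMonitor monitor) {
            monitors.add(monitor);
        }

        /** Called once per sample, after the sampler's other work is done. */
        void sample(Map<String, Double> currentValues) {
            sampleCount++;
            if (sampleCount % monitorMultiple != 0) {
                return; // not a monitor-check sample
            }
            for (StatMonitor monitor : monitors) {
                monitor.check(currentValues);
            }
        }
    }
}
```

With a multiple of 10 and a statistic-sample-rate of 1000 ms, monitors in this sketch would be checked every 10 seconds. Expressing each monitor type as a predicate keeps the group/notification logic in one place, though a real design might prefer distinct monitor classes.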
  • Bruce: 
    I think we should change any stat that is currently using nanoseconds to use milliseconds or microseconds. Nanosecond figures are too large to deal with and the values are typically in the millisecond range anyway.
  • Darrel: 
    It seems like it would be much easier to use (and self-documenting) if these APIs just had a simple get method for each statistic and a factory method on a top-level class like GemFireStatistics. So for your example we would have an interface named DistributedLockServiceStatistics with a getter method on it named getBecomeGrantorRequests which returns an int. To get its value just do this: 

    GemFireStatistics.getDistributedLockServiceStatistics().getBecomeGrantorRequests(); 

    All the resource interfaces could implement a common interface that gives you the common information about them (name, id, ...). GemFireStatistics could have a single method for each singleton resource we have, and it could have other methods for types that have multiple instances. For example, for a CachePerfStats on a particular region, we could have a getCachePerfStats(String regionName).
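
As a rough illustration of Darrel's suggestion, the shape of such an API might look like the sketch below. The names `GemFireStatistics`, `DistributedLockServiceStatistics`, `getBecomeGrantorRequests` and `getCachePerfStats(String regionName)` come from the comment above; everything else (the `StatsApiSketch` wrapper, `StatisticsResource`, `getPuts`, and the stub backing values) is assumed for the example and is not an existing GemFire API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the proposal; this is not an existing GemFire API.
final class StatsApiSketch {

    /** Common information every resource interface would expose. */
    interface StatisticsResource {
        String getName();
        long getId();
    }

    /** Singleton resource: one getter per statistic. */
    interface DistributedLockServiceStatistics extends StatisticsResource {
        int getBecomeGrantorRequests();
    }

    /** Multi-instance resource: one instance per region. */
    interface CachePerfStats extends StatisticsResource {
        int getPuts(); // assumed stat name, for illustration only
    }

    /** Top-level factory, as in the proposal. */
    static final class GemFireStatistics {
        // Stub counter standing in for the real stat storage.
        static final AtomicInteger becomeGrantorRequests = new AtomicInteger();

        private static final Map<String, CachePerfStats> regionStats =
            new ConcurrentHashMap<>();

        private static final DistributedLockServiceStatistics DLOCK =
            new DistributedLockServiceStatistics() {
                public String getName() { return "DistributedLockServiceStatistics"; }
                public long getId() { return 1L; }
                public int getBecomeGrantorRequests() { return becomeGrantorRequests.get(); }
            };

        /** One method per singleton resource. */
        static DistributedLockServiceStatistics getDistributedLockServiceStatistics() {
            return DLOCK;
        }

        /**
         * One method per multi-instance type. The sketch creates a zeroed
         * read-only view on first request; a real version would look up the region.
         */
        static CachePerfStats getCachePerfStats(String regionName) {
            return regionStats.computeIfAbsent(regionName, name -> new CachePerfStats() {
                public String getName() { return "CachePerfStats-" + name; }
                public long getId() { return name.hashCode(); }
                public int getPuts() { return 0; }
            });
        }
    }
}
```

Callers would then read a value with `StatsApiSketch.GemFireStatistics.getDistributedLockServiceStatistics().getBecomeGrantorRequests()`, with no stat names or constants to look up. The trade-off is that every new statistic requires a new getter on its interface.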