1. Background

1.1 Problems

    As IoTDB grows more complex, some of its operating metrics need to be monitored to improve the operability and robustness of the system. IoTDB currently has no reasonably complete metric collector for these operating metrics, so a metric acquisition system needs to be designed.

...

    Therefore, we developed a set of metric acquisition interfaces and adapted them to existing mature acquisition libraries, which makes it possible to switch between libraries flexibly and to optimize for a specific target.

1.2 Targets

Provides a set of metric collector interfaces.

...

The acquisition system consists of the following four parts.

2.1.1 Metrics

Provides tools for collecting metrics in different scenarios, including Counter, Gauge, Meter, Histogram, and Timer, each of which can carry tags.

2.1.2 MetricManager

a. Provides functions such as creating, finding, updating, and deleting metrics.

...

d. Provides its own start and stop methods.

2.1.3 MetricReporter

Pushes the collected data to other systems, such as Prometheus, JMX, and IoTDB.

2.1.4 MetricService

Provides functions for starting, obtaining, and shutting down the MetricManager. In the future it can be used after being registered as an IService.

[Image: image2021-7-12_17-12-42.png]

2.2 Class diagram

[Image: image2021-7-14_12-41-48.png]

2.3 IMetric

IMetric is the collector parent interface.

Code Block
public interface IMetric {}

2.3.1 Counter

Counter is a cumulative counter.

Code Block
public interface Counter extends IMetric {
  void inc();

  void inc(long n);

  long count();
}
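
As a minimal sketch of how this interface might be implemented, the block below backs a counter with java.util.concurrent.atomic.LongAdder. The class names SimpleCounter and CounterDemo are hypothetical and are not part of the design; the interfaces are repeated only so that the example compiles on its own.

```java
import java.util.concurrent.atomic.LongAdder;

// Interfaces repeated from this section so the sketch is self-contained.
interface IMetric {}

interface Counter extends IMetric {
  void inc();

  void inc(long n);

  long count();
}

// Hypothetical implementation (an assumption, not the project's class):
// LongAdder keeps concurrent increments cheap and sums them on read.
final class SimpleCounter implements Counter {
  private final LongAdder adder = new LongAdder();

  @Override
  public void inc() {
    adder.increment();
  }

  @Override
  public void inc(long n) {
    adder.add(n);
  }

  @Override
  public long count() {
    return adder.sum();
  }
}

final class CounterDemo {
  public static void main(String[] args) {
    Counter counter = new SimpleCounter();
    counter.inc();
    counter.inc(4);
    System.out.println(counter.count());
  }
}
```

An AtomicLong would work equally well here; LongAdder is chosen only because counters are typically written far more often than they are read.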

2.3.2 Gauge

Gauge holds a single value that can be set and read at any time.

Code Block
public interface Gauge extends IMetric {
  long value();

  void set(long value);
}

2.3.3 Rate

Rate calculates the mean rate of a value, along with its rate over the last 1, 5, and 15 minutes.

Code Block
public interface Rate extends IMetric {
  long getCount();

  double getOneMinuteRate();

  double getMeanRate();

  double getFiveMinuteRate();

  double getFifteenMinuteRate();

  void mark();

  void mark(long n);
}
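
One common way to compute the 1, 5, and 15 minute rates is an exponentially weighted moving average that is updated on a fixed tick. The sketch below illustrates that idea only. The class name EwmaRate, the 5-second tick, and the explicit tick() call are assumptions made for this example, and the backing library may compute its rates differently.

```java
// Hypothetical helper illustrating an exponentially weighted moving average.
// Not thread-safe; a real implementation would need atomic state.
final class EwmaRate {
  // Assumed tick interval: a scheduler (not shown) calls tick() this often.
  private static final double TICK_SECONDS = 5.0;

  private final double alpha;
  private long uncounted; // events marked since the last tick
  private double ratePerSecond;
  private boolean initialized;

  EwmaRate(double windowMinutes) {
    // Smoothing factor for a window of the given length.
    this.alpha = 1.0 - Math.exp(-TICK_SECONDS / (windowMinutes * 60.0));
  }

  void mark(long n) {
    uncounted += n;
  }

  void tick() {
    double instantRate = uncounted / TICK_SECONDS;
    uncounted = 0;
    if (initialized) {
      // Move the average toward the rate seen in the last interval.
      ratePerSecond += alpha * (instantRate - ratePerSecond);
    } else {
      // The first tick seeds the average with the observed rate.
      ratePerSecond = instantRate;
      initialized = true;
    }
  }

  double getRate() {
    return ratePerSecond;
  }
}
```

Under these assumptions, a one-minute rate would be new EwmaRate(1.0), and the five and fifteen minute rates would use 5.0 and 15.0. With no new events, each tick decays the rate by a factor of exp(-5/60) for the one-minute window.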

2.3.4 Histogram and HistogramSnapshot

Histogram records the distribution of a stream of values. HistogramSnapshot is a data holder that provides quantile values and the list of recorded values.

Code Block
public interface Histogram extends IMetric {
  void update(int value);

  void update(long value);

  long count();

  HistogramSnapshot takeSnapshot();
}

public interface HistogramSnapshot {

  double getValue(double quantile);

  long[] getValues();

  int size();

  double getMedian();

  long getMax();

  double getMean();

  long getMin();

  void dump(OutputStream output);
}
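
To make getValue(double quantile) concrete, the sketch below computes a quantile from an array of recorded values using the nearest-rank method. This is an illustration only: the class name QuantileSketch is hypothetical, and the underlying libraries may interpolate between neighbouring values instead of picking one.

```java
import java.util.Arrays;

// Hypothetical helper showing the nearest-rank quantile method.
final class QuantileSketch {
  private QuantileSketch() {}

  static double getValue(long[] values, double quantile) {
    if (Double.isNaN(quantile) || quantile < 0.0 || quantile > 1.0) {
      throw new IllegalArgumentException(quantile + " is not in [0, 1]");
    }
    if (values.length == 0) {
      return 0.0;
    }
    // Sort a copy so the caller's array is left untouched.
    long[] sorted = values.clone();
    Arrays.sort(sorted);
    // Nearest rank: the smallest value with at least quantile * n values
    // at or below it. Ranks are 1-based, so clamp to 1 for quantile 0.
    int rank = (int) Math.ceil(quantile * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
  }

  static double getMedian(long[] values) {
    return getValue(values, 0.5);
  }
}
```

For example, QuantileSketch.getMedian(new long[] {5, 1, 3, 2, 4}) picks the third smallest value.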

2.3.5 Timer

Timer records both the distribution of durations (a Histogram) and the rate at which timed events occur (a Meter, exposed here as Rate).

Code Block
public interface Timer extends IMetric {
  void update(long duration, TimeUnit unit);

  default void updateMillis(long durationMillis) {
    update(durationMillis, TimeUnit.MILLISECONDS);
  }

  default void updateMicros(long durationMicros) {
    update(durationMicros, TimeUnit.MICROSECONDS);
  }

  default void updateNanos(long durationNanos) {
    update(durationNanos, TimeUnit.NANOSECONDS);
  }

  HistogramSnapshot takeSnapshot();

  Rate getImmutableRate();
}
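
A typical way to feed a timer is to measure a block of work with System.nanoTime() and pass the elapsed time to updateNanos. The sketch below shows that pattern. DurationRecorder is a reduced, hypothetical stand-in for the Timer interface above (only the update methods are kept), and TimedCalls and TotalingRecorder are names invented for this example.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Reduced stand-in for the Timer interface; an assumption for this sketch.
interface DurationRecorder {
  void update(long duration, TimeUnit unit);

  default void updateNanos(long durationNanos) {
    update(durationNanos, TimeUnit.NANOSECONDS);
  }
}

// Hypothetical helper that times a piece of work and records its duration.
final class TimedCalls {
  private TimedCalls() {}

  static <T> T time(DurationRecorder timer, Supplier<T> work) {
    long start = System.nanoTime();
    try {
      return work.get();
    } finally {
      // Record the duration even when the work throws.
      timer.updateNanos(System.nanoTime() - start);
    }
  }
}

// Toy recorder that only sums durations, used to demonstrate the helper.
final class TotalingRecorder implements DurationRecorder {
  long totalNanos;
  int updates;

  @Override
  public void update(long duration, TimeUnit unit) {
    totalNanos += unit.toNanos(duration);
    updates++;
  }
}
```

System.nanoTime() is used rather than System.currentTimeMillis() because it is monotonic and has finer resolution, which matters for short operations.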


2.4 MetricManager

  MetricManager provides interfaces for creating, removing, updating, and querying Metric and MetricReporter instances, and exposes switches for enabling data acquisition.

Code Block
public interface MetricManager {

  Counter getOrCreateCounter(String metric, String... tags);

  Gauge getOrCreatGauge(String metric, String... tags);

  Rate getOrCreatRate(String metric, String... tags);

  Histogram getOrCreateHistogram(String metric, String... tags);

  Timer getOrCreateTimer(String metric, String... tags);

  void count(int delta, String metric, String... tags);

  void count(long delta, String metric, String... tags);

  void gauge(int value, String metric, String... tags);

  void gauge(long value, String metric, String... tags);

  void rate(int value, String metric, String... tags);

  void rate(long value, String metric, String... tags);

  void histogram(int value, String metric, String... tags);

  void histogram(long value, String metric, String... tags);

  void timer(long delta, TimeUnit timeUnit, String metric, String... tags);

  void removeCounter(String metric, String... tags);

  void removeGauge(String metric, String... tags);

  void removeRate(String metric, String... tags);

  void removeHistogram(String metric, String... tags);

  void removeTimer(String metric, String... tags);

  List<String[]> getAllMetricKeys();

  Map<String[], Counter> getAllCounters();

  Map<String[], Gauge> getAllGauges();

  Map<String[], Rate> getAllRates();

  Map<String[], Histogram> getAllHistograms();

  Map<String[], Timer> getAllTimers();

  boolean isEnable();

  void enableKnownMetric(KnownMetric metric);

  boolean init();

  boolean stop();

  boolean startReporter(String reporterName);

  boolean stopReporter(String reporterName);

  void setReporter(MetricReporter metricReporter);

  String getName();
}
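
The getOrCreate methods imply a registry keyed by metric name plus tags. The sketch below shows one way such a lookup could work, and it is not the project's implementation. Because Java arrays compare by identity, the example wraps the name and tags in a hypothetical Key class with value-based equals and hashCode. CounterRegistry and Key are names invented for this example, and LongAdder stands in for the Counter type.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical registry illustrating a thread-safe get-or-create lookup.
final class CounterRegistry {

  // Value-based key: two keys are equal when name and tags match.
  private static final class Key {
    private final String metric;
    private final String[] tags;

    Key(String metric, String[] tags) {
      this.metric = metric;
      this.tags = tags.clone(); // defensive copy of the caller's array
    }

    @Override
    public boolean equals(Object other) {
      if (!(other instanceof Key)) {
        return false;
      }
      Key that = (Key) other;
      return metric.equals(that.metric) && Arrays.equals(tags, that.tags);
    }

    @Override
    public int hashCode() {
      return 31 * metric.hashCode() + Arrays.hashCode(tags);
    }
  }

  private final ConcurrentMap<Key, LongAdder> counters = new ConcurrentHashMap<>();

  LongAdder getOrCreateCounter(String metric, String... tags) {
    // computeIfAbsent creates the counter at most once per key.
    return counters.computeIfAbsent(new Key(metric, tags), k -> new LongAdder());
  }

  void count(long delta, String metric, String... tags) {
    getOrCreateCounter(metric, tags).add(delta);
  }

  void removeCounter(String metric, String... tags) {
    counters.remove(new Key(metric, tags));
  }

  int size() {
    return counters.size();
  }
}
```

With this shape, count(delta, metric, tags) is a convenience wrapper around getOrCreateCounter, which matches how the shortcut methods in the interface read.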

2.5 MetricReporter

MetricReporter is a data push interface.

Code Block
public interface MetricReporter {
  boolean start();

  boolean start(String reporter);

  boolean stop();

  boolean stop(String reporter);

  String getName();
}

3. Test Report

We implemented the monitoring framework with both Dropwizard and Micrometer, and the test results are as follows:

3.1 Test Environment

Processor: Intel(R) Core(TM) i7-1065G7 CPU

RAM: 32 GB

3.2 Test Metrics

  We use a single thread to create counters and run the test cases separately on the two frameworks, Micrometer and Dropwizard. The test metrics are as follows:

  1. memory: Memory usage, in MB.
  2. create: The time required to create the metrics, in ms.
  3. searchInorder: The time required for sequential queries, in ms.
  4. searchDisorder: The time required for random queries, in ms.

3.3 Test parameters

  1. metric: The metric type under test.
  2. name: The name of the test metric, padded to a uniform length.
  3. tag: The tag of the test metric, padded to a uniform length.
  4. metricNumberTotal: The number of metrics tested.
  5. tagSingleNumber: The number of tags on each test metric.
  6. tagTotalNumber: The size of the tag pool; the default is 1000, and all tags are drawn from this pool.
  7. searchNumber: The number of queries; the default is 1000000.
  8. loop: The number of query loops; the default is 10.

3.4 Test Result

[Image: test results]

3.5 Test Script

3.5.1 Test

Test holds a MetricManager and is responsible for completing specific testing.

Code Block
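import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class Test {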
    private Integer metricNumberTotal;
    private Integer metricNameNumberLength;
    private Integer tagTotalNumber;
    private Integer tagSingleNumber;
    private Integer searchNumber;
    private String[] TAGS;
    private static Random random = new Random(43);
    private static MetricManager metricManager = MetricService.getMetricManager();
    private static Map<String, String[]> name2Tags = new HashMap<>();

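    /**
     * Creates a test run and pre-generates the tag pool.
     *
     * @param metricNumber number of metrics to create
     * @param tagTotalNumber size of the tag pool
     * @param tagSingleNumber number of tags attached to each metric
     * @param searchNumber number of queries per search pass
     */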
    Test(Integer metricNumber, Integer tagTotalNumber, Integer tagSingleNumber
            , Integer searchNumber){
        this.metricNumberTotal = metricNumber;
        this.metricNameNumberLength = String.valueOf(metricNumberTotal).length();
        this.tagTotalNumber = tagTotalNumber;
        this.tagSingleNumber = tagSingleNumber;
        this.searchNumber = searchNumber;
        TAGS = new String[tagTotalNumber];
        for(int i = 0; i < tagTotalNumber; i++){
            TAGS[i] = initTag(i);
        }
    }

    /**
     * generate tags for metric
     * @param number
     * @return
     */
    private String initTag(Integer number){
        StringBuilder stringBuilder = new StringBuilder(String.valueOf(number));
        while(stringBuilder.length() < 3){
            stringBuilder.insert(0, '0');
        }
        stringBuilder.insert(0, "Tag");
        return stringBuilder.toString();
    }

    /**
     * generate name for metric
     * @param number
     * @return
     */
    private String generateName(Integer number){
        StringBuilder stringBuilder = new StringBuilder(String.valueOf(number));
        // Left-pad with zeros so that every metric name has the same length.
        while(stringBuilder.length() < metricNameNumberLength){
            stringBuilder.insert(0, '0');
        }
        stringBuilder.insert(0, "counter");
        return stringBuilder.toString();
    }

    /**
     * generate tags of a metric
     * @return
     */
    private String[] generateTags(){
        List<Integer> targets = new ArrayList<>();
        while(targets.size() < tagSingleNumber){
            Integer target = generateRandom(tagTotalNumber);
            if(!targets.contains(target)){
                targets.add(target);
            }
        }
        String[] tags = new String[tagSingleNumber];
        for(int i = 0; i < tagSingleNumber; i++){
            tags[i] = TAGS[targets.get(i)];
        }
        return tags;
    }

    /**
     * generate next int
     * @param max
     * @return
     */
    private Integer generateRandom(Integer max){
        return random.nextInt(max);
    }

    /**
     * create metric in order
     * @return
     */
    public long createMetricInorder(){
        long total = 0;
        for(int i = 0; i < metricNumberTotal; i++){
            String name = generateName(i);
            String[] tags = generateTags();
            long start = System.currentTimeMillis();
            metricManager.getOrCreateCounter(name, tags);
            long stop = System.currentTimeMillis();
            total += (stop - start);
            name2Tags.put(name, tags);
        }
        return total;
    }

    /**
     * search metric in order
     * @return
     */
    public long searchMetricInorder(){
        long total = 0;
        for(int i = 0; i < searchNumber; i++){
            total += searchOne(i);
        }
        return total;
    }

    /**
     * search metric in random way
     * @return
     */
    public long searchMetricDisorder(){
        long total = 0;
        for(int i = 0; i < searchNumber; i++){
            // nextInt(bound) excludes the bound, so pass the full metric count.
            total += searchOne(generateRandom(metricNumberTotal));
        }
        return total;
    }

    private long searchOne(Integer target) {
        String name = generateName(target % metricNumberTotal);
        String[] tags = name2Tags.get(name);
        long start = System.currentTimeMillis();
        metricManager.getOrCreateCounter(name, tags);
        long stop = System.currentTimeMillis();
        return stop - start;
    }

    @Override
    public String toString() {
        return metricNumberTotal +
                "," + tagTotalNumber +
                "," + tagSingleNumber +
                "," + searchNumber;
    }

    public void stop(){
        name2Tags.clear();
        metricManager.stop();
    }
}

3.5.2 TestPlan

TestPlan sets up specific test plans to complete testing and statistics.

Code Block
public class TestPlan {
    private static final Integer[] TAG_NUMBERS = {2, 4, 6, 8, 10};
    private static final Integer[] METRIC_NUMBERS = {1000, 10000, 50000, 100000, 500000, 1000000};
    private static final Integer LOOP = 10;
    private static final Integer tagTotalNumber = 1000;
    private static final Integer searchNumber = 100000;

    private static void test(Integer metric, Integer tag){
        Long[] times = {0L, 0L, 0L};
        Test test = new Test(metric, tagTotalNumber, tag, searchNumber);
        times[0] += test.createMetricInorder();
        for(int i = 0; i < LOOP; i ++){
            times[1] += test.searchMetricInorder();
            times[2] += test.searchMetricDisorder();
        }
        test.stop();
        System.out.println(metric + "," + tagTotalNumber + "," + tag + "," +
                searchNumber + "," + (times[0]) + "," +
                (times[1] * 1.0 / LOOP) + "," + (times[2] * 1.0 / LOOP));
    }

    public static void main(String[] args) {
        System.setProperty("METRIC_CONF", "path of yml");
        for(Integer metric: METRIC_NUMBERS){
            for(Integer tag: TAG_NUMBERS){
                test(metric, tag);
            }
        }
    }
}

4. Dropwizard Unit Test Results

To ensure the reliability of the features, we unit tested DropwizardMetricManager, covering its main functions. To reproduce the test, you need to modify the path of the yml configuration file in the init() method (the file is stored under conf in the statistical directory). The final result of the test is shown in the figure below.

[Image: unit test results]

5. Dropwizard connects to Prometheus via PushGateway

5.1 Experimental process

This test was done using the PrometheusRunTest script, which is shown below.

Code Block
import java.util.concurrent.TimeUnit;

public class PrometheusRunTest {
  public MetricManager metricManager = MetricService.getMetricManager();

  public static void main(String[] args) throws InterruptedException {
    System.setProperty("line.separator", "\n");
    System.setProperty("METRIC_CONF", "path of yml");
    PrometheusRunTest prometheusRunTest = new PrometheusRunTest();
    Counter counter = prometheusRunTest.metricManager.getOrCreateCounter("counter");
    while (true) {
      counter.inc();
      TimeUnit.SECONDS.sleep(1);
    }
  }
}

The configuration of the parameters for Prometheus is completed in the configuration file (yml file) used by the script, as follows:

prometheusReporterConfig:
    prometheusExporterUrl: http://localhost 
    prometheusExporterPort: 9091 

With this script, Dropwizard tracks a counter that increases by 1 every second, and updates to all metrics are pushed to the specified PushGateway, where they wait for Prometheus to collect them.
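
To illustrate what a push to the PushGateway carries, the sketch below builds the target URL and one sample line in the Prometheus text exposition format. It only builds strings and sends nothing. The class name PushGatewayPayload, the job name iotdb, and the label names are assumptions made for this example, and the reporter used in the experiment may format and send its data differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper; it formats data but performs no network I/O.
final class PushGatewayPayload {
  private PushGatewayPayload() {}

  // PushGateway accepts pushes under /metrics/job/<job name>.
  static String pushUrl(String baseUrl, int port, String job) {
    return baseUrl + ":" + port + "/metrics/job/" + job;
  }

  // One sample line: name{label="value",...} value
  static String sampleLine(String name, Map<String, String> labels, long value) {
    StringBuilder line = new StringBuilder(name);
    if (!labels.isEmpty()) {
      StringJoiner joined = new StringJoiner(",", "{", "}");
      for (Map.Entry<String, String> label : labels.entrySet()) {
        joined.add(label.getKey() + "=\"" + escape(label.getValue()) + "\"");
      }
      line.append(joined);
    }
    return line.append(' ').append(value).append('\n').toString();
  }

  // Label values escape backslash, double quote, and line feed.
  private static String escape(String value) {
    return value.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
  }

  public static void main(String[] args) {
    Map<String, String> labels = new LinkedHashMap<>();
    labels.put("host", "a"); // assumed label, for illustration only
    System.out.println(pushUrl("http://localhost", 9091, "iotdb"));
    System.out.print(sampleLine("counter", labels, 3));
  }
}
```

The URL and port arguments correspond to prometheusExporterUrl and prometheusExporterPort in the configuration above.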

5.2 Experimental environment

Grafana runs on port 8081

Prometheus runs on port 9090

PushGateway runs on port 9091

5.3 Experimental result

[Image: experimental result]