
1. Background

1.1 Problems

    As IoTDB grows more complex, some of its runtime metrics need to be monitored to improve the operability and robustness of the system. IoTDB does not yet have a reasonably complete metric collector for these runtime metrics, so a metric acquisition system needs to be designed.

    Several metric acquisition libraries are widely used in the open source community, such as Dropwizard Metrics, Micrometer, and Dubbo Metrics, and could be adapted. However, IoTDB is a real-time online system, and earlier metric collection caused a significant drop in performance, so these libraries may not meet IoTDB's performance requirements.

    Therefore, we define our own set of metric acquisition interfaces and adapt them to mature acquisition libraries, which allows flexible switching between implementations and makes targeted optimization easier.

1.2 Targets

Provide a set of metric collector interfaces.

Provide an adaptation implementation based on Micrometer.

Provide an adaptation implementation based on Dropwizard.

Test metric creation and query performance on Micrometer and Dropwizard.

2. Overall Design

2.1 Acquisition system

The acquisition system consists of the following four parts.

2.1.1 Metrics

Provides tools for collecting metrics in different scenarios, including Counter, Gauge, Meter (the Rate interface in section 2.3.3), Histogram, and Timer, each with tags.

2.1.2 MetricManager

a. Provides functions such as creating, finding, updating, and deleting metrics.

b. Provides management of reporters, including starting and stopping a reporter.

c. Provides the ability to enable default metrics (KnownMetric).

d. Provides its own start and stop methods.

2.1.3 MetricReporter

Pushes the collected data to other systems, such as Prometheus, JMX, and IoTDB.

2.1.4 MetricService

Provides start-up, retrieval, and shutdown functions for the MetricManager; it can later be registered as an IService.

image2021-7-12_17-12-42.png

2.2 Class diagram

image2021-7-14_12-41-48.png

2.3 IMetric

IMetric is the parent interface of all collectors.

public interface IMetric {}

2.3.1 Counter

Counter is a cumulative counter.

public interface Counter extends IMetric {
  void inc();

  void inc(long n);

  long count();
}
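
As an illustration of how the Counter interface might be backed, the following is a minimal in-memory sketch. The class name LongAdderCounter is hypothetical and is not taken from the IoTDB code base; the interfaces are repeated only so the example compiles on its own.

```java
import java.util.concurrent.atomic.LongAdder;

// Marker interface and Counter, mirroring the declarations in this section.
interface IMetric {}

interface Counter extends IMetric {
  void inc();

  void inc(long n);

  long count();
}

// Hypothetical in-memory implementation (an assumption, not IoTDB code).
final class LongAdderCounter implements Counter {
  // LongAdder keeps contention low when many threads increment concurrently.
  private final LongAdder adder = new LongAdder();

  @Override
  public void inc() {
    adder.increment();
  }

  @Override
  public void inc(long n) {
    adder.add(n);
  }

  @Override
  public long count() {
    return adder.sum();
  }
}

class CounterSketchDemo {
  public static void main(String[] args) {
    Counter requests = new LongAdderCounter();
    requests.inc();
    requests.inc(4);
    System.out.println(requests.count()); // prints 5
  }
}
```

LongAdder is chosen here because a counter is written far more often than it is read; an AtomicLong would work equally well for a sketch.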

2.3.2 Gauge

Gauge holds the current value of a metric.

public interface Gauge extends IMetric {
  long value();

  void set(long value);
}

2.3.3 Rate

Rate calculates the mean rate of a value and its rate over the last 1, 5, and 15 minutes.

public interface Rate extends IMetric {
  long getCount();

  double getOneMinuteRate();

  double getMeanRate();

  double getFiveMinuteRate();

  double getFifteenMinuteRate();

  void mark();

  void mark(long n);
}
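
One common way to obtain the 1, 5, and 15 minute rates is an exponentially weighted moving average (EWMA), which is also the technique Dropwizard's Meter relies on. The sketch below shows a single window under simplifying assumptions: the class name EwmaRate is hypothetical, the 5 second tick is driven manually instead of by a scheduler, and the class is not thread-safe.

```java
// Hypothetical helper: one EWMA window, ticked every 5 seconds.
final class EwmaRate {
  private static final double TICK_SECONDS = 5.0;

  private final double alpha;
  private long uncounted; // events marked since the last tick
  private double ratePerSecond;
  private boolean initialized;

  EwmaRate(int windowMinutes) {
    // Smoothing factor for a window of the given length.
    this.alpha = 1.0 - Math.exp(-TICK_SECONDS / (60.0 * windowMinutes));
  }

  void mark(long n) {
    uncounted += n;
  }

  // Must be called once per TICK_SECONDS; folds the latest interval into the average.
  void tick() {
    double instantRate = uncounted / TICK_SECONDS;
    uncounted = 0;
    if (initialized) {
      ratePerSecond += alpha * (instantRate - ratePerSecond);
    } else {
      // The first interval seeds the average.
      ratePerSecond = instantRate;
      initialized = true;
    }
  }

  double getRate() {
    return ratePerSecond;
  }
}

class RateSketchDemo {
  public static void main(String[] args) {
    EwmaRate oneMinute = new EwmaRate(1);
    // Two ticks with a steady 50 events per 5 seconds: 10 events per second.
    oneMinute.mark(50);
    oneMinute.tick();
    oneMinute.mark(50);
    oneMinute.tick();
    System.out.println(oneMinute.getRate()); // prints 10.0
  }
}
```

A Rate implementation along these lines would hold three such windows (1, 5, and 15 minutes) plus a total count and start time for getMeanRate().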

2.3.4 Histogram and HistogramSnapshot

HistogramSnapshot is a data holder for a histogram: it provides quantile (percentile) values and the array of sampled values, along with summary statistics such as the minimum, maximum, mean, and median.

public interface Histogram extends IMetric {
  void update(int value);

  void update(long value);

  long count();

  HistogramSnapshot takeSnapshot();
}



public interface HistogramSnapshot {
  double getValue(double quantile);

  long[] getValues();

  int size();

  double getMedian();

  long getMax();

  double getMean();

  long getMin();

  void dump(OutputStream output);
}
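
To make getValue(double quantile) concrete, here is a minimal sketch of a quantile lookup over the sampled values, using linear interpolation between the two nearest ranks. The class name QuantileSketch is hypothetical, and the interpolation rule is one common choice rather than a statement of how the adapters compute it.

```java
import java.util.Arrays;

final class QuantileSketch {
  // Returns the value at the given quantile (0.0 to 1.0), interpolating
  // linearly between the two nearest ranks of the sorted sample.
  static double getValue(long[] values, double quantile) {
    if (!(quantile >= 0.0 && quantile <= 1.0)) {
      throw new IllegalArgumentException(quantile + " is not in [0..1]");
    }
    if (values.length == 0) {
      return 0.0;
    }
    long[] sorted = values.clone();
    Arrays.sort(sorted);

    double pos = quantile * (sorted.length + 1);
    int index = (int) pos;
    if (index < 1) {
      return sorted[0];
    }
    if (index >= sorted.length) {
      return sorted[sorted.length - 1];
    }
    double lower = sorted[index - 1];
    double upper = sorted[index];
    return lower + (pos - Math.floor(pos)) * (upper - lower);
  }
}

class QuantileSketchDemo {
  public static void main(String[] args) {
    long[] latencies = {5, 1, 3, 2, 4};
    System.out.println(QuantileSketch.getValue(latencies, 0.5)); // prints 3.0
    System.out.println(QuantileSketch.getValue(latencies, 0.75)); // prints 4.5
  }
}
```

With such a helper, getMedian() is simply getValue(0.5).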

2.3.5 Timer

Timer records both a histogram of durations and the rate at which events occur (it combines Rate and Histogram).

public interface Timer extends IMetric {
  void update(long duration, TimeUnit unit);

  default void updateMillis(long durationMillis) {
    update(durationMillis, TimeUnit.MILLISECONDS);
  }

  default void updateMicros(long durationMicros) {
    update(durationMicros, TimeUnit.MICROSECONDS);
  }

  default void updateNanos(long durationNanos) {
    update(durationNanos, TimeUnit.NANOSECONDS);
  }

  HistogramSnapshot takeSnapshot();

  Rate getImmutableRate();
}
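
A typical way to feed a Timer is to wrap the measured code with System.nanoTime() and report the elapsed time. The sketch below assumes a cut-down interface (SimpleTimer) and a hypothetical implementation (TotalTimeTimer) that only accumulates a count and a total, so that it stays self-contained; neither name comes from IoTDB.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Cut-down version of the Timer interface (snapshot and rate accessors omitted).
interface SimpleTimer {
  void update(long duration, TimeUnit unit);

  default void updateMillis(long durationMillis) {
    update(durationMillis, TimeUnit.MILLISECONDS);
  }

  default void updateNanos(long durationNanos) {
    update(durationNanos, TimeUnit.NANOSECONDS);
  }
}

// Hypothetical implementation that records only the call count and total time.
final class TotalTimeTimer implements SimpleTimer {
  private final AtomicLong count = new AtomicLong();
  private final AtomicLong totalNanos = new AtomicLong();

  @Override
  public void update(long duration, TimeUnit unit) {
    count.incrementAndGet();
    totalNanos.addAndGet(unit.toNanos(duration));
  }

  long count() {
    return count.get();
  }

  long totalNanos() {
    return totalNanos.get();
  }
}

class TimerSketchDemo {
  // Runs the task and records its duration, even if the task throws.
  static void timed(SimpleTimer timer, Runnable task) {
    long start = System.nanoTime();
    try {
      task.run();
    } finally {
      timer.updateNanos(System.nanoTime() - start);
    }
  }

  public static void main(String[] args) {
    TotalTimeTimer timer = new TotalTimeTimer();
    timed(timer, () -> Math.sqrt(42.0));
    System.out.println(timer.count() + " call(s), " + timer.totalNanos() + " ns in total");
  }
}
```

System.nanoTime() is used instead of System.currentTimeMillis() because it is monotonic and has finer resolution for short operations.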


2.4 MetricManager

  MetricManager provides interfaces for creating, deleting, updating, and querying metrics and for managing MetricReporter, and it exposes a switch for data acquisition.

public interface MetricManager {

  Counter getOrCreateCounter(String metric, String... tags);

  Gauge getOrCreatGauge(String metric, String... tags);

  Rate getOrCreatRate(String metric, String... tags);

  Histogram getOrCreateHistogram(String metric, String... tags);

  Timer getOrCreateTimer(String metric, String... tags);

  void count(int delta, String metric, String... tags);

  void count(long delta, String metric, String... tags);

  void gauge(int value, String metric, String... tags);

  void gauge(long value, String metric, String... tags);

  void rate(int value, String metric, String... tags);

  void rate(long value, String metric, String... tags);

  void histogram(int value, String metric, String... tags);

  void histogram(long value, String metric, String... tags);

  void timer(long delta, TimeUnit timeUnit, String metric, String... tags);

  void removeCounter(String metric, String... tags);

  void removeGauge(String metric, String... tags);

  void removeRate(String metric, String... tags);

  void removeHistogram(String metric, String... tags);

  void removeTimer(String metric, String... tags);

  List<String[]> getAllMetricKeys();

  Map<String[], Counter> getAllCounters();

  Map<String[], Gauge> getAllGauges();

  Map<String[], Rate> getAllRates();

  Map<String[], Histogram> getAllHistograms();

  Map<String[], Timer> getAllTimers();

  boolean isEnable();

  void enableKnownMetric(KnownMetric metric);

  boolean init();

  boolean stop();

  boolean startReporter(String reporterName);

  boolean stopReporter(String reporterName);

  void setReporter(MetricReporter metricReporter);

  String getName();
}
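
The getOrCreate, count, and remove methods all identify a metric by its name plus tags. Because Java arrays use identity-based equals() and hashCode(), a registry usually needs a value-based key. The following is a minimal sketch for counters only; the class name CounterRegistrySketch and the "|" separator are assumptions made for illustration, and AtomicLong stands in for the Counter interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class CounterRegistrySketch {
  private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

  // Flattens name and tags into one String so that lookups compare by value.
  // Assumes that names and tags never contain the '|' separator.
  static String key(String metric, String... tags) {
    return metric + "|" + String.join("|", tags);
  }

  // Returns the existing counter for this name and tags, or creates it atomically.
  AtomicLong getOrCreateCounter(String metric, String... tags) {
    return counters.computeIfAbsent(key(metric, tags), k -> new AtomicLong());
  }

  void count(long delta, String metric, String... tags) {
    getOrCreateCounter(metric, tags).addAndGet(delta);
  }

  void removeCounter(String metric, String... tags) {
    counters.remove(key(metric, tags));
  }

  int size() {
    return counters.size();
  }
}

class CounterRegistryDemo {
  public static void main(String[] args) {
    CounterRegistrySketch registry = new CounterRegistrySketch();
    registry.count(2, "points", "sg", "root.sg1");
    registry.count(3, "points", "sg", "root.sg1");
    System.out.println(registry.getOrCreateCounter("points", "sg", "root.sg1").get()); // prints 5
  }
}
```

computeIfAbsent makes the get-or-create step atomic, so two threads asking for the same metric at the same time receive the same counter.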

2.5 MetricReporter

MetricReporter is the interface for pushing data to external systems.

public interface MetricReporter {
  boolean start();

  boolean start(String reporter);

  boolean stop();

  boolean stop(String reporter);

  String getName();
}
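
As a rough idea of what a reporter behind this interface could look like, the sketch below periodically prints one value to the console. Everything in it is an assumption for illustration: the class name ConsoleReporterSketch, the LongSupplier data source, and the convention that start() and stop() return false when there is nothing to do.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

final class ConsoleReporterSketch {
  private final String name;
  private final LongSupplier source;
  private final long periodMillis;
  private ScheduledExecutorService scheduler;

  ConsoleReporterSketch(String name, LongSupplier source, long periodMillis) {
    this.name = name;
    this.source = source;
    this.periodMillis = periodMillis;
  }

  // Starts periodic reporting; returns false if the reporter is already running.
  synchronized boolean start() {
    if (scheduler != null) {
      return false;
    }
    scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(
        () -> System.out.println(name + "=" + source.getAsLong()),
        0,
        periodMillis,
        TimeUnit.MILLISECONDS);
    return true;
  }

  // Stops reporting; returns false if the reporter was not running.
  synchronized boolean stop() {
    if (scheduler == null) {
      return false;
    }
    scheduler.shutdownNow();
    scheduler = null;
    return true;
  }

  String getName() {
    return name;
  }
}

class ConsoleReporterDemo {
  public static void main(String[] args) throws InterruptedException {
    ConsoleReporterSketch reporter =
        new ConsoleReporterSketch("uptimeMillis", System::currentTimeMillis, 100);
    reporter.start();
    Thread.sleep(350);
    reporter.stop();
  }
}
```

A real reporter for Prometheus, JMX, or IoTDB would replace the println call with the corresponding push or export logic.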

3. Test Report

We implemented the monitoring framework on top of Dropwizard and Micrometer respectively; the test results are as follows:

3.1 Test Environment

Processor: Intel(R) Core(TM) i7-1065G7 CPU

RAM: 32 GB

3.2 Test Metrics

  We use a single thread to create counters and run the test cases separately on the two frameworks, Micrometer and Dropwizard. The test metrics are as follows:

  1. memory : Memory usage in MB.
  2. create : The time required to create, in ms.
  3. searchInorder : The time required for the sequential query, in ms.
  4. searchDisorder : The time required for random queries in ms.
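
This page does not state how the memory figure was obtained. One common approach, shown here purely as an assumption, is to read the used heap from Runtime before and after the test; the timing helper uses System.nanoTime() for the same kind of elapsed-time measurement. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class BenchmarkProbeSketch {
  // Used heap in MB (total heap currently reserved minus the free part of it).
  static double usedMegabytes() {
    Runtime runtime = Runtime.getRuntime();
    return (runtime.totalMemory() - runtime.freeMemory()) / (1024.0 * 1024.0);
  }

  // Elapsed wall-clock time of the task in milliseconds, using the monotonic clock.
  static long elapsedMillis(Runnable task) {
    long start = System.nanoTime();
    task.run();
    return (System.nanoTime() - start) / 1_000_000L;
  }
}

class BenchmarkProbeDemo {
  public static void main(String[] args) {
    double before = BenchmarkProbeSketch.usedMegabytes();
    long[] data = new long[1_000_000];
    long millis = BenchmarkProbeSketch.elapsedMillis(() -> Arrays.fill(data, 1L));
    double after = BenchmarkProbeSketch.usedMegabytes();
    System.out.println("memory delta (MB): " + (after - before));
    System.out.println("elapsed (ms): " + millis);
  }
}
```

Heap readings taken this way are affected by garbage collection, so they are best treated as approximate.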

3.3 Test parameters

  1. metric : The metric under test.
  2. name : The name of the test metric, padded to a uniform length.
  3. tag : The tag of the test metric, padded to a uniform length.
  4. metricNumberTotal : The number of metrics tested.
  5. tagSingleNumber : The number of tags on each test metric.
  6. tagTotalNumber : The size of the tag pool, 1000 by default; all tags are drawn from this pool.
  7. searchNumber : The number of queries, 1000000 by default.
  8. loop : The number of query loops, 10 by default.

3.4 Test Result

3.5 Test Script

3.5.1 Test

Test holds a MetricManager and runs the individual test cases.

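import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class Test {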
    private Integer metricNumberTotal;
    private Integer metricNameNumberLength;
    private Integer tagTotalNumber;
    private Integer tagSingleNumber;
    private Integer searchNumber;
    private String[] TAGS;
    private static Random random = new Random(43);
    private static MetricManager metricManager = MetricService.getMetricManager();
    private static Map<String, String[]> name2Tags = new HashMap<>();

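    /**
     * @param metricNumber number of metrics to create
     * @param tagTotalNumber size of the tag pool
     * @param tagSingleNumber number of tags attached to each metric
     * @param searchNumber number of lookups per search run
     */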
    Test(Integer metricNumber, Integer tagTotalNumber, Integer tagSingleNumber
            , Integer searchNumber){
        this.metricNumberTotal = metricNumber;
        this.metricNameNumberLength = String.valueOf(metricNumberTotal).length();
        this.tagTotalNumber = tagTotalNumber;
        this.tagSingleNumber = tagSingleNumber;
        this.searchNumber = searchNumber;
        TAGS = new String[tagTotalNumber];
        for(int i = 0; i < tagTotalNumber; i++){
            TAGS[i] = initTag(i);
        }
    }

    /**
     * generate tags for metric
     * @param number
     * @return
     */
    private String initTag(Integer number){
        StringBuilder stringBuilder = new StringBuilder(String.valueOf(number));
        while(stringBuilder.length() < 3){
            stringBuilder.insert(0, '0');
        }
        stringBuilder.insert(0, "Tag");
        return stringBuilder.toString();
    }

    /**
     * generate name for metric
     * @param number
     * @return
     */
    private String generateName(Integer number){
        StringBuilder stringBuilder = new StringBuilder(String.valueOf(number));
        while(stringBuilder.length() < metricNameNumberLength){
            stringBuilder.insert(0, '0');
        }
        stringBuilder.insert(0, "counter");
        return stringBuilder.toString();
    }

    /**
     * generate tags of a metric
     * @return
     */
    private String[] generateTags(){
        List<Integer> targets = new ArrayList<>();
        while(targets.size() < tagSingleNumber){
            Integer target = generateRandom(tagTotalNumber);
            if(!targets.contains(target)){
                targets.add(target);
            }
        }
        String[] tags = new String[tagSingleNumber];
        for(int i = 0; i < tagSingleNumber; i++){
            tags[i] = TAGS[targets.get(i)];
        }
        return tags;
    }

    /**
     * generate next int
     * @param max
     * @return
     */
    private Integer generateRandom(Integer max){
        return random.nextInt(max);
    }

    /**
     * create metric in order
     * @return
     */
    public long createMetricInorder(){
        long total = 0;
        for(int i = 0; i < metricNumberTotal; i++){
            String name = generateName(i);
            String[] tags = generateTags();
            long start = System.currentTimeMillis();
            metricManager.getOrCreateCounter(name, tags);
            long stop = System.currentTimeMillis();
            total += (stop - start);
            name2Tags.put(name, tags);
        }
        return total;
    }

    /**
     * search metric in order
     * @return
     */
    public long searchMetricInorder(){
        long total = 0;
        for(int i = 0; i < searchNumber; i++){
            total += searchOne(i);
        }
        return total;
    }

    /**
     * search metric in random way
     * @return
     */
    public long searchMetricDisorder(){
        long total = 0;
        for(int i = 0; i < searchNumber; i++){
            // nextInt(bound) excludes the bound, so pass the full count to cover every metric
            total += searchOne(generateRandom(metricNumberTotal));
        }
        return total;
    }

    private long searchOne(Integer target) {
        String name = generateName(target % metricNumberTotal);
        String[] tags = name2Tags.get(name);
        long start = System.currentTimeMillis();
        metricManager.getOrCreateCounter(name, tags);
        long stop = System.currentTimeMillis();
        return stop - start;
    }

    @Override
    public String toString() {
        return metricNumberTotal +
                "," + tagTotalNumber +
                "," + tagSingleNumber +
                "," + searchNumber;
    }

    public void stop(){
        name2Tags.clear();
        metricManager.stop();
    }
}

3.5.2 TestPlan

TestPlan defines the parameter combinations, runs each test, and prints the statistics.

public class TestPlan {
    private static final Integer[] TAG_NUMBERS = {2, 4, 6, 8, 10};
    private static final Integer[] METRIC_NUMBERS = {1000, 10000, 50000, 100000, 500000, 1000000};
    private static final Integer LOOP = 10;
    private static final Integer tagTotalNumber = 1000;
    private static final Integer searchNumber = 100000;

    private static void test(Integer metric, Integer tag){
        Long[] times = {0L, 0L, 0L};
        Test test = new Test(metric, tagTotalNumber, tag, searchNumber);
        times[0] += test.createMetricInorder();
        for(int i = 0; i < LOOP; i ++){
            times[1] += test.searchMetricInorder();
            times[2] += test.searchMetricDisorder();
        }
        test.stop();
        System.out.println(metric + "," + tagTotalNumber + "," + tag + "," +
                searchNumber + "," + (times[0]) + "," +
                (times[1] * 1.0 / LOOP) + "," + (times[2] * 1.0 / LOOP));
    }

    public static void main(String[] args) {
        System.setProperty("METRIC_CONF", "path of yml");
        for(Integer metric: METRIC_NUMBERS){
            for(Integer tag: TAG_NUMBERS){
                test(metric, tag);
            }
        }
    }
}

4. Dropwizard Unit Test Results

To ensure the reliability of the features, we unit tested DropwizardMetricManager, covering its main functions. To reproduce the test, modify the path of the yml configuration file in the init() method (the configuration file is stored under the conf directory of the metrics module). The final result of the test is shown in the figure below.

5. Dropwizard connects to Prometheus via PushGateway

5.1 Experimental process

This test was run with the PrometheusRunTest script shown below.

import java.util.concurrent.TimeUnit;

public class PrometheusRunTest {
  public MetricManager metricManager = MetricService.getMetricManager();

  public static void main(String[] args) throws InterruptedException {
    System.setProperty("line.separator", "\n");
    System.setProperty("METRIC_CONF", "path of yml");
    PrometheusRunTest prometheusRunTest = new PrometheusRunTest();
    Counter counter = prometheusRunTest.metricManager.getOrCreateCounter("counter");
    while (true) {
      counter.inc();
      TimeUnit.SECONDS.sleep(1);
    }
  }
}

The configuration of the parameters for Prometheus is completed in the configuration file (yml file) used by the script, as follows:

prometheusReporterConfig:
    prometheusExporterUrl: http://localhost
    prometheusExporterPort: 9091

With this script, Dropwizard tracks a counter that increases by 1 every second, and updates to all metrics are pushed to the configured Pushgateway, where Prometheus can scrape them.
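
To show what such a push looks like on the wire, the sketch below builds the target URL and the request body for one counter in the Prometheus text exposition format. It only constructs the strings and does not send anything. The class name, the method names, and the job name "iotdb" are hypothetical; the actual reporter presumably delegates this to a Prometheus client library.

```java
final class PushGatewayPayloadSketch {
  // Pushgateway accepts pushes (HTTP PUT or POST) at <base>:<port>/metrics/job/<job>.
  static String pushUrl(String baseUrl, int port, String job) {
    return baseUrl + ":" + port + "/metrics/job/" + job;
  }

  // Text exposition format: a TYPE comment followed by "<name> <value>",
  // with every line terminated by '\n'.
  static String counterBody(String name, long value) {
    return "# TYPE " + name + " counter\n" + name + " " + value + "\n";
  }
}

class PushGatewayPayloadDemo {
  public static void main(String[] args) {
    // URL and port are the values from the yml configuration; the job name is made up.
    System.out.println(PushGatewayPayloadSketch.pushUrl("http://localhost", 9091, "iotdb"));
    System.out.print(PushGatewayPayloadSketch.counterBody("counter", 3));
  }
}
```

The exposition format requires "\n" line endings, which is presumably why the test script sets line.separator to "\n" before starting.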

5.2 Experimental environment

Grafana runs on port 8081

Prometheus runs on port 9090

PushGateway runs on port 9091

5.3 Experimental result

