Discussion thread
Vote thread: TBD
Issue: https://github.com/apache/incubator-paimon/issues/742
Release: TBD


Motivation

Paimon currently ships without built-in monitoring or metrics. In production, users need to know how their Paimon tables behave: how long a commit takes, how many files each commit adds or deletes, the status of compaction operations, the duration of scan queries, the lag of streaming reads, and so on. We therefore need to introduce a metrics system and an initial set of metrics for Paimon.

Public Interfaces

Measurable metric interfaces

Metric

Metric is the base interface for all measurable metrics; it marks a class as a metric.

public interface Metric {}

Gauge

Gauge is a metric interface that provides a value of any type at a point in time.

/** Gauge calculates a specific value at a point in time. */
@Public
public interface Gauge<T> extends Metric {

    /**
     * Calculates and returns the measured value.
     *
     * @return calculated value
     */
    T getValue();
}
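
As a rough illustration of how a gauge might be backed by a value that the owning component already tracks, here is a minimal sketch. The `LastCommitDuration` class is hypothetical and not part of this proposal, and the interfaces are repeated locally (with `Gauge` assumed to extend `Metric`) only so that the snippet compiles on its own.

```java
import java.util.concurrent.atomic.AtomicLong;

// Local copies of the interfaces from this page so the sketch compiles on its own.
interface Metric {}

interface Gauge<T> extends Metric {
    T getValue();
}

// Hypothetical holder: exposes the duration of the most recent commit as a gauge.
class LastCommitDuration {
    private final AtomicLong lastDurationMs = new AtomicLong(0L);

    // Called by the committing code once a commit has finished.
    void update(long durationMs) {
        lastDurationMs.set(durationMs);
    }

    // The gauge reads the current value each time a reporter asks for it.
    Gauge<Long> asGauge() {
        return lastDurationMs::get;
    }
}
```

Because `Gauge` has a single method, a method reference or lambda is enough; the gauge itself stores nothing and always reflects the latest update.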

Counter

Counter is a metric interface that counts values by incrementing and decrementing.

/** A Counter is a metric measured by incrementing and decrementing. */
@Public
public interface Counter extends Metric {

    /** Increment the current count by 1. */
    void inc();

    /**
     * Increment the current count by the given value.
     *
     * @param n value to increment the current count by
     */
    void inc(long n);

    /** Decrement the current count by 1. */
    void dec();

    /**
     * Decrement the current count by the given value.
     *
     * @param n value to decrement the current count by
     */
    void dec(long n);

    /**
     * Returns the current count.
     *
     * @return current count
     */
    long getCount();
}
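
The proposal does not prescribe a Counter implementation. One possible thread-safe version, shown here only as a sketch, wraps an `AtomicLong`; the class name `SimpleCounter` is an assumption made for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Local copies of the interfaces from this page so the sketch compiles on its own.
interface Metric {}

interface Counter extends Metric {
    void inc();
    void inc(long n);
    void dec();
    void dec(long n);
    long getCount();
}

// Hypothetical implementation: all updates go through a single AtomicLong,
// so concurrent writers do not lose increments.
class SimpleCounter implements Counter {
    private final AtomicLong count = new AtomicLong();

    @Override
    public void inc() {
        count.incrementAndGet();
    }

    @Override
    public void inc(long n) {
        count.addAndGet(n);
    }

    @Override
    public void dec() {
        count.decrementAndGet();
    }

    @Override
    public void dec(long n) {
        count.addAndGet(-n);
    }

    @Override
    public long getCount() {
        return count.get();
    }
}
```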

Metrics

The Metrics class is the core of the metrics system. It holds a `MetricRegistry` and a container of `MetricsReporter` instances. When a Metrics instance is initialized, the MetricRegistry is instantiated and the metrics reporters are started.

Metrics reporters are configurable: users can plug in custom reporters, and Paimon will provide a JMX metrics reporter as the default.

public class Metrics {

    /** The registry that holds the metrics. */
    private final MetricRegistry registry;

    /** The metrics reporters container. */
    private final List<MetricsReporter> reporters;

    /**
     * Registers a metric with the MetricRegistry.
     *
     * @param name The name of the metric.
     * @param metric The metric to register.
     */
    public void registerMetrics(String name, Metric metric) {}
}

MetricRegistry

MetricRegistry is the class responsible for registering metrics and holds the metric containers. It provides a register method for each type of measurable metric; registering a metric puts it into the corresponding container.

public class MetricRegistry {

    /** Map of gauge metrics. */
    private final Map<String, Gauge<?>> gauges = new HashMap<>();

    /** Map of counter metrics. */
    private final Map<String, Counter> counters = new HashMap<>();

    /** Register gauge metric. */
    public void gauge(String name, Gauge<?> gauge) {}

    /** Register counter metric. */
    public void counter(String name, Counter counter) {}
}
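
The register methods above are left empty in this proposal. The sketch below shows one way they could populate the containers. The class name `SketchMetricRegistry`, the use of `ConcurrentHashMap`, the rejection of duplicate names, and the read-only accessors are all assumptions made for illustration; `Counter` is trimmed to `getCount()` to keep the snippet short.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Trimmed local copies of the interfaces from this page.
interface Metric {}

interface Gauge<T> extends Metric {
    T getValue();
}

interface Counter extends Metric {
    long getCount();
}

// Hypothetical registry: registering a metric puts it into the matching map.
class SketchMetricRegistry {
    private final Map<String, Gauge<?>> gauges = new ConcurrentHashMap<>();
    private final Map<String, Counter> counters = new ConcurrentHashMap<>();

    void gauge(String name, Gauge<?> gauge) {
        // Assumed policy: a name can be registered only once.
        if (gauges.putIfAbsent(name, gauge) != null) {
            throw new IllegalArgumentException("Gauge already registered: " + name);
        }
    }

    void counter(String name, Counter counter) {
        if (counters.putIfAbsent(name, counter) != null) {
            throw new IllegalArgumentException("Counter already registered: " + name);
        }
    }

    // Read-only views that a reporter could iterate over.
    Map<String, Gauge<?>> getGauges() {
        return Collections.unmodifiableMap(gauges);
    }

    Map<String, Counter> getCounters() {
        return Collections.unmodifiableMap(counters);
    }
}
```

Whether a duplicate name should throw, overwrite, or be ignored is a design decision this proposal leaves open; the sketch throws so that naming collisions surface early.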

MetricsReporter

MetricsReporter is used to report metrics to an external backend. Paimon will provide an out-of-the-box JMX implementation of `MetricsReporter`.

public interface MetricsReporter {

    /** Configure reporter after instantiating it. */
    void open();

    /** Closes this reporter. */
    void close();

    /** Report the current measurements. This method is called periodically by the Metrics. */
    void report();
}
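
To make the reporter contract concrete, here is a minimal sketch of a custom reporter that formats counters as `name=value` lines and keeps them in memory. It is not the JMX reporter described above. The `LineReporter` class and the way it receives the counter map (through its constructor) are assumptions, since this proposal does not define how a reporter obtains the registered metrics.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Trimmed local copies of the interfaces from this page.
interface Counter {
    long getCount();
}

interface MetricsReporter {
    void open();

    void close();

    void report();
}

// Hypothetical reporter: collects "name=value" lines instead of calling a real backend.
class LineReporter implements MetricsReporter {
    private final Map<String, Counter> counters;
    private final List<String> lines = new ArrayList<>();
    private boolean opened;

    LineReporter(Map<String, Counter> counters) {
        this.counters = counters;
    }

    @Override
    public void open() {
        opened = true;
    }

    @Override
    public void close() {
        opened = false;
    }

    @Override
    public void report() {
        if (!opened) {
            return; // nothing is reported before open() or after close()
        }
        // Sort by name so the output order is stable.
        for (Map.Entry<String, Counter> e : new TreeMap<>(counters).entrySet()) {
            lines.add(e.getKey() + "=" + e.getValue().getCount());
        }
    }

    List<String> lines() {
        return lines;
    }
}
```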

Proposed Changes

Architecture

Metrics Registering

Take CommitMetrics as an example: CommitMetrics is instantiated by FileStoreCommitImpl, and the commit-related metrics are then registered through the MetricRegistry of the singleton Metrics instance.

Metrics holds a MetricRegistry instance and a set of MetricsReporters. MetricRegistry maintains the metric map containers. Registering a metric means putting the metric instance into the map container for its type (gauge or counter).

Update metrics value

CommitMetrics values are updated around the commit() operation. For example, the commit start time is recorded before the commit operation, and the CommitDuration value is recorded after the commit completes.

CompactionMetrics values are updated around the compaction operation, and ScanMetrics values are recorded during the scan planning process.
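
The update pattern described above (record the start time, run the operation, then store the elapsed time) could look roughly like the following sketch. `CommitTimingSketch` and its injectable clock are hypothetical names introduced for illustration; they are not part of FileStoreCommitImpl.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical wrapper that measures how long a commit action takes.
class CommitTimingSketch {
    private final AtomicLong lastCommitDurationMs = new AtomicLong();
    // The clock is injected so the timing logic can be exercised deterministically.
    private final LongSupplier clockMs;

    CommitTimingSketch(LongSupplier clockMs) {
        this.clockMs = clockMs;
    }

    void commit(Runnable commitAction) {
        long startMs = clockMs.getAsLong(); // recorded before the commit starts
        commitAction.run();
        // Recorded only after the commit has completed successfully.
        lastCommitDurationMs.set(clockMs.getAsLong() - startMs);
    }

    // Value that a commitDuration gauge would expose.
    long getLastCommitDurationMs() {
        return lastCommitDurationMs.get();
    }
}
```

In real code the clock could be something like `() -> System.nanoTime() / 1_000_000`, which is monotonic and therefore unaffected by wall-clock adjustments. Whether a failed commit should also update the duration is left open here.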

Report metrics to external backend

Each reporter instance has a timer task that periodically fetches metrics from the metric containers and reports them to the external backend. Paimon will ship a default reporter backed by JMX; users can define their own reporter by implementing the MetricsReporter interface. We can introduce a core option, metrics.reporter, to specify the metrics backend.
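
The periodic reporting described above can be sketched with a `ScheduledExecutorService`. For brevity this version drives all reporters from one shared timer rather than one timer per reporter. The `ReporterScheduler` class, the thread name, and the period parameter are assumptions; how `metrics.reporter` would be parsed into reporter instances is not shown.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Local copy of the interface from this page.
interface MetricsReporter {
    void open();

    void close();

    void report();
}

// Hypothetical scheduler that calls report() on every reporter at a fixed rate.
class ReporterScheduler implements AutoCloseable {
    private final List<MetricsReporter> reporters;
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(
                    runnable -> {
                        Thread t = new Thread(runnable, "metrics-reporter");
                        t.setDaemon(true); // do not keep the JVM alive for reporting
                        return t;
                    });

    ReporterScheduler(List<MetricsReporter> reporters) {
        this.reporters = reporters;
    }

    void start(long periodMs) {
        for (MetricsReporter reporter : reporters) {
            reporter.open();
        }
        executor.scheduleAtFixedRate(this::reportAll, periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    private void reportAll() {
        for (MetricsReporter reporter : reporters) {
            try {
                reporter.report();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so it is swallowed here.
            }
        }
    }

    @Override
    public void close() {
        executor.shutdownNow();
        for (MetricsReporter reporter : reporters) {
            reporter.close();
        }
    }
}
```

Catching exceptions inside the task matters because `scheduleAtFixedRate` suppresses subsequent executions once a run throws; a production reporter would likely log the failure instead of silently dropping it.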

Metrics list

We introduce CommitMetrics, CompactionMetrics, and ScanMetrics as metric sets that measure the statistics of Paimon table committing, compaction, and scanning.

Common metrics

CommitMetrics

public class CommitMetrics {
    private Metrics metrics;
    private static final String COMMIT_DURATION_METRIC = "commitDuration";
    ...
    private void registerCommitMetrics(Metrics metrics) {
        // Register through the Metrics entry point defined above.
        metrics.registerMetrics(COMMIT_DURATION_METRIC, new CommitDurationTimer());
        ...
    }
    ...
}

The CommitMetrics list includes the commit duration and counters for files, records, and so on.

| Metric Name | Description | Type | Unit | Update at |
| --- | --- | --- | --- | --- |
| commitDuration | Commit duration | Gauge | ms | Timer starts before the commit starts; the duration is updated after the commit finishes |
| numTableFilesAdded | Number of added table files in this commit | Counter | Number | Collecting changes from committables |
| numTableFilesDeleted | Number of deleted table files in this commit | Counter | Number | Collecting changes from committables |
| numTableFilesAppended | Number of appended table files in this commit | Counter | Number | Collecting changes from committables |
| numTableFilesCompacted | Number of compacted table files in this commit | Counter | Number | Collecting changes from committables |
| numChangelogFilesAppended | Number of appended changelog files in this commit | Counter | Number | Collecting changes from committables |
| numChangelogFileCompacted | Number of compacted changelog files in this commit | Counter | Number | Collecting changes from committables |
| numSnapshots | Number of snapshot files generated in this commit | Counter | Number | Trying to commit |
| numTotalRecords | Total records count in this commit | Counter | Number | Preparing snapshot file |
| numDeltaRecords | Delta records count in this commit | Counter | Number | Preparing snapshot file |
| numChangelogRecords | Changelog records count in this commit | Counter | Number | Preparing snapshot file |
| numPartitionsWritten | Number of partitions written in this commit | Counter | Number | Trying to commit |
| numBucketsWritten | Number of buckets written in this commit | Counter | Number | Trying to commit |

ScanMetrics

public class ScanMetrics {
    private Metrics metrics;
    private static final String SCAN_FILES_METRIC = "scanFiles";
    ...
    private void registerScanMetrics(Metrics metrics) {
        // Register through the Metrics entry point defined above.
        metrics.registerMetrics(SCAN_FILES_METRIC, new ScanFilesCounter());
        ...
    }
    ...
}

The ScanMetrics list includes the scan duration and counters for data files and manifest files.

| Metric Name | Description | Type | Unit | Update at |
| --- | --- | --- | --- | --- |
| scanDuration | Scan duration | Gauge | ms | Timer starts before planning starts; the duration is updated after planning finishes |
| numTotalManifests | Number of scanned manifest files | Counter | Number | Planning |
| numSkippedManifests | Number of skipped manifest files | Counter | Number | Planning |
| numResultTableFiles | Number of result table files | Counter | Number | Planning |

CompactionMetrics

public class CompactionMetrics {
    private Metrics metrics;
    private static final String COMPACTED_FILES_METRIC = "compactedFiles";
    ...
    private void registerCompactionMetrics(Metrics metrics) {
        // Register through the Metrics entry point defined above.
        metrics.registerMetrics(COMPACTED_FILES_METRIC, new CompactedFilesCounter());
        ...
    }
    ...
}

The CompactionMetrics list includes the compaction duration and counters for files, sorted runs, and so on.

| Metric Name | Description | Type | Unit | Update at |
| --- | --- | --- | --- | --- |
| compactionDuration | Compaction duration | Gauge | ms | Timer starts before compaction; the duration is updated after compaction finishes |
| numFilesCompactedBefore | Number of deleted files in compaction | Counter | Number | Triggering compaction |
| numFilesCompactedAfter | Number of added files in compaction | Counter | Number | Triggering compaction |
| numChangelogFilesCompacted | Number of changelog files compacted | Counter | Number | Triggering compaction |
| numSortedRuns | Number of sorted runs | Counter | Number | Triggering compaction |
| numLevel0Files | Number of files at level 0 | Counter | Number | Triggering compaction |

Flink connector metrics

Implement the important source metrics defined in FLIP-33.

| Metric Name | Description | Type | Unit | Update at |
| --- | --- | --- | --- | --- |
| numBytesIn | The total number of input bytes since the source started. | Counter | Number | FileStoreSourceSplitReader fetch finished. |
| pendingRecords | The number of records that have not been fetched by the source. | Gauge | Number | |


Compatibility, Deprecation, and Migration Plan

There are no changes to the existing public interfaces and no impact on existing users.

Test Plan


Rejected Alternatives
