
Status


Current state: Proposed

Motivation

The JIRA and the related discussions capture various requirements for gathering metrics during the gateway’s processing pipeline and request/response flow. The initial attempt to satisfy some of these requirements is to provide a very simple abstraction that hides the details of the dropwizard metrics library and exposes some of the basic and most requested metrics. As a side note, this KIP and the associated work will not cover all of the requirements in KNOX-643. The following requirements are going to be addressed:
  1. Ability to get the time taken for requests/responses coming from the client, and their frequency over various time intervals, at the service level.
  2. Ability to get the time and frequency of requests/responses to the backend service component.
  3. The number of open connections to the backend service component.
  4. API to add/extend the metrics capabilities.
  5. Ability to report the metrics to reporting engines like Graphite and Ambari Metrics Service.


Design

The dropwizard metrics library was selected after a comparative analysis of similar libraries that allow code to be instrumented so that metrics can be gathered at runtime. This document leaves out that analysis and simply asserts the result: the dropwizard metrics library had the easiest-to-use API that provided the functionality we were looking for, had appropriate licensing, and was the most frequently used in other Apache projects.
The desire, however, is as always to provide a layer of abstraction, leaving the possibility open for future changes or the adoption of other libraries. The pattern used in the API design is to provide instrumented versions of a class or an interface, without yet exposing the detailed measuring instruments such as gauges, meters, etc.
The MetricsService API therefore looks like this:

 

public interface MetricsService extends Service {

  <T> T getInstrumented(T instanceClass);

  <T> T getInstrumented(Class<T> clazz);

}
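
To make the "instrumented version" pattern concrete, here is a minimal sketch that uses only the JDK. It is not the Knox implementation and does not use dropwizard: it assumes a JDK dynamic proxy as the wrapping mechanism, and the names CallStats, BackendCall and instrument are hypothetical. The caller keeps working against its own interface while counts and timings are collected behind it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper (not a Knox class): hands out wrapped instances and keeps their numbers.
final class CallStats {
  private final Map<String, AtomicLong> counts = new ConcurrentHashMap<>();
  private final Map<String, AtomicLong> nanos = new ConcurrentHashMap<>();

  // Returns a proxy for the given interface that counts and times every call made through it.
  <T> T instrument(Class<T> iface, T target) {
    InvocationHandler handler = (proxy, method, args) -> {
      long start = System.nanoTime();
      try {
        return method.invoke(target, args);
      } catch (InvocationTargetException e) {
        throw e.getCause(); // surface the target's own exception, not the reflection wrapper
      } finally {
        String key = iface.getSimpleName() + "." + method.getName();
        counts.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
        nanos.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(System.nanoTime() - start);
      }
    };
    Object wrapped = Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    return iface.cast(wrapped);
  }

  // Number of calls recorded under a key such as "BackendCall.send".
  long count(String key) {
    AtomicLong value = counts.get(key);
    return value == null ? 0L : value.get();
  }

  // Cumulative time, in nanoseconds, spent in calls recorded under the key.
  long totalNanos(String key) {
    AtomicLong value = nanos.get(key);
    return value == null ? 0L : value.get();
  }

  public static void main(String[] args) {
    CallStats stats = new CallStats();
    BackendCall plain = request -> "ok:" + request;
    BackendCall measured = stats.instrument(BackendCall.class, plain);
    measured.send("a");
    measured.send("b");
    System.out.println("calls=" + stats.count("BackendCall.send")); // prints calls=2
  }
}

// Hypothetical stand-in for something worth measuring, e.g. a call to a backend service.
interface BackendCall {
  String send(String request);
}
```

A dynamic proxy only works for interfaces, so an instrumented version of a concrete class would need a different mechanism, such as a hand-written subclass or decorator.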

 


The MetricsService is implemented as a Gateway Service, the details of which can be found in the dev guide. It is therefore accessible to all topology deployments, so metrics can be gathered per topology and, of course, aggregated at the gateway level as well.

Plugging in a new Instrumented Class



Reporting

public interface MetricsReporter {

  String getName();

  void init(GatewayConfig config) throws MetricsReporterException;

  void start(MetricsContext metricsContext) throws MetricsReporterException;

  void stop() throws MetricsReporterException;

  boolean isEnabled();

}
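
As an illustration of the lifecycle this interface implies (init, then start, then stop, guarded by isEnabled), here is a minimal self-contained sketch. Everything in it is an assumption made for the example: the Stub* types are simplified stand-ins for GatewayConfig, MetricsContext and MetricsReporterException rather than the real Knox classes, the property name is made up, and the report() method is an extra hook that is not part of MetricsReporter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Walks one reporter through the assumed lifecycle: init, start, report, stop.
final class ReporterLifecycleDemo {
  public static void main(String[] args) {
    InMemoryReporter reporter = new InMemoryReporter();
    reporter.init(new StubGatewayConfig(Map.of(InMemoryReporter.ENABLED_KEY, "true")));
    StubMetricsContext context = new StubMetricsContext();
    context.counters.put("requests", 3L);
    if (reporter.isEnabled()) {
      reporter.start(context);
      reporter.report();
      reporter.stop();
    }
    System.out.println(reporter.lines); // prints [requests=3]
  }
}

// Stand-ins for the Knox types named in MetricsReporter (assumptions, not the real classes).
final class StubGatewayConfig {
  final Map<String, String> properties;
  StubGatewayConfig(Map<String, String> properties) { this.properties = properties; }
}

final class StubMetricsContext {
  final Map<String, Long> counters = new TreeMap<>(); // metric name -> current value
}

class StubReporterException extends Exception {
  StubReporterException(String message) { super(message); }
}

// Same method shape as MetricsReporter, expressed over the stand-in types.
interface StubMetricsReporter {
  String getName();
  void init(StubGatewayConfig config) throws StubReporterException;
  void start(StubMetricsContext metricsContext) throws StubReporterException;
  void stop() throws StubReporterException;
  boolean isEnabled();
}

// Hypothetical reporter that collects formatted lines in memory instead of sending them anywhere.
final class InMemoryReporter implements StubMetricsReporter {
  static final String ENABLED_KEY = "example.reporter.enabled"; // made-up property name
  final List<String> lines = new ArrayList<>();
  private boolean enabled;
  private StubMetricsContext context;

  @Override public String getName() { return "in-memory"; }

  // Reads the on/off switch from configuration; disabled unless explicitly enabled.
  @Override public void init(StubGatewayConfig config) {
    enabled = Boolean.parseBoolean(config.properties.getOrDefault(ENABLED_KEY, "false"));
  }

  // Keeps a reference to the metrics source, but only when the reporter is enabled.
  @Override public void start(StubMetricsContext metricsContext) {
    if (enabled) {
      context = metricsContext;
    }
  }

  // Not part of the interface: snapshots the counters once. A real reporter would run on a schedule.
  void report() {
    if (context == null) {
      return;
    }
    context.counters.forEach((name, value) -> lines.add(name + "=" + value));
  }

  @Override public void stop() { context = null; }

  @Override public boolean isEnabled() { return enabled; }
}
```

Keeping the enabled check inside the reporter lets the gateway start every configured reporter uniformly; whether Knox does this or checks isEnabled before calling start is not stated here.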


Plugging in a new Reporter



Graphite+Grafana


Config



Future work

From the list of requirements, one of the glaring holes is getting more metrics out of Knox’s Shiro/LDAP provider. I believe this requirement comes more from a debugging mindset when faced with issues in the field, but it may have broader appeal, so it needs some validation.
The other main task, possibly a near-future item, is to provide additional reporters. Of particular interest would be a reporter that sends data to the Ambari Metrics Service. This would provide a convenient way to view the metrics in a Hadoop deployment that has Ambari available.