As a monitoring platform, Eagle is responsible not only for monitoring cluster and node health, but also for monitoring the applications (jobs) running on the cluster.

The following are some common job monitoring use cases on the Hadoop platform:

1) Job security monitoring: does a job perform malicious data operations, such as accessing confidential data or deleting large amounts of data?

2) Job performance: does a job run slower this time than in its historical runs? Does the job have a data skew issue that causes one task to run much slower than the others?

 

To meet the above requirements, we designed the Eagle Storm running job spout, which initially supports the job security monitoring use case.

The "running" in running job spout does not mean that only running jobs are monitored. Here "running" means "realtime": the spout also collects information about completed jobs if it failed to catch them before they finished, for example because of a Storm worker crash.

In addition, ZooKeeper stores the list of already processed jobs. Combined with the Storm ACK mechanism, this lets the running job spout deliver at-least-once semantics.
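
The interplay between the processed job list and acknowledgements can be sketched as follows. This is only an illustration under stated assumptions, not Eagle's implementation: ProcessedJobStore is a hypothetical in-memory stand-in for the ZooKeeper-backed list, and poll/ack/fail merely mimic the shape of Storm's spout callbacks.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory stand-in for the ZooKeeper-backed list of processed jobs.
class ProcessedJobStore {
    private final Set<String> processed = new HashSet<>();

    boolean isProcessed(String jobId) { return processed.contains(jobId); }

    void markProcessed(String jobId) { processed.add(jobId); }
}

// Spout-like sketch: a job is recorded as processed only after it has been acked,
// so a failed or lost job is emitted again on a later polling round.
class AtLeastOnceJobEmitter {
    private final ProcessedJobStore store;
    private final Set<String> pending = new HashSet<>();   // emitted, not yet acked
    private final List<String> emitted = new ArrayList<>(); // emission log, for inspection

    AtLeastOnceJobEmitter(ProcessedJobStore store) { this.store = store; }

    // Called on each polling round with the job ids currently reported by the cluster.
    void poll(List<String> jobIds) {
        for (String jobId : jobIds) {
            if (store.isProcessed(jobId) || pending.contains(jobId)) {
                continue; // already handled, or still in flight
            }
            pending.add(jobId);
            emitted.add(jobId);
        }
    }

    // Downstream processing succeeded: persist the job id as processed.
    void ack(String jobId) {
        if (pending.remove(jobId)) {
            store.markProcessed(jobId);
        }
    }

    // Downstream processing failed: forget the in-flight state so the job is re-emitted.
    void fail(String jobId) { pending.remove(jobId); }

    List<String> emittedLog() { return emitted; }
}
```

Because the store is updated only on ack, a worker that crashes between emit and ack leaves the job unrecorded, and a restarted emitter sharing the same store emits it again. Duplicates are therefore possible, which is the trade-off of at-least-once delivery.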

 

The Eagle running job spout collects the following data, which also outlines its workflow:

1) Running/Completed Job List

2) Job Detail Info

3) Job Configuration Info

4) Job Counters

 

Running Job Spouts Design

The following are some of the interfaces used in the design:

ResourceFetcher
public interface ResourceFetcher {

   List<Object> getResource(JobConstants.ResourceType resourceType, Object... parameter) throws Exception;

}
ServiceURLBuilder
public interface ServiceURLBuilder {
   String build(String ... parameters);
}
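
As an illustration of how ServiceURLBuilder might be implemented, the sketch below builds the URL of the YARN ResourceManager application list REST endpoint. The class name JobListServiceURLBuilder and the parameter layout (base URL first, application state second) are assumptions made for this example, not taken from the Eagle code base.

```java
// Interface as listed above, repeated so that the sketch compiles on its own.
interface ServiceURLBuilder {
    String build(String... parameters);
}

// Hypothetical builder for the ResourceManager application list endpoint.
// Assumed layout: parameters[0] = ResourceManager base URL, parameters[1] = application state.
class JobListServiceURLBuilder implements ServiceURLBuilder {
    @Override
    public String build(String... parameters) {
        if (parameters == null || parameters.length < 2) {
            throw new IllegalArgumentException("expected <rmBaseUrl> and <state>");
        }
        // Drop a trailing slash so the path is not joined with a double slash.
        String base = parameters[0].endsWith("/")
                ? parameters[0].substring(0, parameters[0].length() - 1)
                : parameters[0];
        return base + "/ws/v1/cluster/apps?states=" + parameters[1];
    }
}
```

Keeping URL construction behind a small interface means each resource type can have its own builder, while the fetching code stays unaware of endpoint paths.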
RunningJobCallback
/**
 * Callback invoked when running job information is ready.
 */
public interface RunningJobCallback extends Serializable {

   /**
    * Called when a running job resource is ready.
    * @param jobContext context of the job the resource belongs to
    * @param type type of the resource that was fetched
    * @param objects the fetched resource objects
    */
   void onJobRunningInformation(JobContext jobContext, JobConstants.ResourceType type, List<Object> objects);
}
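
One way ResourceFetcher and RunningJobCallback could cooperate to collect the four kinds of data listed earlier is sketched below. JobConstants, its ResourceType values, JobContext and JobCrawlRound are hypothetical stand-ins defined only so the example compiles on its own. The real Eagle classes may differ.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for Eagle's JobConstants and JobContext.
class JobConstants {
    enum ResourceType { JOB_LIST, JOB_DETAIL, JOB_CONFIGURATION, JOB_COUNTERS }
}

class JobContext {
    final String jobId;

    JobContext(String jobId) { this.jobId = jobId; }
}

// Interfaces as listed above, repeated so that the sketch compiles on its own.
interface ResourceFetcher {
    List<Object> getResource(JobConstants.ResourceType resourceType, Object... parameter) throws Exception;
}

interface RunningJobCallback extends Serializable {
    void onJobRunningInformation(JobContext jobContext, JobConstants.ResourceType type, List<Object> objects);
}

// One polling round: fetch the job list, then detail, configuration and counters per job.
class JobCrawlRound {
    static void run(ResourceFetcher fetcher, RunningJobCallback callback) throws Exception {
        for (Object job : fetcher.getResource(JobConstants.ResourceType.JOB_LIST)) {
            JobContext context = new JobContext(String.valueOf(job));
            for (JobConstants.ResourceType type : Arrays.asList(
                    JobConstants.ResourceType.JOB_DETAIL,
                    JobConstants.ResourceType.JOB_CONFIGURATION,
                    JobConstants.ResourceType.JOB_COUNTERS)) {
                // Hand each fetched resource to the callback as soon as it is available.
                callback.onJobRunningInformation(context, type, fetcher.getResource(type, context.jobId));
            }
        }
    }
}
```

In this sketch the callback is the only point where downstream logic sees the data, so the fetching order can change without touching consumers.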
HAURLSelector
public interface HAURLSelector {
   
   boolean checkUrl(String url);
      
   void reSelectUrl() throws IOException;
   
   String getSelectedUrl();
}
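
A minimal sketch of an HAURLSelector implementation follows, assuming a fixed list of candidate URLs (for example, the active and standby ResourceManager addresses). FirstAvailableURLSelector is a hypothetical name, and the health probe is injected as a Predicate so the example needs no network access. A real selector would presumably issue an HTTP request inside checkUrl.

```java
import java.io.IOException;
import java.util.List;
import java.util.function.Predicate;

// Interface as listed above, repeated so that the sketch compiles on its own.
interface HAURLSelector {
    boolean checkUrl(String url);

    void reSelectUrl() throws IOException;

    String getSelectedUrl();
}

// Hypothetical selector that keeps the first candidate URL passing the health probe.
class FirstAvailableURLSelector implements HAURLSelector {
    private final List<String> candidates;
    private final Predicate<String> probe; // injected health check (assumption of this sketch)
    private volatile String selected;

    FirstAvailableURLSelector(List<String> candidates, Predicate<String> probe) {
        this.candidates = candidates;
        this.probe = probe;
        // Start with the first candidate; callers re-select when it stops responding.
        this.selected = candidates.isEmpty() ? null : candidates.get(0);
    }

    @Override
    public boolean checkUrl(String url) {
        return url != null && probe.test(url);
    }

    @Override
    public void reSelectUrl() throws IOException {
        for (String url : candidates) {
            if (checkUrl(url)) {
                selected = url;
                return;
            }
        }
        throw new IOException("no available URL among " + candidates);
    }

    @Override
    public String getSelectedUrl() {
        return selected;
    }
}
```

A caller would typically test checkUrl(getSelectedUrl()) before each request and invoke reSelectUrl() only when that check fails, so a healthy endpoint is not probed more than necessary.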

 

 
