...
Name | Alert Message | Description | Threshold | |
---|---|---|---|---|
Capacity Remaining | There is little or no space capacity remaining in HDFS. | Gives warning/critical alert if percentage of available space on all HDFS nodes together is less than upper/lower threshold. | 30-Warning | |
Under-Replicated Blocks | Number of under-replicated blocks in the HDFS is too high. | Gives warning/critical alert if percentage of under-replicated blocks is more than lower/upper threshold. | 1-Warning | |
Corrupted Blocks | There are corrupted file blocks in HDFS. | Gives critical alert if number of corrupted blocks is more than threshold. | 1 | |
DataNodes Down | A significant number of DataNodes are down in the cluster. | Gives warning/critical alert if percentage of dead DataNodes in cluster is more than lower/upper threshold. | 10-Warning | |
Failed Jobs | MapReduce jobs are failing too frequently. | Gives warning/critical alert if percentage of failed MapReduce jobs is more than lower/upper threshold. | 10-Warning | |
Invalid TaskTrackers | There are TaskTracker nodes which are in the invalid state. | Gives critical alert if there is at least one blacklisted TaskTracker. | 1 | |
Memory Heap Usage | JobTracker is working under high memory pressure. | Gives warning/critical alert if percentage of used JobTracker memory heap is more than lower/upper threshold. | 80-Warning | |
Memory Heap Usage | NameNode is working under high memory pressure. | Gives warning/critical alert if percentage of used NameNode memory heap is more than lower/upper threshold. | 80-Warning | |
TaskTrackers Down | A significant number of TaskTrackers are down in the cluster. | Gives warning/critical alert if percentage of dead MapReduce TaskTrackers is more than lower/upper threshold. | 10-Warning | |
TaskTracker Service State | TaskTracker component is not running. | Turns TaskTracker service to warning state if the TaskTracker service is unavailable. | N/A | |
NameNode Service State | NameNode service NameNode component is not running. | Gives critical alert if a NameNode service is unavailable. | N/A | |
Secondary NameNode Service State | Secondary NameNode service NameNode component is not running. | Gives warning alert if a Secondary NameNode service is unavailable. | N/A | |
JobTracker Service State | JobTracker service JobTracker component is not running. | Gives critical alert if a JobTracker service is unavailable. | N/A | |
Oozie Server Service State | Oozie Server service component is not running. | Gives critical alert if an Oozie Server service is unavailable. | N/A | |
Hive Metastore State | Hive Metastore server Metastore component is not running. | Gives critical alert if a Hive Metastore service is unavailable. | N/A | |
HiveServer State | HiveServer service HiveServer component is not running. | Gives critical alert if a Hive Server service is unavailable. | N/A | |
WebHCat Server Service State | WebHCat Server service Server component is not running. | Gives critical alert if a WebHCat Server service is unavailable. | N/A | |
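
The percentage-based alerts in the table compare a metric against a warning threshold and a critical threshold. The following is a minimal sketch of that comparison, not the monitoring system's actual implementation. The `classify` function and `Level` enum are hypothetical names, and the critical values used in the usage lines (90 and 10) are assumed for illustration, since the table lists only the warning values.

```python
from enum import Enum


class Level(Enum):
    OK = "OK"
    WARNING = "WARNING"
    CRITICAL = "CRITICAL"


def classify(value, warning, critical):
    """Classify a metric against a warning and a critical threshold.

    If critical >= warning, higher values are worse (for example, heap usage).
    If critical < warning, lower values are worse (for example, capacity remaining).
    Comparisons are strict, matching the "more than" / "less than" wording
    in the table.
    """
    higher_is_worse = critical >= warning
    if higher_is_worse:
        if value > critical:
            return Level.CRITICAL
        if value > warning:
            return Level.WARNING
    else:
        if value < critical:
            return Level.CRITICAL
        if value < warning:
            return Level.WARNING
    return Level.OK


# Warning values (80, 30) come from the table; critical values (90, 10)
# are hypothetical and chosen only for illustration.
heap_state = classify(85, warning=80, critical=90)      # Memory Heap Usage
capacity_state = classify(25, warning=30, critical=10)  # Capacity Remaining
```

Passing the thresholds in the order they apply lets one function cover both directions: alerts that fire when a value rises (heap usage, dead nodes) and the Capacity Remaining alert, which fires when a value falls.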
Viewing
...