...
Name | Alert Message | Description | Threshold
---|---|---|---
Capacity Remaining | There is little or no space capacity remaining in HDFS. | Raises a warning/critical alert if the percentage of available space across all HDFS nodes falls below the upper/lower threshold. | 30 (Warning)
Under-Replicated Blocks | Number of under-replicated blocks in the HDFS is too high. | Raises a warning/critical alert if the percentage of under-replicated blocks exceeds the lower/upper threshold. | 1 (Warning)
Corrupted Blocks | There are corrupted file blocks in HDFS. | Raises a critical alert if the number of corrupted blocks exceeds the threshold. | 1
DataNodes Down | A significant number of DataNodes are down in the cluster. | Raises a warning/critical alert if the percentage of dead HDFS DataNodes in the cluster exceeds the lower/upper threshold. | 10 (Warning)
Failed Jobs | MapReduce jobs are failing too frequently. | Raises a warning/critical alert if the percentage of failed MapReduce jobs exceeds the lower/upper threshold. |
Hive Metastore State | Hive Metastore server is not running. | Raises a critical alert if the Hive Metastore service is unavailable. |
HiveServer State | HiveServer service is not running. | Raises a critical alert if the HiveServer service is unavailable. |
Invalid TaskTrackers | There are TaskTracker nodes which are in the invalid state. | Raises a warning alert if at least one TaskTracker is graylisted. Raises a critical alert if at least one TaskTracker is blacklisted. |
JobTracker Service State | JobTracker service is not running. | Raises a critical alert if the JobTracker service is unavailable. |
Memory Heap Usage | JobTracker is working under high memory pressure. | Raises a warning/critical alert if the percentage of used JobTracker heap memory exceeds the lower/upper threshold. |
Memory Heap Usage | NameNode is working under high memory pressure. | Raises a warning/critical alert if the percentage of used NameNode heap memory exceeds the lower/upper threshold. | 80 (Warning)
NameNode Service State | NameNode service is not running. | Raises a critical alert if the NameNode service is unavailable. |
Oozie Server Service State | Oozie Server service is not running. | Raises a critical alert if the Oozie Server service is unavailable. |
Secondary NameNode Service State | Secondary NameNode service is not running. | Raises a warning alert if the Secondary NameNode service is unavailable. |
TaskTracker Service State | | Puts the TaskTracker service into the warning state if the TaskTracker service is unavailable. |
TaskTrackers Down | A significant number of TaskTrackers are down in the cluster. | Raises a warning/critical alert if the percentage of dead MapReduce TaskTrackers exceeds the lower/upper threshold. |
WebHCat Server Service State | WebHCat Server service is not running. | Raises a critical alert if the WebHCat (Templeton) Server service is unavailable. |
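
The lower/upper threshold wording in the table can be read as a two-level comparison. The following is only a minimal sketch of that reading, not the product's implementation: the names `Severity`, `classify_above`, and `classify_below` are hypothetical, and the critical values used in the examples are assumptions, since the table lists warning levels only.

```python
from enum import Enum


class Severity(Enum):
    OK = "ok"
    WARNING = "warning"
    CRITICAL = "critical"


def classify_above(value, warning, critical):
    """Higher is worse (e.g. Under-Replicated Blocks, Memory Heap Usage).

    `warning` is the lower threshold and `critical` the upper one.
    """
    if value > critical:
        return Severity.CRITICAL
    if value > warning:
        return Severity.WARNING
    return Severity.OK


def classify_below(value, warning, critical):
    """Lower is worse (e.g. Capacity Remaining).

    `warning` is the upper threshold and `critical` the lower one.
    """
    if value < critical:
        return Severity.CRITICAL
    if value < warning:
        return Severity.WARNING
    return Severity.OK


# Warning levels (30 and 80) come from the table; the critical
# levels (10 and 90) are assumed purely for illustration.
capacity = classify_below(25, warning=30, critical=10)
heap = classify_above(95, warning=80, critical=90)
print(capacity.value, heap.value)
```

Keeping the two directions in separate functions makes it explicit that Capacity Remaining alerts when a value drops, while the other percentage-based alerts fire when a value rises.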
Viewing
...