The following table lists performance rules whose default intervals for alert checks might require additional tuning to suit your environment. Evaluate each rule to determine whether its default interval is appropriate. If it is not, obtain a baseline for the relevant performance counters, and then adjust the interval by applying an override to the rule.

| Name | Description | Default Interval (seconds) |
| --- | --- | --- |
| Collect HDFS Blocks Read | | 900 |
| Collect HDFS Blocks Written | | 900 |
| Collect HDFS Bytes Read | | 900 |
| Collect HDFS Bytes Written | | 900 |
| Collect HDFS Capacity Non-DFS Used (GB) | | 900 |
| Collect HDFS Capacity Remaining (GB) | | 900 |
| Collect HDFS Capacity Total (GB) | | 900 |
| Collect HDFS Capacity Used (GB) | | 900 |
| Collect HDFS Corrupted Blocks | | 900 |
| Collect HDFS Dead DataNodes | | 900 |
| Collect HDFS Decommissioned DataNodes | | 900 |
| Collect HDFS Files Appended | | 900 |
| Collect HDFS Files Created | | 900 |
| Collect HDFS Files Deleted | | 900 |
| Collect HDFS Live DataNodes | | 900 |
| Collect HDFS Missing Blocks | | 900 |
| Collect HDFS Pending Deletion Blocks | | 900 |
| Collect HDFS Pending Replication Blocks | | 900 |
| Collect HDFS Total Blocks | | 900 |
| Collect HDFS Total Files | | 900 |
| Collect HDFS Under-Replicated Blocks | | 900 |
| Collect Live vs Dead DataNodes Widget Data | | 900 |
| Collect Space Utilization Widget Data | | 900 |
| Collect JVM Errors Logged | | 900 |
| Collect JVM Fatal Errors Logged | | 900 |
| Collect JVM Heap Memory Committed | | 900 |
| Collect JVM Heap Memory Used | This rule collects the amount of heap memory used by the Host Component. | 900 |
| Collect JVM Non Heap Memory Committed | This rule collects the amount of non-heap memory committed to the Host Component. | 900 |
| Collect JVM Non Heap Memory Used | This rule collects the amount of non-heap memory used by the Host Component. | 900 |
| Collect JVM Number of Garbage Collections | This rule collects the number of garbage collections performed for the Host Component process. | 900 |
| Collect JVM Threads Blocked | This rule collects the number of blocked threads for the Host Component process. | 900 |
| Collect JVM Threads New | This rule collects the number of new threads for the Host Component process. | 900 |
| Collect JVM Threads Runnable | This rule collects the number of runnable threads for the Host Component process. | 900 |
| Collect JVM Threads Terminated | This rule collects the number of terminated threads for the Host Component process. | 900 |
| Collect JVM Threads Timed Waiting | This rule collects the number of timed waiting threads for the Host Component process. | 900 |
| Collect JVM Threads Waiting | This rule collects the number of waiting threads for the Host Component process. | 900 |
| Collect JVM Time Spent in Garbage Collection (ms) | This rule collects the time spent in garbage collection by the Host Component process. | 900 |
| Collect MapReduce Dead TaskTrackers | This rule collects the number of dead TaskTrackers for the cluster. | 900 |
| Collect MapReduce Jobs Completed | This rule collects the number of completed MapReduce jobs for the cluster. | 900 |
| Collect MapReduce Jobs Failed | This rule collects the number of failed MapReduce jobs for the cluster. | 900 |
| Collect MapReduce Jobs Failed (%) | This rule collects the percentage of failed MapReduce jobs in the cluster. | 900 |
| Collect MapReduce Jobs Killed | This rule collects the number of killed MapReduce jobs for the cluster. | 900 |
| Collect MapReduce Jobs Preparing | This rule collects the number of preparing MapReduce jobs for the cluster. | 900 |
| Collect MapReduce Jobs Running | This rule collects the number of running MapReduce jobs for the cluster. | 900 |
| Collect MapReduce Jobs Submitted | This rule collects the number of submitted MapReduce jobs for the cluster. | 900 |
| Collect MapReduce Live TaskTrackers | This rule collects the number of live TaskTrackers for the cluster. | 900 |
| Collect MapReduce Map Slots Reserved | This rule collects the number of reserved map slots for the cluster. | 900 |
| Collect MapReduce Maps Completed | This rule collects the number of completed map tasks for the cluster. | 900 |
| Collect MapReduce Maps Failed | This rule collects the number of failed map tasks for the cluster. | 900 |
| Collect MapReduce Maps Killed | This rule collects the number of killed map tasks for the cluster. | 900 |
| Collect MapReduce Maps Launched | This rule collects the number of launched map tasks for the cluster. | 900 |
| Collect MapReduce Number of TaskTrackers | This rule collects the total number of TaskTrackers in the cluster. | 900 |
| Collect MapReduce Occupied Map Slots | This rule collects the number of occupied map slots for the cluster. | 900 |
| Collect MapReduce Reduced Slots Occupied | This rule collects the number of occupied reduce slots for the cluster. | 900 |
| Collect MapReduce Reduced Slots Reserved | This rule collects the number of reserved reduce slots for the cluster. | 900 |
| Collect MapReduce Reduces Completed | This rule collects the number of completed reduce tasks for the cluster. | 900 |
| Collect MapReduce Reduces Failed | This rule collects the number of failed reduce tasks for the cluster. | 900 |
| Collect MapReduce Reduces Killed | This rule collects the number of killed reduce tasks for the cluster. | 900 |
| Collect MapReduce Reduces Launched | This rule collects the number of launched reduce tasks for the cluster. | 900 |
| Collect MapReduce Running Map Tasks | This rule collects the number of running map tasks for the cluster. | 900 |
| Collect MapReduce Running Reduce tasks | This rule collects the number of running reduce tasks for the cluster. | 900 |
| Collect MapReduce TaskTrackers Blacklisted | This rule collects the number of blacklisted TaskTrackers in the cluster. | 900 |
| Collect MapReduce TaskTrackers Decommissioned | This rule collects the number of decommissioned TaskTrackers in the cluster. | 900 |
| Collect MapReduce TaskTrackers Graylisted | This rule collects the number of graylisted TaskTrackers in the cluster. | 900 |
| Collect MapReduce Waiting Map Tasks | This rule collects the number of waiting map tasks for the cluster. | 900 |
| Collect MapReduce Waiting Reduce tasks | This rule collects the number of waiting reduce tasks for the cluster. | 900 |
| Collect Network Bytes Received | This rule collects the bytes received by the Host Component. | 900 |
| Collect Network Bytes Sent | This rule collects the bytes sent by the Host Component. | 900 |
| Collect Queue Average Wait Time | This rule collects the average queue time (ms) of remote procedure calls to the Host Component. | 900 |
| Collect RPC Authorization Failures | This rule collects the number of failed remote procedure call authorization attempts to the Host Component. | 900 |
| Collect RPC Processing Average Time | This rule collects the average processing time (ms) of remote procedure calls to the Host Component. | 900 |
| Collect RPC Processing Number of Operations | This rule collects the number of processing remote procedure calls to the Host Component. | 900 |
| Collect RPC Queue Number of Operations | This rule collects the number of queued remote procedure calls to the Host Component. | 900 |
| Collect TaskTracker Map Slots | This rule collects the number of available map slots on the TaskTracker. | 900 |
| Collect TaskTracker Reduce Slots | This rule collects the number of available reduce slots on the TaskTracker. | 900 |
| Collect TaskTracker Running Map Tasks | This rule collects the number of running map tasks on the TaskTracker. | 900 |
| Collect TaskTracker Running Reduce tasks | This rule collects the number of running reduce tasks on the TaskTracker. | 900 |
| Collect TaskTracker Shuffle Exceptions Caught | This rule collects the number of caught exceptions for the shuffle running on the TaskTracker. | 900 |
| Collect TaskTracker Shuffle Failed Outputs | This rule collects the number of failed outputs for the shuffle running on the TaskTracker. | 900 |
| Collect TaskTracker Shuffle Handler Busy (%) | This rule collects the percentage of busy shuffle handlers on the TaskTracker. | 900 |
| Collect TaskTracker Shuffle Output Bytes | This rule collects the number of bytes produced by the shuffle running on the TaskTracker. | 900 |
| Collect TaskTracker Shuffle Success Outputs | This rule collects the number of successful outputs for the shuffle running on the TaskTracker. | 900 |
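
One way to turn a baseline into a candidate interval is sketched below in Python. It is only an illustration under stated assumptions: the baseline is a list of counter readings taken at a fixed, finer period than the rule's interval, and "appropriate" is taken to mean that the counter changes, on average, by no more than a tolerance you choose between two consecutive collections. The function names, the candidate intervals, and the tolerance are hypothetical and are not part of the management pack. The sketch only suggests a value; applying the override to the rule remains a separate step.

```python
from statistics import mean

DEFAULT_INTERVAL_SECS = 900  # default interval listed for every rule in the table


def mean_change(samples, sample_period_secs, interval_secs):
    """Average absolute change between readings taken interval_secs apart.

    samples: baseline readings of one performance counter, evenly spaced
    sample_period_secs: spacing of the baseline readings (hypothetical input)
    """
    step = interval_secs // sample_period_secs
    if step < 1 or step >= len(samples):
        raise ValueError("interval must cover at least one sample period "
                         "and be shorter than the baseline")
    return mean(abs(samples[i + step] - samples[i])
                for i in range(len(samples) - step))


def suggest_interval(samples, sample_period_secs, candidates, tolerance):
    """Longest candidate interval whose average change stays within tolerance.

    Falls back to the shortest candidate when none qualifies.
    """
    ordered = sorted(candidates)
    acceptable = [c for c in ordered
                  if mean_change(samples, sample_period_secs, c) <= tolerance]
    return max(acceptable) if acceptable else ordered[0]


# Hypothetical baselines: one reading per minute for an hour.
steady_counter = [10] * 60          # value never changes
growing_counter = list(range(60))   # value grows by 1 every minute

candidates = [60, 300, DEFAULT_INTERVAL_SECS]
print(suggest_interval(steady_counter, 60, candidates, tolerance=6))   # 900
print(suggest_interval(growing_counter, 60, candidates, tolerance=6))  # 300
```

In this sketch a counter that barely moves keeps the 900-second default, while one that drifts faster than the tolerance allows is steered toward a shorter interval.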