...
- Metrics Collector shuts down intermittently. Since Auto Restart is enabled for Metrics Collector by default, this will show up as an alert stating 'Metrics collector has been auto restarted # times the last 1 hour'.
- Partial data is seen.
  - All non-aggregated host metrics are seen (HDFS Namenode metrics / Host summary page on Ambari / System - Servers Grafana dashboard).
  - Aggregated data is not seen (AMS Summary page / System - Home Grafana dashboard / HBase - Home Grafana dashboard).
- Aggregations take too long, if they complete at all.
- Time
Systematically Troubleshooting the scale issue
- Get the current state of the system
Question to ask | How do we find the answer? | Fix / Workaround for this issue |
---|---|---|
How many metrics are being collected? | | |
What is the number of regions and store files in AMS HBase? | Check the AMS HBase Master UI at http://<METRICS_COLLECTOR_HOST>:61310 | |
Is the memory recommendation valid? | | |
How long does it take to aggregate? | | |
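
For the question of how many metrics are being collected, the following is a minimal sketch of one possible way to get a count, not a confirmed procedure from this page. It assumes the Metrics Collector serves metric metadata as JSON at `/ws/v1/timeline/metrics/metadata` on port 6188, and that the response maps each application ID to a list of metric entries. The path, the port, the response shape, and the helper names `fetch_metadata` and `count_metrics` are all assumptions to verify against your AMS version.

```python
import json
from urllib.request import urlopen

# Assumed endpoint path; it does not appear on this page, so verify it first.
METADATA_PATH = "/ws/v1/timeline/metrics/metadata"


def count_metrics(metadata):
    """Return (total, per_app) metric counts.

    `metadata` is assumed to look like:
    {"<appId>": [{"metricname": "..."}, ...], ...}
    """
    per_app = {app: len(metrics) for app, metrics in metadata.items()}
    return sum(per_app.values()), per_app


def fetch_metadata(host, port=6188, timeout=10):
    """Fetch the metadata JSON from the collector.

    `host` is your Metrics Collector host; the default port is an assumption.
    """
    url = f"http://{host}:{port}{METADATA_PATH}"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

A call such as `count_metrics(fetch_metadata("<METRICS_COLLECTOR_HOST>"))` would return the total and a per-application breakdown. Keeping the counting separate from the HTTP call lets the same function run on a saved copy of the response.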
Advanced Configurations
...