Suggested Memory settings
Cluster Size (hosts) | Recommended Mode | Collector Heapsize in MB (ams-env : metrics_collector_heapsize) | HBase Master Heapsize in MB (ams-hbase-env : hbase_master_heapsize) | HBase RegionServer Heapsize in MB (ams-hbase-env : hbase_regionserver_heapsize) | HBase Master xmn size in MB (ams-hbase-env : hbase_master_xmn_size) | HBase RegionServer xmn size in MB (ams-hbase-env : regionserver_xmn_size) |
---|---|---|---|---|---|---|
1 - 10 | Embedded | 512 | 1408 | 512 | 192 | - |
11 - 20 | Embedded | 1024 | 1920 | 512 | 256 | - |
21 - 100 | Embedded | 1664 | 5120 | 512 | 768 | - |
100 - 300 | Embedded | 4352 | 13056 | 512 | 2048 | - |
300 - 500 | Distributed | 4352 | 512 | 13056 | 102 | 2048 |
500 - 800 | Distributed | 7040 | 512 | 21120 | 102 | 3072 |
800 - 1000 | Distributed | 11008 | 512 | 32768 | 102 | 5120 |
1000+ | Distributed | 13696 | 512 | 32768 | 102 | 5120 |
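The table above can be read as a simple lookup keyed on host count. The following is a minimal sketch of that lookup, not part of AMS itself: the names `AmsMemory` and `suggested_memory` are hypothetical, and because adjacent rows share their boundary values (100, 300, 500, 800, 1000), the sketch assumes a boundary host count belongs to the smaller row and that "1000+" means more than 1000 hosts.

```python
from typing import NamedTuple, Optional


class AmsMemory(NamedTuple):
    """One row of the suggested memory settings table (hypothetical type)."""
    mode: str
    metrics_collector_heapsize: int
    hbase_master_heapsize: int
    hbase_regionserver_heapsize: int
    hbase_master_xmn_size: int
    regionserver_xmn_size: Optional[int]  # None where the table shows "-"


# (inclusive upper bound on host count, settings), copied from the table.
# Assumption: a boundary value such as 100 falls into the smaller row.
_ROWS = [
    (10, AmsMemory("Embedded", 512, 1408, 512, 192, None)),
    (20, AmsMemory("Embedded", 1024, 1920, 512, 256, None)),
    (100, AmsMemory("Embedded", 1664, 5120, 512, 768, None)),
    (300, AmsMemory("Embedded", 4352, 13056, 512, 2048, None)),
    (500, AmsMemory("Distributed", 4352, 512, 13056, 102, 2048)),
    (800, AmsMemory("Distributed", 7040, 512, 21120, 102, 3072)),
    (1000, AmsMemory("Distributed", 11008, 512, 32768, 102, 5120)),
]
# The "1000+" row.
_LARGEST = AmsMemory("Distributed", 13696, 512, 32768, 102, 5120)


def suggested_memory(host_count: int) -> AmsMemory:
    """Return the table row that covers the given number of hosts."""
    if host_count < 1:
        raise ValueError("host_count must be at least 1")
    for upper_bound, settings in _ROWS:
        if host_count <= upper_bound:
            return settings
    return _LARGEST
```

For example, `suggested_memory(400)` returns the first Distributed row, where the large heap moves from the HBase Master to the RegionServer.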
Identifying and tackling scale problems in AMS through configs
How do we find out if AMS is experiencing scale problems?
One or more of the following consequences can be seen on the cluster.
- Metrics Collector shuts down intermittently. Since Auto Restart is enabled for the Metrics Collector by default, this will show up as an alert stating 'Metrics collector has been auto restarted # times the last 1 hour'.
- Partial metrics data is seen.
  - All non-aggregated host metrics are seen (HDFS Namenode metrics / Host summary page on Ambari / System - Servers Grafana dashboard).
  - Aggregated data is not seen (AMS Summary page / System - Home Grafana dashboard / HBase - Home Grafana dashboard).
Systematically Troubleshooting the scale issue
- Get the current state of the system
What to get? | How to get? | Is there a Red flag? | |
---|---|---|---|
How long does it take for the 2 min aggregator to finish? | grep "TimelineMetricClusterAggregatorSecond" /var/log/ambari-metrics-collector/ambari-metrics-collector.log | | |
How many metrics are being collected? | | >15000 could be a problem. Find the component contributing the most to the number of metrics. | |
What is the number of regions and store files in AMS HBase? | This can be found in the AMS HBase Master UI: http://<METRICS_COLLECTOR_HOST>:61310 | | |
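The first two checks in the table can be scripted. The sketch below is an illustration only: `aggregator_lines` reproduces the `grep` filter in Python, and `too_many_metrics` applies the 15000 threshold from the table. Both function names are hypothetical, and the sketch makes no assumption about the layout of the log lines beyond the class name appearing in them.

```python
from typing import Iterable, List

# Class name searched for in the collector log (from the table above).
AGGREGATOR_TAG = "TimelineMetricClusterAggregatorSecond"
# Metric count above which the table flags a possible problem.
METRIC_COUNT_RED_FLAG = 15000


def aggregator_lines(log_lines: Iterable[str]) -> List[str]:
    """Keep only the log lines that mention the 2 min cluster aggregator."""
    return [line.rstrip("\n") for line in log_lines if AGGREGATOR_TAG in line]


def too_many_metrics(metric_count: int) -> bool:
    """Return True when the metric count exceeds the red-flag threshold."""
    return metric_count > METRIC_COUNT_RED_FLAG
```

To run the filter against a real collector, open `/var/log/ambari-metrics-collector/ambari-metrics-collector.log` and pass the file object to `aggregator_lines`; the timestamps on the returned lines then show how long each aggregation cycle took.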
Advanced Configurations
Configuration | Property | Description | Recommended values |
---|---|---|---|
ams-site | phoenix.query.maxGlobalMemoryPercentage | Percentage of total heap memory used by Phoenix threads in the Metrics Collector API/Aggregator daemon. | 20 - 30, based on available memory. Default = 25. |
ams-site | phoenix.spool.directory | Directory for Phoenix spill files (client side). | Set this to a different disk from the hbase.rootdir directory if possible. |
ams-hbase-site | phoenix.spool.directory | Directory for Phoenix spill files (server side). | Set this to a different disk from the hbase.rootdir directory if possible. |
ams-hbase-site | phoenix.query.spoolThresholdBytes | Threshold size in bytes after which results from queries executed in parallel are spooled to disk. | Set this to a higher value based on available memory. Default is 12 MB. |
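Because `phoenix.query.spoolThresholdBytes` is expressed in bytes while the guidance is given in megabytes, a small helper can make the conversion and sanity-check the values before they are entered in Ambari. This is a hedged sketch: the function names are hypothetical, and it assumes "12 MB" means 12 × 1024 × 1024 bytes.

```python
from typing import List

# Default spool threshold stated in the table, assuming 1 MB = 1024 * 1024 bytes.
DEFAULT_SPOOL_THRESHOLD_BYTES = 12 * 1024 * 1024


def mb_to_bytes(megabytes: int) -> int:
    """Convert megabytes to bytes (assumes 1 MB = 1024 * 1024 bytes)."""
    return megabytes * 1024 * 1024


def check_phoenix_settings(max_global_memory_pct: int,
                           spool_threshold_bytes: int) -> List[str]:
    """Return warnings for values outside the guidance in the table."""
    warnings = []
    # The table recommends 20 - 30 for phoenix.query.maxGlobalMemoryPercentage.
    if not 20 <= max_global_memory_pct <= 30:
        warnings.append(
            "phoenix.query.maxGlobalMemoryPercentage is outside 20 - 30")
    # The table suggests raising the spool threshold above its default.
    if spool_threshold_bytes < DEFAULT_SPOOL_THRESHOLD_BYTES:
        warnings.append(
            "phoenix.query.spoolThresholdBytes is below the 12 MB default")
    return warnings
```

For instance, `check_phoenix_settings(25, mb_to_bytes(64))` returns an empty list, while a percentage of 50 produces a warning.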