No. | JIRA | Affected Version(s) | Fix Version | Issue | Steps to work around this issue
---|---|---|---|---|---
14 | - | 2.4.x and later | - | HiveServer2 can send a very large volume of metrics, which causes performance issues in the Metrics Collector. | -
13 | AMBARI-20056 | 2.2.2 | 2.5.0 | On large clusters AMS can become inoperable due to store-file explosion with no compaction. Symptom: a very large number of store files (~10,000) in AMS HBase, and AMS shutting down regularly. | In the HBase shell, run: `alter 'METRIC_RECORD', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '1000', 'hbase.hstore.defaultengine.compactionpolicy.class' => 'org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy'}`. If this does not resolve the issue, the only way to recover the system is to reset the metric system.
12 | AMBARI-18093 | 2.2.2 | 2.4.0 | On large clusters, a TTL of more than 3 days on the high-precision tables produces too much data and too many regions in AMS HBase. | Use a smaller TTL for the high-precision data; the 5-minute aggregate data remains available for 7 days.
11 | AMBARI-17779 | 2.2.1, 2.2.2 | 2.4.0 | The HBase normalizer, which automatically splits and merges regions based on region size, was enabled for AMS in 2.2.1. Its occasionally over-aggressive region splitting can lead to an explosion of regions on large clusters, eventually causing AMS to crash on every startup. As of 2.2.2, the AMS HBase normalizer cannot be disabled through AMS configs. | To disable the normalizer on the AMS HBase tables: `su ams` (kinit if needed), then run the following in the HBase shell:<br>`alter 'METRIC_RECORD', {NORMALIZATION_ENABLED => 'false'}`<br>`alter 'METRIC_AGGREGATE', {NORMALIZATION_ENABLED => 'false'}`<br>`alter 'METRIC_RECORD_MINUTE', {NORMALIZATION_ENABLED => 'false'}`<br>`alter 'METRIC_AGGREGATE_MINUTE', {NORMALIZATION_ENABLED => 'false'}`<br>`alter 'METRIC_RECORD_HOURLY', {NORMALIZATION_ENABLED => 'false'}`<br>`alter 'METRIC_AGGREGATE_HOURLY', {NORMALIZATION_ENABLED => 'false'}`<br>`alter 'METRIC_RECORD_DAILY', {NORMALIZATION_ENABLED => 'false'}`<br>`alter 'METRIC_AGGREGATE_DAILY', {NORMALIZATION_ENABLED => 'false'}`<br>Finally, verify the configuration change took effect: open the HBase master UI in a browser (http://&lt;collector_host&gt;:61310) and search for the string "NORMALIZATION"; it should return no matches.
10 | AMBARI-15492 | 2.2.1 | 2.2.2 | The Ambari Metrics Collector shuts down and restarts randomly, with the following error in the collector log: `ERROR org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer: RECEIVED SIGNAL 15: SIGTERM` | -
9 | AMBARI-13758 | 2.1.2 and lower | 2.2.0 | When the Ambari Metrics Collector is moved from one host to another, host metrics are not seen. | -
8 | AMBARI-14257 | 2.1.2 | 2.2.0 | Storm metrics are not seen after upgrading to Ambari 2.1.2. | Apply the workaround steps on every host with a Storm component (nimbus, supervisor, or client).
7 | AMBARI-13798 | 2.1.2 | 2.2.0 | Ambari Metrics service graphs may not show data for certain metrics, and the following error may appear in the metrics collector log (Ambari 2.1.2, 2.1.2.1): "The time range query for precision table exceeds row count limit, please query aggregate table instead" | -
6 | AMBARI-13711 | 2.1.2 | - | Ambari Metrics Server won't start successfully with Kerberos in distributed mode. The HBase Master and RegionServer cannot use separate principals: ZooKeeper ACLs will not allow a znode created with one principal to be read by the other unless proper ACLs are set, and in 2.1.2 the Master creates the znode with a different principal than the RegionServer. | Change the AMS configuration to use the Master keytab and principal for the RegionServer, then restart the Collector.
5 | - | - | - | Metrics data for the last month is missing many data points that should exist. | Check /var/log/ambari-metrics-collector/ambari-metrics-collector.log for metrics-aggregation errors such as OutOfOrderScannerNextException or SpoolTooBigToDiskException; set a larger value for the hbase_regionserver_heapsize property in Advanced ams-hbase-env using the Ambari Web UI; then restart the Metrics Collector.
4 | - | 2.0.x and 2.1.0 | - | AMS HBase does not start after Kerberization in distributed mode (note: also see issue 1). | Issue 1 (AMBARI-11501): apply the fix on the Ambari Server host. Issue 2 (AMBARI-12347): set `ams.zookeeper.principal = zookeeper/_HOST@EXAMPLE.COM` (substitute the appropriate REALM) and `ams.zookeeper.keytab = /etc/security/keytabs/zk.service.keytab`. This assumes you have a ZooKeeper keytab on the Metrics Collector host; if not, create one with appropriate permissions. If a keytab already exists, make sure to `chmod 440 /etc/security/keytabs/zk.service.keytab`, and verify it with `klist -kt /etc/security/keytabs/zk.service.keytab`.
3 | - | 2.0.0, 2.1.0, 2.1.1 | 2.1.2 | Alter TTL is not supported by the version of Phoenix used with Ambari 2.0.0, 2.1.0, and 2.1.1. | Modify the property from the HBase shell: `su - ams`, start `hbase shell`, run `describe 'METRIC_RECORD'` to see the current TTL, then `alter 'METRIC_RECORD', { NAME => '0', TTL => 172800}`, and `describe 'METRIC_RECORD'` again to confirm the new TTL.
2 | - | 2.0.x | 2.1.0 | Ambari Metrics service does not work after enabling security with AMS in distributed mode in Ambari 2.0.x. | -
1 | AMBARI-10707 | 2.0.x | 2.1.0 | Ambari Metrics service does not work with NameNode HA in distributed mode in Ambari 2.0.x. | -
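The FIFO-compaction workaround for issue 13 can be scripted by generating the `alter` statement and piping it into the HBase shell. A minimal sketch, assuming the `hbase` CLI is available to the `ams` user (the `hbase shell` invocation is left commented out so the statement can be reviewed first):

```shell
#!/bin/sh
# Sketch for issue 13: switch METRIC_RECORD to FIFO compaction and raise the
# blocking-store-files limit. Values are the ones given in the workaround above.
ALTER_CMD="alter 'METRIC_RECORD', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '1000', 'hbase.hstore.defaultengine.compactionpolicy.class' => 'org.apache.hadoop.hbase.regionserver.compactions.FIFOCompactionPolicy'}"
echo "$ALTER_CMD"                    # review the statement before applying
# echo "$ALTER_CMD" | hbase shell    # uncomment to apply (run as the ams user)
```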
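The eight normalizer-disable statements for issue 11 follow one pattern, so they can be generated with a loop instead of typed by hand. A minimal sketch, again assuming `hbase` is on the PATH of the `ams` user; the apply step is commented out so the generated statements can be checked first:

```shell
#!/bin/sh
# Sketch for issue 11: disable the HBase normalizer on every AMS table.
TABLES="METRIC_RECORD METRIC_AGGREGATE METRIC_RECORD_MINUTE METRIC_AGGREGATE_MINUTE \
METRIC_RECORD_HOURLY METRIC_AGGREGATE_HOURLY METRIC_RECORD_DAILY METRIC_AGGREGATE_DAILY"
CMDS=""
for t in $TABLES; do
  CMDS="${CMDS}alter '$t', {NORMALIZATION_ENABLED => 'false'}
"
done
printf '%s' "$CMDS"                   # review the generated statements
# printf '%s' "$CMDS" | hbase shell   # uncomment to apply (run as the ams user)
```

After applying, verify via the HBase master UI as described in the table (search for "NORMALIZATION"; it should return no matches).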
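The log check for issue 5 can be sketched as a small script. The log path is the default one named in the table; `LOG` is overridable for installs that differ:

```shell
#!/bin/sh
# Sketch for issue 5: look for the aggregation errors that indicate the
# Metrics Collector is dropping data points.
LOG=${LOG:-/var/log/ambari-metrics-collector/ambari-metrics-collector.log}
if [ -r "$LOG" ]; then
  # Show the most recent matches so you can judge whether the errors are ongoing.
  grep -E 'OutOfOrderScannerNextException|SpoolTooBigToDiskException' "$LOG" | tail -n 20
else
  echo "log not readable: $LOG" >&2
fi
```

If matches appear, raise `hbase_regionserver_heapsize` in Advanced ams-hbase-env and restart the Metrics Collector, as the table describes.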
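The keytab check for issue 4 (the `chmod 440` requirement on the ZooKeeper keytab) can be sketched as a small verification function. The path is the example from the table; adjust `KEYTAB` to your environment:

```shell
#!/bin/sh
# Sketch for issue 4: verify the ZooKeeper keytab exists and has mode 440,
# as the workaround requires.
KEYTAB=${KEYTAB:-/etc/security/keytabs/zk.service.keytab}

check_keytab() {
  k=$1
  [ -f "$k" ] || { echo "missing: $k"; return 1; }
  # GNU stat first, BSD stat as a fallback.
  mode=$(stat -c '%a' "$k" 2>/dev/null || stat -f '%Lp' "$k")
  [ "$mode" = "440" ] || { echo "mode is $mode, expected 440 (chmod 440 $k)"; return 1; }
  echo "ok: $k"
}

check_keytab "$KEYTAB" || true
```

Pair this with `klist -kt "$KEYTAB"` to confirm the keytab actually contains the expected ZooKeeper principal.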
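The `TTL => 172800` value in issue 3 is expressed in seconds; it corresponds to 2 days, which a quick shell calculation confirms:

```shell
# HBase table TTLs are in seconds: 2 days * 24 h * 60 min * 60 s = 172800.
ttl=$((2 * 24 * 60 * 60))
echo "$ttl"   # prints 172800
```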
1 Comment
Venkatraman Poornalingam
In embedded mode in a Kerberized environment, we should not have hdfs-site.xml and core-site.xml in /etc/ams-hbase/conf or /etc/ambari-metrics-server/conf. If they are present, AMS considers the system Kerberos-authenticated and looks for a keytab, but no keytab is required for AMS in embedded mode. In that situation, starting the AMS collector fails as follows:
```
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2979)
Caused by: java.io.IOException: Running in secure mode, but config doesn't have a keytab
at org.apache.hadoop.security.SecurityUtil.login(SecurityUtil.java:236)
```
Removing the hdfs-site.xml and core-site.xml from the AMS config locations would resolve this issue.
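The removal step above can be sketched as a small script that moves the offending files aside (to `.bak`) rather than deleting them, so the change is reversible. The directories are the ones named in the comment; adjust them if your layout differs:

```shell
#!/bin/sh
# Sketch: move aside the client configs that make embedded-mode AMS
# look like a Kerberos-authenticated system.
move_aside() {
  d=$1
  for f in hdfs-site.xml core-site.xml; do
    if [ -f "$d/$f" ]; then
      mv "$d/$f" "$d/$f.bak"   # keep a backup instead of deleting
      echo "moved $d/$f"
    fi
  done
}

for d in /etc/ams-hbase/conf /etc/ambari-metrics-server/conf; do
  move_aside "$d"
done
```

Restart the AMS collector afterwards; if anything regresses, restoring the `.bak` files undoes the change.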