Summary
Ambari 2.0 added support to the Blueprint functionality for deploying certain components in a High-Availability (HA) configuration:
- HDFS NameNode HA
- YARN ResourceManager HA
- HBase RegionServers HA
Support may be added for other Hadoop technologies in later releases.
As of Ambari 2.1, Blueprints are able to deploy the following components with HA:
- Hive Components (AMBARI-10489)
- Storm Nimbus (AMBARI-11087)
- Oozie Server (AMBARI-6683)
This functionality currently requires providing fine-grained configurations. This document provides examples.
FAQ
Compatibility with Ambari UI
While this feature does not require the Ambari UI to function, the Blueprints HA feature is fully compatible with the Ambari UI. An HA cluster created via Blueprints can be monitored and configured via the Ambari UI, just like any other Blueprints-deployed cluster.
...
Expert-Mode Configuration
In Ambari 2.0, the Blueprint support for HA requires the Blueprint to contain exact fine-grained configurations. See the examples below for more detail.
In future releases, we hope to provide a higher-level mode of operations, so that HA can be enabled in a more coarse-grained way.
Supported Stack Versions
This feature is enabled for the HDP 2.1 stack, as well as future versions of the stack. Previous versions of HDP have not been verified for this feature, and may not function as desired. In addition, earlier HDP versions may not include the HA support for the required technology.
Getting Started with Blueprints HA
Start with the tested & working examples below and customize from there.
Examples
Blueprint Example: HDFS NameNode HA Cluster
HDFS NameNode HA allows a cluster to be configured such that a NameNode is not a single point of failure.
For more details on HDFS NameNode HA see the Apache Hadoop documentation.
In an Ambari-deployed HDFS NameNode HA cluster:
- 2 NameNodes are deployed: an “active” NameNode and a “standby” NameNode.
- If the active NameNode stops functioning properly, the standby node’s ZooKeeper client detects this, and the standby node becomes the new active NameNode.
- HDFS relies on ZooKeeper to manage the details of failover in these cases.
How
The Blueprints HA feature will automatically invoke all required commands and setup steps for an HDFS NameNode HA cluster, provided that the correct configuration is provided in the Blueprint. The shared edit logs of each NameNode are managed by the Quorum Journal Manager, rather than NFS shared storage. The use of NFS shared storage in an HDFS HA setup is not supported by Ambari Blueprints, and is generally not encouraged.
By setting a series of properties in the “hdfs-site” configuration file, a user can configure HDFS NameNode HA with two NameNodes in a cluster. The NameNode pair is referenced via a logical name, the “nameservice”.
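As an illustrative sketch (the nameservice name “mycluster”, the host group names, and the ports are assumptions, not taken from a tested Blueprint), the fine-grained “hdfs-site” entries for a two-NameNode nameservice typically include the following standard HDFS HA properties:

```json
{
  "hdfs-site": {
    "properties": {
      "dfs.nameservices": "mycluster",
      "dfs.ha.namenodes.mycluster": "nn1,nn2",
      "dfs.namenode.rpc-address.mycluster.nn1": "%HOSTGROUP::master_1%:8020",
      "dfs.namenode.rpc-address.mycluster.nn2": "%HOSTGROUP::master_3%:8020",
      "dfs.namenode.http-address.mycluster.nn1": "%HOSTGROUP::master_1%:50070",
      "dfs.namenode.http-address.mycluster.nn2": "%HOSTGROUP::master_3%:50070",
      "dfs.namenode.shared.edits.dir": "qjournal://%HOSTGROUP::master_1%:8485;%HOSTGROUP::master_2%:8485;%HOSTGROUP::master_3%:8485/mycluster",
      "dfs.client.failover.proxy.provider.mycluster": "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
      "dfs.ha.automatic-failover.enabled": "true",
      "dfs.ha.fencing.methods": "shell(/bin/true)"
    }
  }
}
```

The `%HOSTGROUP::name%` placeholders are resolved to concrete hostnames by the Blueprint processor at deploy time, as in the YARN example later in this document.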
The following HDFS stack components must be included in the host groups of a Blueprint that deploys an HA HDFS NameNode:
- NAMENODE
- ZKFC
- ZOOKEEPER_SERVER
- JOURNALNODE
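For instance (the group name and cardinality here are illustrative), a master host group carrying these components could be declared in the Blueprint’s host_groups section as:

```json
{
  "name": "master_1",
  "cardinality": "1",
  "components": [
    { "name": "NAMENODE" },
    { "name": "ZKFC" },
    { "name": "ZOOKEEPER_SERVER" },
    { "name": "JOURNALNODE" }
  ]
}
```

In a real cluster these components are typically spread across multiple host groups (for example, a JOURNALNODE on each of three master hosts), as shown in the YARN example later in this document.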
Configuring Active and Standby NameNodes
The HDFS “NAMENODE” component must be assigned to two servers, either via two separate host groups, or to a host group that maps to two physical servers in the Cluster Creation Template for this cluster.
By default, the Blueprint processor will assign the “active” NameNode to one host, and the “standby” NameNode to another. The user of an HA Blueprint does not need to configure the initial status of each NameNode, since this can be assigned automatically.
If desired, the user can configure the initial state of each NameNode by adding the following configuration properties in the “hadoop-env” namespace:
dfs_ha_initial_namenode_active - This property should contain the hostname of the “active” NameNode in this cluster.
dfs_ha_initial_namenode_standby - This property should contain the hostname of the “standby” NameNode in this cluster.
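A minimal sketch of this configuration, assuming host groups named “master_1” and “master_3” and assuming the `%HOSTGROUP%` placeholder syntax resolves here as it does for other properties:

```json
{
  "hadoop-env": {
    "properties": {
      "dfs_ha_initial_namenode_active": "%HOSTGROUP::master_1%",
      "dfs_ha_initial_namenode_standby": "%HOSTGROUP::master_3%"
    }
  }
}
```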
Note: These properties should only be used when the initial state of the active or standby NameNodes needs to be configured to a specific node. This setting is only guaranteed to be accurate in the initial state of the cluster. Over time, the active/standby state of each NameNode may change as failover events occur in the cluster. The active or standby status of a NameNode is not recorded or expressed when an HDFS HA cluster is exported to a Blueprint using the Blueprint REST API endpoint. Since clusters change over time, this state is only accurate at the initial startup of the cluster. Generally, it is assumed that most users will not need to choose the active or standby status of each NameNode, so the default behavior in Blueprints HA is to assign the status of each node automatically.
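For reference, an existing cluster can be exported in Blueprint format via the Blueprint REST API endpoint mentioned above (the hostname, credentials, and cluster name below are placeholders):

```shell
# Export the running cluster "myCluster" as a Blueprint document
curl -u admin:admin -H "X-Requested-By: ambari" \
     'http://ambari-server.example.com:8080/api/v1/clusters/myCluster?format=blueprint'
```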
...
Blueprint Example: YARN ResourceManager HA Cluster
Summary
YARN ResourceManager High Availability (HA) adds support for deploying two YARN ResourceManagers in a given cluster, removing the single point of failure that exists when a single ResourceManager is used.
...
The following is an example Blueprint for a 3-node YARN ResourceManager HA cluster:
{
"Blueprints": {
"stack_name": "HDP",
"stack_version": "2.2"
},
"host_groups": [
{
"name": "gateway",
"cardinality" : "1",
"components": [
{ "name": "HDFS_CLIENT" },
{ "name": "MAPREDUCE2_CLIENT" },
{ "name": "METRICS_COLLECTOR" },
{ "name": "METRICS_MONITOR" },
{ "name": "TEZ_CLIENT" },
{ "name": "YARN_CLIENT" },
{ "name": "ZOOKEEPER_CLIENT" }
]
},
{
"name": "master_1",
"cardinality" : "1",
"components": [
{ "name": "HISTORYSERVER" },
{ "name": "JOURNALNODE" },
{ "name": "METRICS_MONITOR" },
{ "name": "NAMENODE" },
{ "name": "ZOOKEEPER_SERVER" }
]
},
{
"name": "master_2",
"cardinality" : "1",
"components": [
{ "name": "APP_TIMELINE_SERVER" },
{ "name": "JOURNALNODE" },
{ "name": "METRICS_MONITOR" },
{ "name": "RESOURCEMANAGER" },
{ "name": "ZOOKEEPER_SERVER" }
]
},
{
"name": "master_3",
"cardinality" : "1",
"components": [
{ "name": "JOURNALNODE" },
{ "name": "METRICS_MONITOR" },
{ "name": "RESOURCEMANAGER" },
{ "name": "SECONDARY_NAMENODE" },
{ "name": "ZOOKEEPER_SERVER" }
]
},
{
"name": "slave_1",
"components": [
{ "name": "DATANODE" },
{ "name": "METRICS_MONITOR" },
{ "name": "NODEMANAGER" }
]
}
],
"configurations": [
{
"core-site": {
"properties" : {
"fs.defaultFS" : "hdfs://%HOSTGROUP::master_1%:8020"
}}
},{
"yarn-site" : {
"properties" : {
"hadoop.registry.rm.enabled" : "false",
"hadoop.registry.zk.quorum" : "%HOSTGROUP::master_3%:2181,%HOSTGROUP::master_2%:2181,%HOSTGROUP::master_1%:2181",
"yarn.log.server.url" : "http://%HOSTGROUP::master_2%:19888/jobhistory/logs",
"yarn.resourcemanager.address" : "%HOSTGROUP::master_2%:8050",
"yarn.resourcemanager.admin.address" : "%HOSTGROUP::master_2%:8141",
"yarn.resourcemanager.cluster-id" : "yarn-cluster",
"yarn.resourcemanager.ha.automatic-failover.zk-base-path" : "/yarn-leader-election",
"yarn.resourcemanager.ha.enabled" : "true",
"yarn.resourcemanager.ha.rm-ids" : "rm1,rm2",
"yarn.resourcemanager.hostname" : "%HOSTGROUP::master_2%",
"yarn.resourcemanager.recovery.enabled" : "true",
"yarn.resourcemanager.resource-tracker.address" : "%HOSTGROUP::master_2%:8025",
"yarn.resourcemanager.scheduler.address" : "%HOSTGROUP::master_2%:8030",
"yarn.resourcemanager.store.class" : "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore",
"yarn.resourcemanager.webapp.address" : "%HOSTGROUP::master_2%:8088",
"yarn.resourcemanager.webapp.https.address" : "%HOSTGROUP::master_2%:8090",
"yarn.timeline-service.address" : "%HOSTGROUP::master_2%:10200",
"yarn.timeline-service.webapp.address" : "%HOSTGROUP::master_2%:8188",
"yarn.timeline-service.webapp.https.address" : "%HOSTGROUP::master_2%:8190"
}
}
}
]
}
Register Blueprint with Ambari Server
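Assuming the Blueprint JSON above is saved as yarn_ha_blueprint.json and the Ambari Server is reachable on port 8080 (the hostname, credentials, and Blueprint name here are placeholders), the Blueprint can be registered via the REST API:

```shell
# Register the Blueprint under the name "yarn-ha" (the name is arbitrary)
curl -u admin:admin \
     -H "X-Requested-By: ambari" \
     -X POST \
     -d @yarn_ha_blueprint.json \
     http://ambari-server.example.com:8080/api/v1/blueprints/yarn-ha
```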
...
Blueprint Example: HBase RegionServer HA Cluster
Summary
HBase provides a High Availability feature for reads across HBase RegionServers.
...