Info

As of Ambari 2.0, Ambari Blueprints supports deploying HA HDFS, YARN & HBase RegionServers.


Summary

Ambari 2.0 added Blueprint support for deploying certain components in a High-Availability (HA) configuration.

 

Before this functionality existed, HA had to be configured manually with the Ambari Web HA Wizards after the cluster was deployed.

 

As of Ambari 2.0, Blueprints are able to deploy the following components with HA:

  • HDFS NameNode HA
  • YARN ResourceManager HA
  • HBase RegionServers HA

Support may be added for other Hadoop technologies in later releases.

 

As of Ambari 2.1, Blueprints are able to deploy the following components with HA:


This functionality currently requires providing fine-grained configurations. This document provides examples.

FAQ

Compatibility with Ambari UI 

This feature does not require the Ambari UI to function, but it is completely compatible with it. An HA cluster created via Blueprints can be monitored and configured via the Ambari UI, just like any other cluster created with Blueprints.

...

  

Supported Stack Versions

This feature is supported as of HDP 2.1 and newer releases. Previous versions of HDP have not been tested with this feature.  

Examples

Expert-Mode Configuration

In Ambari 2.0, the Blueprint support for HA requires the Blueprint to contain exact fine-grained configurations. See the examples below for more detail.

In future releases, we hope to provide a higher-level mode of operation, so that HA can be enabled in a more coarse-grained way.

Supported Stack Versions

This feature is enabled for the HDP 2.1 stack, as well as future versions of the stack.  Previous versions of HDP have not been verified for this feature, and may not function as desired.  In addition, earlier HDP versions may not include the HA support for the required technology.  

Getting Started with Blueprints HA

Start with the tested & working examples below and customize from there.

Blueprint Example: HDFS NameNode HA Cluster

...

HDFS NameNode HA allows a cluster to be configured such that a NameNode is not a single point of failure.

For more details on HDFS NameNode HA see the Apache Hadoop documentation.

...

  • 2 NameNodes are deployed: an “active” and a “passive” NameNode.
  • If the active NameNode stops functioning properly, the passive node’s ZooKeeper client detects the failure, and the passive node becomes the new active node.
  • HDFS relies on ZooKeeper to manage the details of failover in these cases.

By setting a series of properties in the “hdfs-site” configuration file, a user can configure HDFS NameNode HA to use at most two NameNodes in a cluster.  These NameNodes are typically referenced via a logical name, the “nameservice”.
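As a rough illustration of the kind of properties involved, the following Python sketch builds "hdfs-site" and "core-site" blocks for a two-NameNode setup backed by the Quorum Journal Manager. The property names are the standard Apache Hadoop HA settings. The nameservice name "mycluster", the host group names, the helper function name, and the port numbers (common Hadoop 2 defaults) are assumptions made for the example. A working Blueprint will likely need further settings (for example, fencing configuration), so treat this as a starting point rather than a complete configuration.

```python
import json


def namenode_ha_configs(nameservice, nn_groups, journal_groups, zk_groups):
    """Build hdfs-site and core-site blocks for a two-NameNode HA setup.

    Hosts are written as %HOSTGROUP::<group>% tokens, the same placeholder
    syntax used by the Blueprint examples in this document.
    """
    if len(nn_groups) != 2:
        raise ValueError("NameNode HA expects exactly two NameNode host groups")

    def host(group):
        return "%HOSTGROUP::{}%".format(group)

    nn_ids = ["nn{}".format(i + 1) for i in range(len(nn_groups))]
    hdfs_site = {
        "dfs.nameservices": nameservice,
        "dfs.ha.namenodes." + nameservice: ",".join(nn_ids),
        "dfs.ha.automatic-failover.enabled": "true",
        "dfs.client.failover.proxy.provider." + nameservice:
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
        # Quorum Journal Manager URI; NFS shared storage is not used.
        "dfs.namenode.shared.edits.dir": "qjournal://{}/{}".format(
            ";".join(host(g) + ":8485" for g in journal_groups), nameservice),
    }
    # One RPC and one HTTP address per NameNode (assumed default ports).
    for nn_id, group in zip(nn_ids, nn_groups):
        suffix = "{}.{}".format(nameservice, nn_id)
        hdfs_site["dfs.namenode.rpc-address." + suffix] = host(group) + ":8020"
        hdfs_site["dfs.namenode.http-address." + suffix] = host(group) + ":50070"

    core_site = {
        # Clients address the logical nameservice, not a single NameNode host.
        "fs.defaultFS": "hdfs://" + nameservice,
        "ha.zookeeper.quorum": ",".join(host(g) + ":2181" for g in zk_groups),
    }
    return [
        {"hdfs-site": {"properties": hdfs_site}},
        {"core-site": {"properties": core_site}},
    ]


# Hypothetical host groups: three masters, NameNodes on the first two.
masters = ["master_1", "master_2", "master_3"]
configs = namenode_ha_configs("mycluster", masters[:2], masters, masters)
print(json.dumps(configs, indent=2, sort_keys=True))
```

The printed list has the same shape as the "configurations" section of a Blueprint, so it can be compared against a hand-written Blueprint while customizing it.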

 


How

The Blueprints HA feature will automatically invoke all required commands and setup steps for an HDFS NameNode HA cluster, provided that the correct configuration is provided in the Blueprint.  The shared edit logs of each NameNode are managed by the Quorum Journal Manager, rather than NFS shared storage.  The use of NFS shared storage in an HDFS HA setup is not supported by Ambari Blueprints, and is generally not encouraged.  

...

Blueprint Example: YARN ResourceManager HA Cluster

Summary

YARN ResourceManager High Availability (HA) adds support for deploying two YARN ResourceManagers in a given YARN cluster.  This support removes the single point of failure that occurs when a single ResourceManager is used.
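Because the Blueprint must carry the exact HA configuration, a quick sanity check before submitting it can catch simple mismatches. The Python sketch below is not part of Ambari. The function name and the specific checks are assumptions for illustration, and it assumes each ResourceManager host group maps to a single host. It only verifies that two host groups contain RESOURCEMANAGER, that "yarn.resourcemanager.ha.enabled" is "true", and that the number of ids in "yarn.resourcemanager.ha.rm-ids" matches the number of ResourceManager host groups.

```python
def check_rm_ha(blueprint):
    """Return a list of problems found in a Blueprint's YARN RM HA setup."""
    problems = []

    # Host groups that carry a RESOURCEMANAGER component.
    rm_groups = [
        group["name"]
        for group in blueprint.get("host_groups", [])
        if any(c.get("name") == "RESOURCEMANAGER"
               for c in group.get("components", []))
    ]
    if len(rm_groups) != 2:
        problems.append(
            "expected 2 host groups with RESOURCEMANAGER, found {}".format(
                len(rm_groups)))

    # Merge every yarn-site block found in the "configurations" list.
    yarn_site = {}
    for config in blueprint.get("configurations", []):
        yarn_site.update(config.get("yarn-site", {}).get("properties", {}))

    if yarn_site.get("yarn.resourcemanager.ha.enabled") != "true":
        problems.append("yarn.resourcemanager.ha.enabled is not 'true'")

    rm_ids = [rm_id for rm_id in
              yarn_site.get("yarn.resourcemanager.ha.rm-ids", "").split(",")
              if rm_id]
    if len(rm_ids) != len(rm_groups):
        problems.append(
            "{} rm-ids configured for {} ResourceManager host groups".format(
                len(rm_ids), len(rm_groups)))

    return problems


# A deliberately small, hypothetical Blueprint used only to show the call.
example = {
    "host_groups": [
        {"name": "master_2", "components": [{"name": "RESOURCEMANAGER"}]},
        {"name": "master_3", "components": [{"name": "RESOURCEMANAGER"}]},
    ],
    "configurations": [
        {"yarn-site": {"properties": {
            "yarn.resourcemanager.ha.enabled": "true",
            "yarn.resourcemanager.ha.rm-ids": "rm1,rm2",
        }}}
    ],
}
print(check_rm_ha(example))  # prints []
```

An empty list means none of these particular checks failed. It does not prove that the Blueprint is complete or that the deployment will succeed.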

...

Code Block
{
  "Blueprints": {
    "stack_name": "HDP",
    "stack_version": "2.2"
  },
  "host_groups": [
    {
      "name": "gateway",
      "cardinality" : "1",
      "components": [
        { "name": "HDFS_CLIENT" },
        { "name": "MAPREDUCE2_CLIENT" },
        { "name": "METRICS_COLLECTOR" },
        { "name": "METRICS_MONITOR" },
        { "name": "TEZ_CLIENT" },
        { "name": "YARN_CLIENT" },
        { "name": "ZOOKEEPER_CLIENT" }
      ]
    },
    {
      "name": "master_1",
      "cardinality" : "1",
      "components": [
        { "name": "HISTORYSERVER" },
        { "name": "JOURNALNODE" },
        { "name": "METRICS_MONITOR" },
        { "name": "NAMENODE" },
        { "name": "ZOOKEEPER_SERVER" }
      ]
    },
    {
      "name": "master_2",
      "cardinality" : "1",
      "components": [
        { "name": "APP_TIMELINE_SERVER" },
        { "name": "JOURNALNODE" },
        { "name": "METRICS_MONITOR" },
        { "name": "RESOURCEMANAGER" },
        { "name": "ZOOKEEPER_SERVER" }
      ]
    },
    {
      "name": "master_3",
      "cardinality" : "1",
      "components": [
        { "name": "JOURNALNODE" },
        { "name": "METRICS_MONITOR" },
        { "name": "RESOURCEMANAGER" },
        { "name": "SECONDARY_NAMENODE" },
        { "name": "ZOOKEEPER_SERVER" }
      ]
    },
    {
      "name": "slave_1",
      "components": [
        { "name": "DATANODE" },
        { "name": "METRICS_MONITOR" },
        { "name": "NODEMANAGER" }
      ]
    }
  ],
  "configurations": [
    {
      "core-site": {
        "properties" : {
          "fs.defaultFS" : "hdfs://%HOSTGROUP::master_1%:8020"
      }}
    },{
      "yarn-site" : {
        "properties" : {
          "hadoop.registry.rm.enabled" : "false",
          "hadoop.registry.zk.quorum" : "%HOSTGROUP::master_3%:2181,%HOSTGROUP::master_2%:2181,%HOSTGROUP::master_1%:2181",
          "yarn.log.server.url" : "http://%HOSTGROUP::master_2%:19888/jobhistory/logs",
          "yarn.resourcemanager.address" : "%HOSTGROUP::master_2%:8050",
          "yarn.resourcemanager.admin.address" : "%HOSTGROUP::master_2%:8141",
          "yarn.resourcemanager.cluster-id" : "yarn-cluster",
          "yarn.resourcemanager.ha.automatic-failover.zk-base-path" : "/yarn-leader-election",
          "yarn.resourcemanager.ha.enabled" : "true",
          "yarn.resourcemanager.ha.rm-ids" : "rm1,rm2",
          "yarn.resourcemanager.hostname" : "%HOSTGROUP::master_2%",
          "yarn.resourcemanager.recovery.enabled" : "true",
          "yarn.resourcemanager.resource-tracker.address" : "%HOSTGROUP::master_2%:8025",
          "yarn.resourcemanager.scheduler.address" : "%HOSTGROUP::master_2%:8030",
          "yarn.resourcemanager.store.class" : "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore",
          "yarn.resourcemanager.webapp.address" : "%HOSTGROUP::master_2%:8088",
          "yarn.resourcemanager.webapp.https.address" : "%HOSTGROUP::master_2%:8090",
          "yarn.timeline-service.address" : "%HOSTGROUP::master_2%:10200",
          "yarn.timeline-service.webapp.address" : "%HOSTGROUP::master_2%:8188",
          "yarn.timeline-service.webapp.https.address" : "%HOSTGROUP::master_2%:8190"
        }
      }
    }
  ]
}
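The Blueprint above writes host names as %HOSTGROUP::name% tokens, which stand for the host that ends up in the named host group. Ambari performs this substitution itself when the cluster is created. The Python sketch below only illustrates the effect for host groups that contain a single host; the function name and the example host names are hypothetical.

```python
import re

# Matches tokens such as %HOSTGROUP::master_2% and captures the group name.
TOKEN = re.compile(r"%HOSTGROUP::([^%]+)%")


def expand_hostgroups(value, hosts_by_group):
    """Replace each %HOSTGROUP::name% token with that group's host name."""
    def lookup(match):
        group = match.group(1)
        if group not in hosts_by_group:
            raise KeyError("no host mapped to host group " + group)
        return hosts_by_group[group]

    return TOKEN.sub(lookup, value)


# Hypothetical mapping of single-host groups to fully qualified host names.
hosts = {
    "master_1": "m1.example.com",
    "master_2": "m2.example.com",
    "master_3": "m3.example.com",
}
print(expand_hostgroups("%HOSTGROUP::master_2%:8050", hosts))
# prints m2.example.com:8050
```

Reading the example this way shows that every "yarn.resourcemanager.*" address in the Blueprint resolves to the host in master_2, while the registry ZooKeeper quorum spans all three master hosts.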

...

Blueprint Example: HBase RegionServer HA Cluster

Summary


HBase provides a High Availability feature for reads across HBase RegionServers.

...