Goal

With the introduction of cluster detail discovery and topology generation in Apache Knox 0.14.0, the configuration for proxying HA-enabled Hadoop services can be made more dynamic and automatic.

Furthermore, it may even be possible for Knox to recognize the HA-enabled configuration for a service, and automatically configure itself to interact with that service in an HA manner.

 

Current HA Configuration

Currently, a topology service can proxy the corresponding HA-enabled cluster service either through URLs declared in the topology or through ZooKeeper; which mechanism applies depends on the cluster service.
For example, the WEBHDFS service is configured with URLs in the service declaration, while the HIVE service declares no URLs and relies on the ZooKeeper ensemble and namespace configured in its HaProvider param:

HA Service Configurations Topology Excerpt
...
  <provider>
    <role>ha</role>
    <name>HaProvider</name>
    <enabled>true</enabled>
    <param name="WEBHDFS" value="maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=true" />
    <param name="HIVE" value="maxFailoverAttempts=3;failoverSleep=1000;enabled=true;zookeeperEnsemble=c6801.ambari.apache.org:2181,c6802.ambari.apache.org:2181,c6803.ambari.apache.org:2181;zookeeperNamespace=hiveserver2" />
  </provider>
</gateway>

<service>
  <role>WEBHDFS</role>
  <url>http://c6801.ambari.apache.org:50070/webhdfs</url>
  <url>http://c6802.ambari.apache.org:50070/webhdfs</url>
</service>
 
<service>
  <role>HIVE</role>
</service>
...
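
The HaProvider param values in the excerpt above are semicolon-separated key=value pairs. As a minimal sketch of how such a value can be read (the function name parse_ha_param is hypothetical and this is not Knox's own parser), the string can be split into a dictionary:

```python
def parse_ha_param(value):
    """Split an HaProvider param value such as 'a=1;b=2' into a dict of strings."""
    settings = {}
    for pair in value.split(";"):
        if not pair.strip():
            continue  # tolerate an empty segment, e.g. from a trailing ';'
        key, sep, val = pair.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {pair!r}")
        settings[key.strip()] = val.strip()
    return settings


# The WEBHDFS value from the excerpt above.
webhdfs = parse_ha_param(
    "maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;"
    "retrySleep=1000;enabled=true"
)
print(webhdfs["maxFailoverAttempts"], webhdfs["enabled"])
```

The same split works for the HIVE value, because the zookeeperEnsemble host list uses commas and colons rather than semicolons or equals signs.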

 

Topology-Based HA

Knox already handles part of the topology-based HA configuration. At topology generation time, if there are multiple hosts for the associated service, Knox adds a URL for each of them to the <service/> element.
If there is an entry for that service in the HaProvider configuration, then the service will be proxied in an HA manner.
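
The two steps above can be sketched as follows. This is an illustration only, not Knox's implementation: build_service and is_ha_proxied are hypothetical names, the HaProvider settings are assumed to be available as a dictionary keyed by service role, and honoring the enabled flag is an assumption based on the excerpt earlier on this page.

```python
import xml.etree.ElementTree as ET


def build_service(role, urls):
    """Build a <service> element with one <url> child per discovered URL."""
    service = ET.Element("service")
    ET.SubElement(service, "role").text = role
    for url in urls:
        ET.SubElement(service, "url").text = url
    return service


def is_ha_proxied(role, ha_params):
    """A service is treated as HA only if the HaProvider has an entry for its role.

    Assumption: an entry with enabled=false is treated as not HA.
    """
    settings = ha_params.get(role)
    if settings is None:
        return False
    return settings.get("enabled", "true").lower() == "true"


discovered = [
    "http://c6801.ambari.apache.org:50070/webhdfs",
    "http://c6802.ambari.apache.org:50070/webhdfs",
]
ha_params = {"WEBHDFS": {"maxFailoverAttempts": "3", "enabled": "true"}}

service = build_service("WEBHDFS", discovered)
print(ET.tostring(service, encoding="unicode"))
print(is_ha_proxied("WEBHDFS", ha_params))
```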

 

ZooKeeper-Based HA

Using Ambari, Knox can determine the ZooKeeper configuration for each service dynamically, relieving the administrator of having to configure it explicitly in each topology.

The following are examples of service configurations that contain ZooKeeper-related properties:

Service Config | Property | Example Value
hive-site | hive.zookeeper.quorum | c6801.ambari.apache.org:2181,c6802.ambari.apache.org:2181,c6803.ambari.apache.org:2181
hive-site | hive.server2.zookeeper.namespace | hiveserver2
hive-site | hive.server2.support.dynamic.service.discovery | true
hdfs-site | ha.zookeeper.quorum | c6801.ambari.apache.org:2181,c6802.ambari.apache.org:2181,c6803.ambari.apache.org:2181
hdfs-site | dfs.ha.automatic-failover.enabled | true (only for "Auto HA")
hbase-site | hbase.zookeeper.quorum | c6801.ambari.apache.org:2181,c6802.ambari.apache.org:2181,c6803.ambari.apache.org:2181
hbase-site | zookeeper.znode.parent | /hbase-unsecure
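
To illustrate how discovered properties could feed the HaProvider configuration, here is a minimal sketch that derives the HIVE param value from hive-site properties. The function hive_ha_param is hypothetical, the failover defaults are copied from the topology excerpt earlier on this page, and requiring dynamic service discovery to be enabled is an assumption rather than documented Knox behavior.

```python
def hive_ha_param(hive_site, max_failover_attempts=3, failover_sleep=1000):
    """Return an HaProvider param value for HIVE, or None if details are missing."""
    discovery = hive_site.get("hive.server2.support.dynamic.service.discovery", "false")
    if discovery.lower() != "true":
        return None  # assumption: without dynamic discovery there is nothing to look up
    ensemble = hive_site.get("hive.zookeeper.quorum")
    namespace = hive_site.get("hive.server2.zookeeper.namespace")
    if not ensemble or not namespace:
        return None
    return ";".join([
        f"maxFailoverAttempts={max_failover_attempts}",
        f"failoverSleep={failover_sleep}",
        "enabled=true",
        f"zookeeperEnsemble={ensemble}",
        f"zookeeperNamespace={namespace}",
    ])


hive_site = {
    "hive.zookeeper.quorum": "c6801.ambari.apache.org:2181,"
                             "c6802.ambari.apache.org:2181,"
                             "c6803.ambari.apache.org:2181",
    "hive.server2.zookeeper.namespace": "hiveserver2",
    "hive.server2.support.dynamic.service.discovery": "true",
}
print(hive_ha_param(hive_site))
```

With the example values from the table, the result matches the HIVE param value shown in the topology excerpt.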

 

Open Items

  • Identify the HA mechanism used by each of the supported services
    • Which ones are purely topology-based?
    • Which ones are ZooKeeper-based?
  • For the ZooKeeper-based-HA services, determine if the ZooKeeper details are available from the service's configuration.
  • Determine how to leverage the cluster discovery data to generate the ZooKeeper HA configuration for the relevant declared topology services.
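
As a starting point for the second item, the following sketch reports which of the ZooKeeper-related properties listed earlier are present in a discovered service configuration. The lookup table and the function name available_zookeeper_details are hypothetical, and the property list covers only the examples on this page. Note that finding these properties does not by itself mean the service is proxied through ZooKeeper; WEBHDFS, for instance, is configured with URLs.

```python
# Property names taken from the example table on this page; not an exhaustive list.
ZOOKEEPER_PROPERTIES = {
    "hive-site": [
        "hive.zookeeper.quorum",
        "hive.server2.zookeeper.namespace",
        "hive.server2.support.dynamic.service.discovery",
    ],
    "hdfs-site": [
        "ha.zookeeper.quorum",
        "dfs.ha.automatic-failover.enabled",
    ],
    "hbase-site": [
        "hbase.zookeeper.quorum",
        "zookeeper.znode.parent",
    ],
}


def available_zookeeper_details(config_type, properties):
    """Return the known ZooKeeper-related properties that have a non-empty value."""
    names = ZOOKEEPER_PROPERTIES.get(config_type, [])
    return {name: properties[name] for name in names if properties.get(name)}


hbase_site = {
    "hbase.zookeeper.quorum": "c6801.ambari.apache.org:2181",
    "zookeeper.znode.parent": "/hbase-unsecure",
    "hbase.rootdir": "/apps/hbase/data",  # unrelated property, ignored
}
found = available_zookeeper_details("hbase-site", hbase_site)
print(sorted(found))
```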

 

 

 

 

 

 

 

 

 
