The Myriad high availability (HA) feature prevents job failures and downtime when a failure occurs. It also recovers automatically, restoring the system to a highly available state after the failure.

A Myriad HA environment allows the Node Managers to reconnect to the new Resource Manager instance upon failover.


On failover, the following occurs:

  • Marathon re-launches the Resource Manager as a new task.
  • Mesos-DNS updates the IP address for the Resource Manager Mesos task to the new IP address.

All clients that are connected to the Resource Manager continue to work, as long as they use the FQDN (for example, rmapp.marathon.mesos) rather than an IP address to connect to it.
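Because Mesos-DNS points the name at a new IP address after a failover, a client should resolve the FQDN each time it connects instead of caching the address. The following is a minimal sketch of that idea, not part of Myriad itself. The helper name resolve_resource_manager is hypothetical, and the default port 8088 (the usual YARN web UI port) is an assumption for illustration.

```python
import socket


def resolve_resource_manager(fqdn, port=8088):
    """Return the IP addresses that the FQDN currently resolves to.

    Call this right before each connection attempt so that a failover,
    after which Mesos-DNS points the name at a new IP, is picked up.
    """
    infos = socket.getaddrinfo(fqdn, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})
```

On a host that uses Mesos-DNS as its name server, calling resolve_resource_manager("rmapp.marathon.mesos") before and after a failover should return the old and the new address respectively, subject to the configured TTL.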

Prerequisites

  • Deploy mesos-master, mesos-slave (per node), zookeeper, marathon, and mesos-dns on your cluster.

Setting up Mesos-DNS

Mesos-DNS is available on the Mesosphere GitHub. For an online version of the Mesos-DNS documentation, see https://mesosphere.github.io/mesos-dns.

  1. Create a directory for Mesos-DNS. For example, /etc/mesos-dns.
  2. Install Mesos-DNS on one node in your cluster.
  3. Configure Mesos-DNS by providing the required parameters in the /etc/mesos-dns/config.json file. See the Mesos-DNS configuration documentation for more information. The following example parameters represent a minimum configuration.

    {
    	"zk": "zk://10.10.100.19:2181/mesos",
    	"refreshSeconds": 60,
    	"ttl": 60,
    	"domain": "mesos",
    	"port": 53,
    	"resolvers": ["10.10.1.10"],
    	"timeout": 5
    }
  4. On Linux, add the Mesos-DNS name server at the top of the /etc/resolv.conf file on all cluster nodes and on all clients (for example, clients running the RM UI, the Myriad UI, and so on).

    nameserver <Mesos-DNS IP address>

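The Mesos-DNS entry must be the first nameserver line in the /etc/resolv.conf file, because the resolver queries name servers in the order in which they are listed. If the entry is not at the top, names in the Mesos-DNS domain may not resolve correctly.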
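The setup steps above can be sanity-checked with a short script. The following is a rough sketch under stated assumptions: the function names are hypothetical, and the set of expected keys is taken only from the minimal config.json example on this page, not from the full Mesos-DNS configuration reference.

```python
import json

# Keys taken from the minimal config.json example above (an assumption
# about what "minimum" means, not an exhaustive Mesos-DNS schema).
EXPECTED_KEYS = {
    "zk", "refreshSeconds", "ttl", "domain", "port", "resolvers", "timeout",
}


def missing_config_keys(config_text):
    """Parse config.json text and return the expected keys that are absent.

    json.loads raises ValueError if the text is not valid JSON.
    """
    config = json.loads(config_text)
    return sorted(EXPECTED_KEYS - set(config))


def first_nameserver(resolv_conf_text):
    """Return the address on the first 'nameserver' line, or None."""
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            return fields[1]
    return None
```

For example, reading /etc/mesos-dns/config.json and /etc/resolv.conf into strings and passing them to these functions shows whether any key is missing and whether the Mesos-DNS address is the first name server listed.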

 

 

Configuring HA

Configuring Myriad for HA involves adding HA configuration properties to the $YARN_HOME/etc/hadoop/yarn-site.xml file.

To the yarn-site.xml file, add the following properties:

<!--  HA configuration properties -->

<property>
	<name>yarn.resourcemanager.store.class</name>
	<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.MyriadFileSystemRMStateStore</value>
</property>
<property>
	<name>yarn.resourcemanager.fs.state-store.uri</name>
	<!-- Path on HDFS, MapRFS, etc. -->
	<value>/var/mapr/cluster/yarn/rm/system</value>
</property>
<property>
	<name>yarn.resourcemanager.recovery.enabled</name>
	<value>true</value>
</property>
<!-- If using MapR distro
 <property>
	<name>yarn.resourcemanager.ha.custom-ha-enabled</name>
	<value>false</value>
 </property> -->
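As a quick check that the properties above made it into yarn-site.xml, a script along the following lines could be used. This is a sketch, not a Myriad tool: the function names are hypothetical, and it only checks the three uncommented properties shown on this page.

```python
import xml.etree.ElementTree as ET

# Property names and values taken from the HA configuration above.
REQUIRED = {
    "yarn.resourcemanager.store.class":
        "org.apache.hadoop.yarn.server.resourcemanager.recovery."
        "MyriadFileSystemRMStateStore",
    "yarn.resourcemanager.recovery.enabled": "true",
}
STATE_STORE_URI = "yarn.resourcemanager.fs.state-store.uri"


def read_properties(xml_text):
    """Map each <property> name to its value in a yarn-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def ha_problems(xml_text):
    """Return a list of problems found; an empty list means the checks pass."""
    props = read_properties(xml_text)
    problems = []
    for name, expected in REQUIRED.items():
        if props.get(name) != expected:
            problems.append(
                f"{name} should be {expected!r}, found {props.get(name)!r}"
            )
    if not props.get(STATE_STORE_URI):
        problems.append(f"{STATE_STORE_URI} is not set")
    return problems
```

The parser skips XML comments, so the commented-out MapR property is ignored. Passing the contents of $YARN_HOME/etc/hadoop/yarn-site.xml to ha_problems returns an empty list when all three properties are present with the values shown.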

 

Launching Resource Manager

Launch the Resource Manager using Marathon. When launching, specify the yarn.resourcemanager.hostname property. The hostname is derived from the ID specified when launching the Marathon application; for example, an application with the ID rmapp is reachable at rmapp.marathon.mesos.

env && export YARN_RESOURCEMANAGER_OPTS=-Dyarn.resourcemanager.hostname=rmapp.marathon.mesos && yarn resourcemanager

Some applications might require the yarn.resourcemanager.hostname property to be specified explicitly as a command line option.
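To show how the application ID, the FQDN, and the launch command fit together, the following sketch builds a Marathon application definition for the Resource Manager. The function name is hypothetical, and the cpus and mem values are placeholder assumptions that must be sized for a real cluster. Only the command itself comes from this page.

```python
import json


def resource_manager_app(app_id="rmapp", domain="mesos", cpus=1.0, mem=2048):
    """Build a Marathon app definition that launches the Resource Manager.

    cpus and mem are placeholder values (assumptions), not recommendations.
    """
    # Mesos-DNS serves Marathon tasks as <app id>.marathon.<domain>.
    hostname = f"{app_id}.marathon.{domain}"
    cmd = (
        "env && export YARN_RESOURCEMANAGER_OPTS="
        f"-Dyarn.resourcemanager.hostname={hostname} && yarn resourcemanager"
    )
    return {"id": app_id, "cmd": cmd, "cpus": cpus, "mem": mem, "instances": 1}


# Print the definition as JSON so it can be reviewed before submitting it.
print(json.dumps(resource_manager_app(), indent=2))
```

The resulting JSON could then be submitted to Marathon, for example with an HTTP POST to its /v2/apps endpoint or by pasting the values into the Marathon UI.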

 

 

 
