The Myriad Scheduler can be configured to automatically download and run the Hadoop YARN binaries and get the Hadoop configuration from the resource manager. This means you won't have to install and configure Hadoop YARN on each machine. This section describes how to bundle Myriad and create the tarballs.

...


Assumptions

The following are assumptions about your environment:

  • You are using hadoop-2.7.0 downloaded from hadoop.apache.org. Specific vendor versions should work but may require additional steps.
  • Hadoop is installed in the
Note

The default location for $YARN_HOME is /opt/hadoop-2.7.0.

Building the Myriad Remote Distribution Bundle

Before building Myriad, configure the Resource Manager as you normally would. Building Myriad involves:

  1. Running ./gradlew build.
  2. Copying the Myriad Scheduler jar files to your YARN classpath.
  3. Placing the Myriad Executor jar files in HDFS.
  4. Modifying the Myriad default configuration file (myriad-config-default.yml).
  5. Modifying the YARN XML file (yarn-site.xml).
  6. Creating the tarball.

 

 


Step 1: Build Myriad

...

From the project root, build Myriad with the following command:

Code Block
./gradlew build

 

Step 2: Copy the Myriad Scheduler Jar Files

Copy the jar files and configuration .yml file onto your YARN classpath:

Code Block
cp myriad-scheduler/build/libs/*.jar /opt/hadoop-2.7.1/share/hadoop/yarn/lib/
cp myriad-scheduler/src/main/resources/myriad-config-default.yml /opt/hadoop-2.7.1/share/hadoop/yarn/lib/

 

Step 3: Put the Myriad Executor Jar File

Put the myriad-executor-runnable-x.x.x.jar file in HDFS.

Code Block
hadoop fs -put myriad-executor/build/libs/myriad-executor-runnable-0.0.1.jar /dist

The gradlew build command builds the three deployable Myriad jars: myriad-commons, myriad-executor, and myriad-scheduler.
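Before copying anything, it can help to confirm that the build actually produced a jar for each of the three modules. The following is a minimal sketch, not part of Myriad: the function name check_myriad_jars is hypothetical, and it assumes each module writes its jars to <module>/build/libs (the document shows this layout for myriad-scheduler and myriad-executor; it is assumed for myriad-commons).

```shell
#!/bin/sh
# Check that each Myriad module produced at least one jar under build/libs.
# check_myriad_jars is a hypothetical helper; the build/libs layout for
# myriad-commons is an assumption based on the other two modules.
check_myriad_jars() {
    root="$1"
    missing=0
    for module in myriad-commons myriad-executor myriad-scheduler; do
        # ls exits non-zero when the glob matches no file.
        if ls "$root/$module"/build/libs/*.jar >/dev/null 2>&1; then
            echo "ok: $module"
        else
            echo "missing: $module"
            missing=1
        fi
    done
    return "$missing"
}
```

Run from the project root as `check_myriad_jars .`; a non-zero exit status means at least one module has no jar yet.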

Step 2: Configure the Myriad Defaults

Edit the $YARN_HOME/etc/hadoop/myriad-config-default.yml file to configure the default parameters. See the sample Myriad configuration file for more information. To enable remote binary distribution, you must set the following options:

 

Code Block
frameworkSuperUser: admin              # Must be root or have passwordless sudo on all nodes!
frameworkUser: hduser                  # Should be the same user running the resource manager.
                                       # Must exist on all nodes and be in the 'hadoop' group
executor:
  nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
  path: hdfs://namenode:port/dist/myriad-executor-runnable-0.0.1.jar
yarnEnvironment:
  YARN_HOME: hadoop-2.7.0                # This should be relative if nodeManagerUri is set
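A quick way to catch a forgotten option is a text-level check of the edited file. The sketch below is an illustration only: check_remote_dist_config is a hypothetical helper, it is not a YAML parser (it looks only at line prefixes), and it checks just the three settings named above: that nodeManagerUri and path are present and uncommented, and that YARN_HOME is a relative path.

```shell
#!/bin/sh
# Text-based sanity check for the remote-distribution settings in
# myriad-config-default.yml. This is not a YAML parser: it only inspects
# line prefixes. check_remote_dist_config is a hypothetical helper name.
check_remote_dist_config() {
    cfg="$1"
    status=0

    # nodeManagerUri and path must be present and not commented out.
    for key in nodeManagerUri path; do
        if ! grep -Eq "^[[:space:]]*${key}:[[:space:]]*[^[:space:]#]" "$cfg"; then
            echo "$key is not set"
            status=1
        fi
    done

    # Extract the YARN_HOME value, dropping any trailing comment.
    yarn_home=$(sed -n 's/^[[:space:]]*YARN_HOME:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$cfg")
    case "$yarn_home" in
        "") echo "YARN_HOME is not set"; status=1 ;;
        /*) echo "YARN_HOME should be a relative path"; status=1 ;;
    esac

    return "$status"
}
```

For example, `check_remote_dist_config "$YARN_HOME/etc/hadoop/myriad-config-default.yml"` prints one line per problem and returns a non-zero status if any were found.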

Step 3: Deploy the Myriad Jar and Configuration Files

Deploy the Myriad Scheduler and Executor files to the following locations:

Code Block
cp myriad-scheduler/build/libs/*.jar $YARN_HOME/share/hadoop/yarn/lib/
cp myriad-executor/build/libs/myriad-executor-0.1.0.jar $YARN_HOME/share/hadoop/yarn/lib/
cp myriad-scheduler/src/main/resources/myriad-config-default.yml $YARN_HOME/etc/hadoop/

Step 4: Deploy Dependent Jars and Ensure Version Compatibility

Important: the myriad-commons, myriad-executor, and myriad-scheduler jars are all non-shaded. Consequently, deploying the Myriad jars to $YARN_HOME/share/hadoop/yarn/lib takes two steps: (1) verify, and update as needed, the jars common to the host Hadoop distribution and Myriad, and (2) deploy the dependent jars unique to Myriad. Skipping either step prevents the ResourceManager from starting via the bin/yarn resourcemanager command.

Common Dependencies

As of Myriad 0.2.0, the following jars are common to the official Apache Hadoop distribution and Myriad. The versions of these jars must match the Myriad dependency versions, so replace the copies in $YARN_HOME/share/hadoop/yarn/lib wherever the versions differ.

Common Hadoop and Myriad Dependencies:

  • guice
  • guice-servlet
  • jackson-annotations
  • jackson-core
  • jackson-databind
  • jackson-dataformat-yaml
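To see where the versions differ, you can list the shared jars in the Hadoop lib directory and compare them with the versions Myriad was built against. The sketch below is one way to do that, under two assumptions that are not stated in this document: jars follow the usual <name>-<version>.jar naming, and you have collected Myriad's dependency jars in a directory of your own. Both function names are hypothetical.

```shell
#!/bin/sh
# Print the file names of the shared dependency jars found in a directory.
# Assumes the usual <name>-<version>.jar naming (an assumption, not a rule
# taken from the Myriad documentation).
list_common_jars() {
    lib_dir="$1"
    for dep in guice guice-servlet jackson-annotations jackson-core \
               jackson-databind jackson-dataformat-yaml; do
        # "<dep>-<digit>*" keeps "guice" from also matching "guice-servlet".
        for jar in "$lib_dir"/"$dep"-[0-9]*.jar; do
            if [ -e "$jar" ]; then
                basename "$jar"
            fi
        done
    done
}

# Compare the shared jars in two directories; succeed only if they are identical.
compare_common_jars() {
    hadoop_jars=$(list_common_jars "$1")
    myriad_jars=$(list_common_jars "$2")
    if [ "$hadoop_jars" = "$myriad_jars" ]; then
        echo "common jars match"
        return 0
    fi
    echo "common jars differ"
    echo "--- $1"
    echo "$hadoop_jars"
    echo "--- $2"
    echo "$myriad_jars"
    return 1
}
```

A call such as `compare_common_jars "$YARN_HOME/share/hadoop/yarn/lib" /path/to/myriad-deps` (the second path is a placeholder) prints both lists when they differ, which shows which jars need to be swapped.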

 

Myriad Dependencies

As of Myriad 0.2.0, the following Myriad-dependent jars must also be deployed to the $YARN_HOME/share/hadoop/yarn/lib directory to ensure the ResourceManager starts properly.

Unique Myriad Dependencies:

  • commons-lang3
  • disruptor
  • guice-multibindings
  • mesos (Java API)
  • metrics-core
  • metrics-healthchecks
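Once the jars are copied, a short check can report any of these dependencies that is still absent from the lib directory. This is a sketch under assumptions: missing_myriad_deps is a hypothetical helper, jars are assumed to be named <name>-<version>.jar, and the Mesos Java API jar is assumed to be named mesos-<version>.jar.

```shell
#!/bin/sh
# Report which of Myriad's own dependency jars are absent from a lib directory.
# Assumes <name>-<version>.jar naming, including mesos-<version>.jar for the
# Mesos Java API (both are assumptions). missing_myriad_deps is hypothetical.
missing_myriad_deps() {
    lib_dir="$1"
    missing=0
    for dep in commons-lang3 disruptor guice-multibindings mesos \
               metrics-core metrics-healthchecks; do
        found=0
        for jar in "$lib_dir"/"$dep"-[0-9]*.jar; do
            if [ -e "$jar" ]; then
                found=1
            fi
        done
        if [ "$found" -eq 0 ]; then
            echo "missing: $dep"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `missing_myriad_deps "$YARN_HOME/share/hadoop/yarn/lib"` prints one line per absent dependency and returns a non-zero status if any are missing.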

 

Step 5: Configure YARN to use Myriad

Modify the $YARN_HOME/etc/hadoop/yarn-site.xml file as instructed in Sample: yarn-site.xml file.

Step 6: Create and Deploy the Tarballs

The binary tarball has all of the files needed to launch the Node Managers and Resource Managers. The following demonstrates how to create the tarball and place it in HDFS:

Code Block
cd ~
sudo cp -rp /opt/hadoop-2.7.0 .
sudo rm ~/hadoop-2.7.0/etc/hadoop/*yarn-site.xml
sudo tar -zcpf ~/hadoop-2.7.0.tar.gz hadoop-2.7.0
hadoop fs -put ~/hadoop-2.7.0.tar.gz /dist
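The commands above delete yarn-site.xml before archiving. A quick check that the archive really omits it, before uploading, might look like the following sketch. The function names tarball_contains and check_binary_tarball are hypothetical; the only tools used are tar and awk.

```shell
#!/bin/sh
# Succeed if any entry in a gzipped tarball has the given base name.
# tarball_contains and check_binary_tarball are hypothetical helper names.
tarball_contains() {
    tar -tzf "$1" | awk -F/ -v name="$2" '$NF == name { found = 1 } END { exit !found }'
}

# The binary tarball is expected to ship without yarn-site.xml.
check_binary_tarball() {
    if [ ! -f "$1" ]; then
        echo "no such file: $1"
        return 2
    fi
    if tarball_contains "$1" yarn-site.xml; then
        echo "yarn-site.xml is still in $1"
        return 1
    fi
    echo "ok: $1 has no yarn-site.xml"
}
```

For example, run `check_binary_tarball ~/hadoop-2.7.0.tar.gz` before the `hadoop fs -put` step.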

The configuration tarball contains all of the Hadoop configuration files needed to launch the Node Managers, including the updated yarn-site.xml and the added myriad-config-default.yml file. The following demonstrates how to create the tarball and place it in HDFS:

Code Block

 

...

tar czf config.tgz etc/hadoop/*
hadoop fs -put $YARN_HOME/dist/config.tgz
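Because the configuration tarball must carry both yarn-site.xml and myriad-config-default.yml, it may be worth listing the archive before uploading it. The sketch below assumes the tarball was created from inside $YARN_HOME, so that entries are stored with the relative prefix etc/hadoop/; check_config_tarball is a hypothetical helper name.

```shell
#!/bin/sh
# Verify that a config tarball contains both files the Node Managers need.
# Assumes entries are stored relative to $YARN_HOME, for example
# etc/hadoop/yarn-site.xml. check_config_tarball is a hypothetical helper.
check_config_tarball() {
    tarball="$1"
    if [ ! -f "$tarball" ]; then
        echo "no such file: $tarball"
        return 2
    fi
    listing=$(tar -tzf "$tarball")
    status=0
    for f in etc/hadoop/yarn-site.xml etc/hadoop/myriad-config-default.yml; do
        # -F: fixed string, -x: the whole line must match.
        if ! printf '%s\n' "$listing" | grep -Fx "$f" >/dev/null; then
            echo "missing: $f"
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_config_tarball config.tgz` prints any missing file and returns a non-zero status if either one is absent.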
 

The Next Steps

You can now start the resource manager and attempt to flex up or flex down the cluster. See the Administration Getting Started section for information about managing a cluster using Myriad. See the Myriad Cluster API for more information about scaling.