The Myriad Scheduler can be configured to automatically download and run the Hadoop YARN binaries and get the
Hadoop configuration from the resource manager. This means you won't have to install and configure Hadoop YARN on each machine.
Note |
---|
This is a very new feature and the configuration options may change dramatically in the future. |
This procedure involves bundling Myriad and creating a tarball.
Assumptions
The following are assumptions about your environment:
- You are using hadoop-2.7.0 downloaded from hadoop.apache.org. Specific vendor versions should work but may require additional steps.
- Hadoop is installed in `
Note |
---|
The default location for $YARN_HOME is |
...
|
Building the Myriad Remote Distribution Bundle
Before building Myriad, configure the Resource Manager as you normally would. Building Myriad involves:
- Running `./gradlew build`.
- Copying the Myriad Scheduler jar files to your YARN classpath.
- Placing the Myriad Executor jar files in HDFS.
- Configuring the Myriad default configuration file.
- Configuring the YARN XML file.
- Creating the tarball.
Step 1: Build Myriad
...
From the project root, build Myriad with the following command:
Code Block |
---|
./gradlew build |
Step 2: Copy the Myriad Scheduler Jar Files
Copy the jar files and configuration .yml file onto your YARN classpath:
Code Block |
---|
cp myriad-scheduler/build/libs/*.jar /opt/hadoop-2.5.0/share/hadoop/yarn/lib/
cp myriad-scheduler/src/main/resources/myriad-config-default.yml /opt/hadoop-2.5.0/share/hadoop/yarn/lib/ |
Step 3: Put the Myriad Executor Jar File
...
The `./gradlew build` command builds the three deployable Myriad jars: myriad-commons, myriad-executor, and myriad-scheduler.
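Before moving on, it can help to confirm that the build produced a jar for each module. The following is a minimal sketch, not part of Myriad: the function name `check_myriad_jars` is hypothetical, and it assumes every module writes its jar to `<module>/build/libs/`, as the scheduler and executor paths used elsewhere on this page suggest.

```shell
#!/usr/bin/env bash
# Sketch: report which Myriad module jars are missing after `./gradlew build`.
# Assumption: each module's jar lands in <module>/build/libs/ (hypothetical
# for myriad-commons, which this page does not show explicitly).

check_myriad_jars() {
  local root="${1:-.}" module missing=0
  for module in myriad-commons myriad-executor myriad-scheduler; do
    # ls fails when the glob matches nothing, so it doubles as an existence test
    if ! ls "$root/$module"/build/libs/*.jar > /dev/null 2>&1; then
      echo "missing jar for $module"
      missing=1
    fi
  done
  return "$missing"
}
```

Run it from the project root as `check_myriad_jars .`; a non-zero exit status means at least one module has no jar.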
Step 2: Configure the Myriad Defaults
Edit the $YARN_HOME
Code Block |
---|
hadoop fs -put myriad-executor/build/libs/myriad-executor-runnable-0.0.1.jar /dist |
Step 4: Configure the Myriad Configuration File
Edit the /opt/hadoop/etc/hadoop/myriad-config-default.yml file to configure the default parameters. See the sample Myriad configuration file for more information. For a standard configuration, see [myriad-configuration]({{site.baseurl}}/docs/myriad-configuration-properties/myriad-configuration.md). To enable remote binary distribution, you must set the following options:
Code Block |
---|
frameworkSuperUser: admin # Must be root or have passwordless sudo on all nodes!
frameworkUser: hduser # Should be the same user running the resource manager.
                      # Must exist on all nodes and be in the 'hadoop' group
executor:
  nodeManagerUri: hdfs://namenode:port/dist/hadoop-2.7.0.tar.gz
  path: hdfs://namenode:port/dist/myriad-executor-runnable-0.0.1.jar
yarnEnvironment:
  YARN_HOME: hadoop-2.7.0 # This should be relative if nodeManagerUri is set |
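The comment on YARN_HOME above is easy to overlook, so a quick check of the edited file may be worthwhile. The sketch below is an illustration only: `yarn_home_is_relative` is a hypothetical helper, and it assumes the file keeps `nodeManagerUri:` and `YARN_HOME:` each on their own line, as in the options shown above. It uses plain `sed` and `grep` rather than a YAML parser, so it will not cope with unusual formatting.

```shell
#!/usr/bin/env bash
# Sketch: fail when nodeManagerUri is set but YARN_HOME is an absolute path.
# Assumption: one "key: value" pair per line; no real YAML parsing is done.

yarn_home_is_relative() {
  local config="$1" yarn_home
  # Capture the value after "YARN_HOME:", stopping at whitespace or "#".
  yarn_home=$(sed -n 's/^[[:space:]]*YARN_HOME:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$config" | head -n 1)
  # Only an uncommented nodeManagerUri line triggers the check.
  if grep '^[[:space:]]*nodeManagerUri:' "$config" > /dev/null; then
    case "$yarn_home" in
      /*)
        echo "YARN_HOME '$yarn_home' is absolute but nodeManagerUri is set"
        return 1
        ;;
    esac
  fi
  return 0
}
```

For example, `yarn_home_is_relative /opt/hadoop/etc/hadoop/myriad-config-default.yml` returns a non-zero status and prints a message if the two settings conflict.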
Step 5: Configure the YARN XML File
Step 3: Deploy the Myriad Jar and Configuration Files
Deploy the Myriad Scheduler and Executor files to the following locations:
Code Block |
---|
cp myriad-scheduler/build/libs/*.jar $YARN_HOME/share/hadoop/yarn/lib/
cp myriad-executor/build/libs/myriad-executor-0.1.0.jar $YARN_HOME/share/hadoop/yarn/lib/
cp myriad-scheduler/build/src/main/resources/myriad-config-default.yml $YARN_HOME/etc/hadoop/ |
Step 4: Deploy Dependent Jars and Ensure Version Compatibility
Important note: The myriad-commons, myriad-executor, and myriad-scheduler jars are all non-shaded. Consequently, deploying the Myriad jars to $YARN_HOME/share/hadoop/yarn/lib takes two steps: (1) verify and, as needed, update the jars common to the host Hadoop distribution and Myriad, and (2) deploy the dependent jars unique to Myriad. Skipping either step prevents the ResourceManager from starting via the bin/yarn resourcemanager command.
Common Dependencies
As of Myriad 0.2.0, the following jars are common to the official Apache Hadoop distribution and Myriad. The versions of these jars must match the Myriad dependency versions, so update the copies in $YARN_HOME/share/hadoop/yarn/lib as needed.
Info |
---|
- guice
- guice-servlet
- jackson-annotations
- jackson-core
- jackson-databind
- jackson-dataformat-yaml |
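One way to spot version mismatches among the common jars is to compare the version embedded in each jar file name. The following is a rough sketch under stated assumptions: jars are named `<artifact>-<version>.jar`, and the Myriad dependency jars have been collected into a directory of your choosing (this page does not say where they live, so the second argument is hypothetical). The helper names `jar_version` and `compare_common_jars` are invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch: compare versions of the jars shared by Hadoop and Myriad.
# Assumption: jar files follow the <artifact>-<version>.jar naming pattern.

# Print the version of <artifact> found in <dir>, or return 1 if none exists.
jar_version() {
  local artifact="$1" dir="$2" f base
  # The [0-9] keeps "guice" from also matching "guice-servlet-...".
  for f in "$dir/$artifact"-[0-9]*.jar; do
    [ -e "$f" ] || continue
    base=${f##*/}               # drop the directory part
    base=${base#"$artifact"-}   # drop the "<artifact>-" prefix
    echo "${base%.jar}"
    return 0
  done
  return 1
}

# Report every common artifact whose version differs between the two directories.
compare_common_jars() {
  local yarn_lib="$1" myriad_deps="$2" artifact have want status=0
  for artifact in guice guice-servlet jackson-annotations jackson-core \
                  jackson-databind jackson-dataformat-yaml; do
    have=$(jar_version "$artifact" "$yarn_lib") || have="absent"
    want=$(jar_version "$artifact" "$myriad_deps") || want="absent"
    if [ "$have" != "$want" ]; then
      echo "$artifact: YARN lib has $have, Myriad expects $want"
      status=1
    fi
  done
  return "$status"
}
```

A call such as `compare_common_jars "$YARN_HOME/share/hadoop/yarn/lib" /path/to/myriad-deps` prints one line per mismatch and exits non-zero if any are found.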
Myriad Dependencies
As of Myriad 0.2.0, the following Myriad-dependent jars must also be deployed to the $YARN_HOME/share/hadoop/yarn/lib directory to ensure the ResourceManager starts properly.
Info |
---|
- commons-lang3
- disruptor
- guice-multibindings
- mesos (Java API)
- metrics-core
- metrics-healthchecks |
Step 5: Configure YARN to use Myriad
Modify the $YARN_HOME/etc/hadoop/yarn-site.xml file as instructed in Sample: yarn-site.xml file (see Myriad Configuration Properties).
Step 6: Create and Deploy the
...
Tarballs
The binary tarball has all of the files needed to launch the Node Managers. The following demonstrates how to create the tarball and place it in HDFS:
Code Block |
---|
cd ~
sudo cp -rp /opt/hadoop-2.7.0 .
sudo rm ~/hadoop-2.7.0/etc/hadoop/*yarn-site.xml
sudo tar -zcpf ~/hadoop-2.7.0.tar.gz hadoop-2.7.0
hadoop fs -put ~/hadoop-2.7.0.tar.gz /dist |
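Because the commands above delete yarn-site.xml before archiving, it may be worth confirming that the file really is absent from the tarball before uploading it. The sketch below is one possible check, not an official Myriad tool: `tarball_lacks_yarn_site` is a hypothetical name, and the check assumes the archive keeps the `etc/hadoop/` layout used on this page.

```shell
#!/usr/bin/env bash
# Sketch: confirm a binary tarball does not ship etc/hadoop/yarn-site.xml.
# Assumption: the archive is gzip-compressed and keeps the etc/hadoop/ layout.

tarball_lacks_yarn_site() {
  local tarball="$1"
  # tar -tzf lists the archive members without extracting anything.
  if tar -tzf "$tarball" | grep '/etc/hadoop/yarn-site\.xml$' > /dev/null; then
    echo "yarn-site.xml is still present in $tarball"
    return 1
  fi
  return 0
}
```

Run it against the archive before the `hadoop fs -put` step; a non-zero status means the file was not removed.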
The configuration tarball contains all of the Hadoop configuration files needed to launch the Node Managers, including the updated yarn-site.xml and the added myriad-config-default.yml file. The following demonstrates how to create the tarball and place it in HDFS:
Code Block |
---|
tar czf config.tgz etc/hadoop/*
hadoop fs -put $YARN_HOME/dist/config.tgz |
The Next Steps
You can now start the resource manager and attempt to flexup or flexdown the cluster. See the Getting Started section for information about using Myriad. See the Myriad Cluster API for more information about scaling.