...

Code Block
./gradlew build

Or, to specify your exact Hadoop version:

Code Block
./gradlew -PhadoopVer=X.Y.Z build   # replace X.Y.Z with your Hadoop version

At this point, the jars are located in the following directories (relative to $PROJECT_HOME).

...

Before building Myriad Scheduler, modify the myriad-config-default.yml file with the appropriate configuration properties. The ./gradlew build command builds the myriad-x.x.x.jar file, downloads the runtime jars, and places them inside the ./build/libs/ directory (relative to the $PROJECT_HOME/myriad-scheduler directory).

To build Myriad Scheduler, from $PROJECT_HOME/myriad-scheduler, run:

...

Building Myriad Executor Only

The ./gradlew build command builds the myriad-executor-xxx.jar (as a self-contained executor jar file) and places it inside the $PROJECT_HOME/myriad-executor/build/libs/ directory.

To build Myriad Executor individually as a self-contained executor jar, from $PROJECT_HOME/myriad-executor, run:

Code Block
./gradlew build

 

Step 2: Deploy the Myriad Jar and Configuration Files

To deploy Myriad Scheduler and Executor, copy the Myriad Scheduler and Executor jar files and the Myriad configuration file, myriad-config-default.yml, into your YARN classpath:

  1. Copy the Myriad Scheduler jar files from the $PROJECT_HOME/myriad-scheduler/build/libs/ directory to the $YARN_HOME/share/hadoop/yarn/lib/ directory on all nodes in your cluster.
  2. Copy the Myriad Executor myriad-executor-xxx.jar file from the $PROJECT_HOME/myriad-executor/build/libs/ directory to the $YARN_HOME/share/hadoop/yarn/lib/ directory on each Mesos slave.

To deploy the Myriad configuration file:

  1. Copy the myriad-config-default.yml file from the $PROJECT_HOME/myriad-scheduler/build/src/main/resources/ directory to the $YARN_HOME/etc/hadoop directory.

For example:

Code Block
cp myriad-scheduler/build/libs/*.jar /opt/hadoop-2.7.0/share/hadoop/yarn/lib/
cp myriad-executor/build/libs/myriad-executor-0.1.0.jar /opt/hadoop-2.7.0/share/hadoop/yarn/lib/
cp myriad-scheduler/build/src/main/resources/myriad-config-default.yml /opt/hadoop-2.7.0/etc/hadoop/
Note

Advanced users can also copy myriad-executor-xxx.jar to any other directory on a slave filesystem, or to HDFS. In either case, update the executor's path property in the myriad-config-default.yml file and prefix the path with file:// or hdfs://, as appropriate.

...

Step 3: Configure the Myriad Defaults

Myriad configuration parameters are specified in the myriad-config-default.yml file. The source copy of this file is in the $PROJECT_HOME/myriad-scheduler/src/main directory and is embedded into the Myriad Scheduler jar at build time. At a minimum, the following Myriad configuration parameters must be set:

  • mesosMaster
  • zkServers
  • YARN_HOME
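
A quick way to confirm that the three parameters listed above appear in a configuration file is a presence check with grep. This is a minimal sketch: the function name check_myriad_config is hypothetical, and the check only looks for an uncommented key followed by a colon. It does not validate the values or the YAML structure.

```shell
# Hypothetical helper: report which required keys are missing from a Myriad config file.
# $1 = path to myriad-config-default.yml
check_myriad_config() {
    config_file=$1
    missing=0
    for key in mesosMaster zkServers YARN_HOME; do
        # Match the key at the start of a line, allowing leading indentation.
        # Commented-out lines (starting with #) do not match.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*:" "$config_file"; then
            echo "missing required parameter: $key" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_myriad_config "$YARN_HOME/etc/hadoop/myriad-config-default.yml"` returns 0 when all three keys are present and 1 otherwise.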

...

Note

Enabling Cgroups involves modifying the yarn-site.xml and myriad-config-default.yml files. If you plan to use Cgroups, you can set that property at this time. See Configuring Cgroups for more information.

Note

By copying the myriad-config-default.yml file to the $YARN_HOME/etc/hadoop directory, you can change the configuration without rebuilding Myriad. If you instead change the configuration parameters in the source tree, you must rebuild Myriad and redeploy the jar files, because that copy of the myriad-config-default.yml file is embedded into the Myriad Scheduler jar.

 

Step 4: Configure YARN to use Myriad

To run Myriad, the following YARN properties must be modified:

...

on

...

YARN Node Manager resource properties to the $YARN_HOME/etc/hadoop/yarn-site.xml file on all nodes.

...

Dynamic port assignment to the mapred-site.xml file on all nodes.

Add MESOS_NATIVE_JAVA_LIBRARY Environment Variable

On each node in the cluster:

  • Edit the $YARN_HOME/etc/hadoop/hadoop-env.sh file and add the following:

    Code Block
    export MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
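
Because this edit has to be repeated on each node, it helps if it can be run more than once without adding duplicate lines. The following is a minimal sketch under that assumption. The function name add_mesos_library is hypothetical, and the default library path is the one shown above.

```shell
# Hypothetical helper: add the MESOS_NATIVE_JAVA_LIBRARY export to hadoop-env.sh once.
# $1 = path to hadoop-env.sh
# $2 = path to libmesos.so (optional, defaults to the path used in the text above)
add_mesos_library() {
    env_file=$1
    lib_path=${2:-/usr/local/lib/libmesos.so}

    if [ ! -f "$env_file" ]; then
        echo "not found: $env_file" >&2
        return 1
    fi

    # Skip the append if an export for this variable is already present.
    if grep -q '^export MESOS_NATIVE_JAVA_LIBRARY=' "$env_file"; then
        return 0
    fi
    printf 'export MESOS_NATIVE_JAVA_LIBRARY=%s\n' "$lib_path" >> "$env_file"
}
```

For example: `add_mesos_library "$YARN_HOME/etc/hadoop/hadoop-env.sh"`. If libmesos.so is installed elsewhere on your nodes, pass its location as the second argument.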

 

Add YARN Node Manager Resource Properties

...

  • Edit the $YARN_HOME/etc/hadoop/yarn-site.xml file and add the following:

    Code Block
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>${nodemanager.resource.cpu-vcores}</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>${nodemanager.resource.memory-mb}</value>
    </property>
    <!--These options enable dynamic port assignment by mesos -->
    <property>
        <name>yarn.nodemanager.address</name>
        <value>${myriad.yarn.nodemanager.address}</value>
    </property>
    <property>
        <name>yarn.nodemanager.webapp.address</name>
        <value>${myriad.yarn.nodemanager.webapp.address}</value>
    </property>
    <property>
        <name>yarn.nodemanager.webapp.https.address</name>
        <value>${myriad.yarn.nodemanager.webapp.address}</value>
    </property>
    <property>
        <name>yarn.nodemanager.localizer.address</name>
        <value>${myriad.yarn.nodemanager.localizer.address}</value>
    </property>
    <!-- Configure Myriad Scheduler here -->
    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.myriad.scheduler.yarn.MyriadFairScheduler</value>
        <description>One can configure other schedulers as well from the following list: org.apache.myriad.scheduler.yarn.MyriadCapacityScheduler, org.apache.myriad.scheduler.yarn.MyriadFifoScheduler</description>
    </property>
    <!-- Disable PMem/VMem checks for Hadoop 2.7.2 -->
    <property>
        <name>yarn.nodemanager.pmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>

Enable Dynamic Port Assignment

...

  • Edit the $YARN_HOME/etc/hadoop/mapred-site.xml file and add the dynamic port assignment properties.
    1. On each node, change directory to $YARN_HOME/etc/hadoop.
    2. Copy mapred-site.xml.template to mapred-site.xml.
    3. Edit and add the following property to the mapred-site.xml file.

Code Block
<!-- Add the following to $YARN_HOME/etc/hadoop/mapred-site.xml: -->

<!--This option enables dynamic port assignment by mesos -->
<property>
    <name>mapreduce.shuffle.port</name>
    <value>${myriad.mapreduce.shuffle.port}</value>
</property>
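
The template copy in the steps above can be guarded so that an existing mapred-site.xml is never overwritten. This is a minimal sketch. The function name ensure_mapred_site is hypothetical, and the property itself still has to be added by hand.

```shell
# Hypothetical helper: create mapred-site.xml from the template if it does not exist yet.
# $1 = Hadoop configuration directory ($YARN_HOME/etc/hadoop)
ensure_mapred_site() {
    conf_dir=$1
    if [ -f "$conf_dir/mapred-site.xml" ]; then
        # Keep an existing file so that local edits are not overwritten.
        return 0
    fi
    cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
}
```

For example: `ensure_mapred_site "$YARN_HOME/etc/hadoop"`, followed by editing the resulting file to add the property shown above.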

Step 5: Create Hadoop Deployment Tar Files

Two tar files are uploaded to Mesos when the ResourceManager starts: (1) a configuration tar file containing all of the Hadoop configuration files in $HADOOP_CONF_DIR, and (2) a binary tar file containing all files of the $HADOOP_HOME directory.

To generate the configuration and binary tar files, do the following:

  • Navigate to the parent directory of $HADOOP_HOME (usually /usr/local or /opt) and then generate the binary.tgz file as follows:

Code Block
cd /opt
tar czf binary.tgz hadoop-2.6.0

 

  • Navigate to the parent directory of $HADOOP_CONF_DIR (typically $HADOOP_HOME/etc) and generate the config.tgz file as follows:

Code Block
tar czf config.tgz hadoop
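
Both archives can also be produced in one step without changing directories, using tar's -C option. This is a minimal sketch. The function name make_myriad_tarballs and its arguments are hypothetical, and it assumes the output directory lies outside $HADOOP_HOME so that the archive being written is not swept into binary.tgz.

```shell
# Hypothetical helper: build binary.tgz and config.tgz for Myriad.
# $1 = Hadoop install directory ($HADOOP_HOME)
# $2 = Hadoop configuration directory ($HADOOP_CONF_DIR)
# $3 = output directory for the two archives (assumed to be outside $HADOOP_HOME)
make_myriad_tarballs() {
    hadoop_home=$1
    hadoop_conf_dir=$2
    out_dir=$3

    mkdir -p "$out_dir" || return 1

    # -C switches to the parent directory first, so each archive holds a single
    # top-level directory (for example hadoop-2.6.0/ or hadoop/).
    tar czf "$out_dir/binary.tgz" \
        -C "$(dirname "$hadoop_home")" "$(basename "$hadoop_home")" || return 1
    tar czf "$out_dir/config.tgz" \
        -C "$(dirname "$hadoop_conf_dir")" "$(basename "$hadoop_conf_dir")" || return 1
}
```

For example: `make_myriad_tarballs /opt/hadoop-2.6.0 /opt/hadoop-2.6.0/etc/hadoop /tmp/myriad-dist`. Listing an archive with `tar tzf` is a quick way to confirm its top-level directory.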


Step 6: Configure Resource Manager for Deployment to Mesos

With the binary.tgz and config.tgz files in place, specify in the myriad-config-default.yml file the path to each file, relative to the directory from which the Resource Manager is launched. A common pattern is to create a dist directory within $HADOOP_HOME. The configuration would then be as follows:

Code Block
servedConfigPath: dist/config.tgz
servedBinaryPath: dist/binary.tgz
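
With that layout, the two archives have to end up in $HADOOP_HOME/dist so that the relative paths resolve when the Resource Manager is launched from $HADOOP_HOME. This is a minimal sketch of that staging step. The function name stage_myriad_tarballs and its arguments are hypothetical.

```shell
# Hypothetical helper: place binary.tgz and config.tgz in $HADOOP_HOME/dist.
# $1 = Hadoop install directory ($HADOOP_HOME)
# $2 = directory that currently holds the two archives
stage_myriad_tarballs() {
    hadoop_home=$1
    src_dir=$2

    mkdir -p "$hadoop_home/dist" || return 1
    cp "$src_dir/binary.tgz" "$src_dir/config.tgz" "$hadoop_home/dist/" || return 1
}
```

For example: `stage_myriad_tarballs "$HADOOP_HOME" /tmp/myriad-dist`.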

 

Starting the Resource Manager

Myriad Scheduler runs inside the Resource Manager as a plug-in. Start the Resource Manager from within the $HADOOP_HOME directory with either of the following two commands:

Code Block
./sbin/yarn-daemon.sh start resourcemanager
# OR
./bin/yarn resourcemanager

 

Running Myriad Executor and Node Managers

...