Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Hadoop was configured and deployed in pseudo-distributed mode

Code Block
titlehdfs-site.xml
collapsetrue
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

. The following assumes that you have already established a pseudo-distributed cluster and will make the following configuration changes before launching the new cluster.


Code Block
languagexml
titleyarn-site.xml
collapsetrue
<configuration>

<!-- Site specific YARN configuration properties -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

<property>
  <name>yarn.nodemanager.env-whitelist</name>
  <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8000</value>
  <description>Amount of physical memory, in MB, that can be allocated for containers.</description>
</property>

<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>500</value>
</property>

<property>
  <description>Indicate to clients whether Timeline service is enabled or not.
  If enabled, the TimelineClient library used by end-users will post entities
  and events to the Timeline server.</description>
  <name>yarn.timeline-service.enabled</name>
  <value>true</value>
</property>

<property>
  <description>The hostname of the Timeline service web application.</description>
  <name>yarn.timeline-service.hostname</name>
  <value>localhost</value>
</property>

<property>
  <description>Value must be the IP:PORT on which timeline server is running.</description>
  <name>yarn.timeline-service.webapp.address</name>
  <value>localhost:8188</value>
</property>

<property>
  <description>Enables cross-origin support (CORS) for web services where
  cross-origin web response headers are needed. For example, javascript making
  a web services request to the timeline server.</description>
  <name>yarn.timeline-service.http-cross-origin.enabled</name>
  <value>true</value>
</property>

<property>
  <description>Publish YARN information to Timeline Server</description>
  <name> yarn.resourcemanager.system-metrics-publisher.enabled</name>
  <value>true</value>
</property>
</configuration>


Code Block
languagexml
titlemapred-site.xml
collapsetrue
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn-tez</value>
    </property>
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>
</configuration>


Code Block
languagebash
titlehadoop-env.sh
collapsetrue
export JAVA_HOME=/path/to/JDK
export HADOOP_HOME=/path/to/hadoop
export TEZ_JARS=/path/to/tez/tez-dist/target/tez-0.10.1-SNAPSHOT
export TEZ_CONF_DIR=/path/to/tez/conf
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_CONF_DIR:$TEZ_JARS/*:$TEZ_JARS/lib/*


You can then start all Hadoop services as follows

Code Block
languagebash
titleStart Hadoop Services
collapsetrue

$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh
$HADOOP_HOME/bin/yarn --daemon start timelineserver

Configuring and Deploying Tez

...