...
Hadoop was configured and deployed in pseudo-distributed mode, using the following hdfs-site.xml:
Code Block |
---|
language | xml |
---|
title | hdfs-site.xml |
---|
collapse | true |
---|
|
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration> |
The following assumes that you have already set up a pseudo-distributed cluster and will apply the configuration changes below before launching the new cluster.
Code Block |
---|
language | xml |
---|
title | yarn-site.xml |
---|
collapse | true |
---|
|
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>8000</value>
<description>Amount of physical memory, in MB, that can be allocated for containers.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>500</value>
</property>
<property>
<description>Indicate to clients whether Timeline service is enabled or not.
If enabled, the TimelineClient library used by end-users will post entities
and events to the Timeline server.</description>
<name>yarn.timeline-service.enabled</name>
<value>true</value>
</property>
<property>
<description>The hostname of the Timeline service web application.</description>
<name>yarn.timeline-service.hostname</name>
<value>localhost</value>
</property>
<property>
<description>Value must be the IP:PORT on which timeline server is running.</description>
<name>yarn.timeline-service.webapp.address</name>
<value>localhost:8188</value>
</property>
<property>
<description>Enables cross-origin support (CORS) for web services where
cross-origin web response headers are needed. For example, javascript making
a web services request to the timeline server.</description>
<name>yarn.timeline-service.http-cross-origin.enabled</name>
<value>true</value>
</property>
<property>
<description>Publish YARN information to Timeline Server</description>
<name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
<value>true</value>
</property>
</configuration> |
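The two memory settings in yarn-site.xml above interact: `yarn.nodemanager.resource.memory-mb` caps the memory a node can hand out, and `yarn.scheduler.minimum-allocation-mb` sets the smallest container. The sketch below works through that arithmetic for the values used here. The function names are hypothetical helpers, and the rounding step assumes the scheduler rounds each request up to a multiple of the minimum allocation, which may not hold for every scheduler or configuration.

```shell
# Hypothetical helpers for reasoning about the yarn-site.xml memory settings.

# Upper bound on minimum-size containers one NodeManager can host.
# $1: yarn.nodemanager.resource.memory-mb
# $2: yarn.scheduler.minimum-allocation-mb
max_min_containers() {
  echo $(( $1 / $2 ))
}

# Round a memory request (MB) up to the next multiple of the minimum
# allocation. ASSUMPTION: the scheduler normalizes requests this way.
# $1: requested MB, $2: yarn.scheduler.minimum-allocation-mb
round_up_request() {
  echo $(( ($1 + $2 - 1) / $2 * $2 ))
}
```

With the values above, `max_min_containers 8000 500` prints `16`, and under the stated rounding assumption `round_up_request 1200 500` prints `1500`.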
Code Block |
---|
language | xml |
---|
title | mapred-site.xml |
---|
collapse | true |
---|
|
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn-tez</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration> |
Code Block |
---|
language | bash |
---|
title | hadoop-env.sh |
---|
collapse | true |
---|
|
export JAVA_HOME=/path/to/JDK
export HADOOP_HOME=/path/to/hadoop
export TEZ_JARS=/path/to/tez/tez-dist/target/tez-0.10.1-SNAPSHOT
export TEZ_CONF_DIR=/path/to/tez/conf
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$TEZ_CONF_DIR:$TEZ_JARS/*:$TEZ_JARS/lib/* |
You can then start the HDFS, YARN, and Timeline Server daemons as follows:
Code Block |
---|
language | bash |
---|
title | Start Hadoop Services |
---|
collapse | true |
---|
|
$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh
$HADOOP_HOME/bin/yarn --daemon start timelineserver |
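Once the three commands above return, it can be useful to confirm that the daemons actually came up. The following is a minimal sketch: `missing_daemons` is a hypothetical helper that reads `jps`-style output (`PID ClassName` per line) on standard input and prints any expected daemon that is absent. The list of expected names is an assumption for a default pseudo-distributed setup, where the Timeline Server typically appears as `ApplicationHistoryServer`; adjust it if your deployment differs.

```shell
# Hypothetical helper: print the expected Hadoop daemons that are missing
# from jps-style input ("PID ClassName" lines on stdin), one per line.
missing_daemons() {
  # ASSUMPTION: daemon names for a default pseudo-distributed cluster
  # with the Timeline Server started.
  expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager ApplicationHistoryServer"

  # Keep only the class-name column of the input.
  running="$(awk '{print $2}')"

  for name in $expected; do
    # -x matches whole lines, so NameNode does not match SecondaryNameNode.
    if ! printf '%s\n' "$running" | grep -qx "$name"; then
      echo "$name"
    fi
  done
}
```

Running `jps | missing_daemons` on the cluster host should print nothing when every expected daemon is up; any name it prints points to a service whose log is worth checking.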
Configuring and Deploying Tez
...