...
First, install Tez and confirm that you can run the examples. Then copy the Tez distribution tarball to HDFS:
Code Block |
---|
language | bash |
---|
title | copy tez-0.10.1-SNAPSHOT to HDFS |
---|
collapse | true |
---|
|
hadoop fs -copyFromLocal tez/tez-dist/target/tez-0.10.1-SNAPSHOT.tar.gz /apps/tez-0.10.1-SNAPSHOT/ |
You can then extend the configuration to the tez-site.xml below, which also puts the Nutch distribution on the Tez classpath.
Code Block |
---|
language | xml |
---|
title | tez-site.xml |
---|
collapse | true |
---|
|
<configuration>
<property>
<name>tez.lib.uris</name>
<value>${fs.defaultFS}/apps/tez-0.10.1-SNAPSHOT/tez-0.10.1-SNAPSHOT.tar.gz#tez,${fs.defaultFS}/apps/nutch/apache-nutch-1.18-SNAPSHOT-bin.tar.gz#nutch</value>
</property>
<property>
<name>tez.lib.uris.classpath</name>
<value>./tez/tez-0.10.1-SNAPSHOT/*:./tez/tez-0.10.1-SNAPSHOT/lib/*:./nutch/apache-nutch-1.18-SNAPSHOT/*:./nutch/apache-nutch-1.18-SNAPSHOT/conf/*:./nutch/apache-nutch-1.18-SNAPSHOT/lib/*:./nutch/apache-nutch-1.18-SNAPSHOT/plugins/*/*</value>
</property>
<property>
<name>tez.use.cluster.hadoop-libs</name>
<value>true</value>
</property>
<property>
<name>plugin.folders</name>
<value>nutch/apache-nutch-1.18-SNAPSHOT/plugins</value>
</property>
<property>
<description>Enable Tez to use the Timeline Server for History Logging</description>
<name>tez.history.logging.service.class</name>
<value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
<property>
<description>URL for where the Tez UI is hosted</description>
<name>tez.tez-ui.history-url.base</name>
<value>http://localhost:8080/tez-ui-0.10.1-SNAPSHOT</value>
</property>
<property>
<name>tez.runtime.convert.user-payload.to.history-text</name>
<value>true</value>
</property>
</configuration> |
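The long tez.lib.uris.classpath value above is easy to mistype when versions change. As a minimal sketch, the value can be derived from the two version strings. The helper name build_tez_classpath and the variables TEZ_VERSION and NUTCH_VERSION are hypothetical and not part of Tez or Nutch. The sketch assumes that the #tez and #nutch fragments in tez.lib.uris determine the ./tez and ./nutch directory names used in the classpath, as the configuration above suggests.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Tez or Nutch): rebuild the
# tez.lib.uris.classpath value from the Tez and Nutch version strings.

TEZ_VERSION="0.10.1-SNAPSHOT"    # assumed to match the tarball copied to HDFS
NUTCH_VERSION="1.18-SNAPSHOT"    # assumed to match the Nutch binary tarball

build_tez_classpath() {
  # $1 = Tez version, $2 = Nutch version
  # "./tez" and "./nutch" are assumed to mirror the "#tez" and "#nutch"
  # fragments given in tez.lib.uris.
  local tez_dir="./tez/tez-$1"
  local nutch_dir="./nutch/apache-nutch-$2"
  # The asterisks stay literal because they are inside double quotes.
  printf '%s' "${tez_dir}/*:${tez_dir}/lib/*:${nutch_dir}/*:${nutch_dir}/conf/*:${nutch_dir}/lib/*:${nutch_dir}/plugins/*/*"
}

# Print the value to paste into the <value> element of tez.lib.uris.classpath.
build_tez_classpath "$TEZ_VERSION" "$NUTCH_VERSION"
echo
```

Keeping the versions in one place means that a version bump only requires changing the two variables and regenerating the value, rather than editing six path entries by hand.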
Configuring and Deploying Nutch
Code Block |
---|
language | bash |
---|
title | Build Nutch and Copy to HDFS |
---|
collapse | true |
---|
|
ant clean tar-bin
hadoop fs -copyFromLocal dist/apache |
Evaluating Tez as a Replacement for MapReduce
...