You are viewing an old version of this page. View the current version.

Compare with Current View Page History

Version 1 Next »

Introduction

This tutorial covers running Nutch jobs on Apache Tez (instead of MapReduce).

This tutorial is a work in progress and the document should be considered in DRAFT status. The work to evaluate Tez as an appropriate execution engine for Nutch jobs was initiated in December, 2020.

Audience

This tutorial will appeal to Nutch administrators looking to improve runtime speed whilst maintaining MapReduce’s ability to scale to petabytes of data. Readers are encouraged to share their experienced using Nutch on Tez.

What is Apache Tez?

  • No labels