...

  1. Move to the pigtmp directory.
  2. Review Pig Script 1 and Pig Script 2.
  3. Execute the following command (using either script1-local.pig or script2-local.pig).
    Code Block
    $ java -cp $PIGDIR/pig.jar org.apache.pig.Main -x local script1-local.pig
    
  4. Review the result file (either script1-local-results.txt or script2-local-results.txt):
    Code Block
    $ ls -l script1-local-results.txt
    $ cat script1-local-results.txt
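
The local-mode steps above follow one naming pattern: running scriptN-local.pig produces scriptN-local-results.txt. The sketch below is a dry run that only prints the run and review commands for both scripts rather than executing them. The helper names results_file and local_cmd are hypothetical and are not part of the tutorial.

```shell
#!/bin/sh
# Hypothetical helper: derive the results file name from a script name,
# following the pattern used in this tutorial
# (script1-local.pig -> script1-local-results.txt).
results_file() {
    printf '%s\n' "${1%.pig}-results.txt"
}

# Hypothetical helper: print the local-mode command for one script.
# $PIGDIR is deliberately left unexpanded (single quotes) so the printed
# line matches the tutorial command and can be pasted into a shell
# where PIGDIR is set.
local_cmd() {
    printf 'java -cp $PIGDIR/pig.jar org.apache.pig.Main -x local %s\n' "$1"
}

# Dry run: show what would be executed and which file to review afterwards.
for s in script1-local.pig script2-local.pig; do
    echo "run:    $(local_cmd "$s")"
    echo "review: cat $(results_file "$s")"
done
```

Printing the commands instead of running them keeps the sketch usable on a machine without Java or Pig installed. To execute for real, paste a printed line into a shell started in the pigtmp directory.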
    

Pig Scripts: Hadoop Mode

To run the Pig scripts in Hadoop (mapreduce) mode, do the following:

...

Code Block
$ hadoop fs -ls script1-hadoop-results
$ hadoop fs -cat 'script1-hadoop-results/*' | less
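
The single quotes around 'script1-hadoop-results/*' in the command above matter: they stop the local shell from expanding the pattern, so it reaches hadoop fs unchanged and is matched against files in HDFS. The sketch below illustrates this with show_args, a hypothetical stand-in that only prints the arguments it receives. It does not call Hadoop.

```shell
#!/bin/sh
# Hypothetical stand-in for `hadoop fs -cat`: prints each argument it
# receives in brackets, one per line.
show_args() {
    printf '[%s]\n' "$@"
}

# Quoted: the pattern is passed through as one literal argument,
# regardless of which files exist in the local working directory.
show_args 'script1-hadoop-results/*'
```

Unquoted, the same pattern would be expanded by the local shell whenever a matching local path exists, and the command would then receive local file names instead of the pattern.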

Pig Tutorial File

The contents of the Pig tutorial file (pigtutorial.tar.gz) are described here.

File                  Description
pig.jar               Pig JAR file
tutorial.jar          User-defined functions (UDFs) and Java classes
script1-local.pig     Pig Script 1, Query Phrase Popularity (local mode)
script1-hadoop.pig    Pig Script 1, Query Phrase Popularity (Hadoop cluster)
script2-local.pig     Pig Script 2, Temporal Query Phrase Popularity (local mode)
script2-hadoop.pig    Pig Script 2, Temporal Query Phrase Popularity (Hadoop cluster)
excite-small.log      Log file, Excite search engine (local mode)
excite.log.bz2        Log file, Excite search engine (Hadoop cluster)
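
As a quick sanity check after unpacking, the file list above can be compared against the extracted directory. This is a minimal sketch. The helper name missing_files is hypothetical, and it assumes the archive was unpacked into the pigtmp directory mentioned in the local-mode steps.

```shell
#!/bin/sh
# File names taken from the list of tutorial file contents above.
TUTORIAL_FILES="pig.jar tutorial.jar script1-local.pig script1-hadoop.pig script2-local.pig script2-hadoop.pig excite-small.log excite.log.bz2"

# Hypothetical helper: print every expected file that is absent from
# the directory given as $1. Prints nothing when all files are present.
missing_files() {
    dir="$1"
    for f in $TUTORIAL_FILES; do
        [ -e "$dir/$f" ] || printf '%s\n' "$f"
    done
}

# Example: check pigtmp (assumed extraction directory) if it exists here.
if [ -d pigtmp ]; then
    missing_files pigtmp
fi
```

Empty output means every listed file was found. Any name printed is a file to look for before running the scripts.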

A better-documented version of script1-local.pig can be found at https://cwiki.apache.org/confluence/download/attachments/27822259/script1-local-with-added-documentation.pig . It includes comments showing samples from each intermediate relation.

The user-defined functions (UDFs) are described here.

...