Building

Build requirements:

  • Java 1.6
  • Maven 3 or higher

  1. Download Giraph from git://git.apache.org/giraph.git. One easy way to do this is:
    git clone git://git.apache.org/giraph.git
    
  2. In the base path, run ‘mvn compile’ to build the Giraph jar (it will be generated as giraph/target/giraph-{version}-jar-with-dependencies.jar). To build the jar and also run the unit tests, use ‘mvn package’ instead.
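
The two build steps above can be sketched as a small shell helper. This is only an illustration: the DRY_RUN variable and the run and build_giraph function names are assumptions made for this sketch and are not part of Giraph. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the two build steps. DRY_RUN and the helper names are illustrative.
# By default commands are only printed; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: echo the command, and execute it unless in dry-run mode.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# build_giraph [GOAL]: clone the repository and run the given Maven goal
# ('compile' by default, or 'package' to also run the unit tests).
build_giraph() {
    goal="${1:-compile}"
    run git clone git://git.apache.org/giraph.git &&
        run cd giraph &&
        run mvn "$goal"
}
```

Calling build_giraph package with DRY_RUN left at its default prints the clone, cd, and ‘mvn package’ commands without running them; setting DRY_RUN=0 first executes them in order and stops at the first failure.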

Running an example:

Hadoop requirements:

  • Hadoop 0.20.203 or higher (must contain the necessary security changes)

  1. Build the Giraph jar with dependencies as described above.
  2. In this example, we run the PageRank benchmark included with Giraph, org.apache.giraph.benchmark.PageRankBenchmark. For help on the options, run the following command, adjusting the path to point at your generated jar file:
$ hadoop jar giraph-0.1-jar-with-dependencies.jar org.apache.giraph.benchmark.PageRankBenchmark -h
usage: org.apache.giraph.benchmark.PageRankBenchmark [-e <arg>] [-h] [-s <arg>] [-v] [-V <arg>] [-w <arg>]
 -e,--edgesPerVertex <arg>      Edges per vertex
 -h,--help                      Help
 -s,--supersteps <arg>          Supersteps to execute before finishing
 -v,--verbose                   Verbose
 -V,--aggregateVertices <arg>   Aggregate vertices
 -w,--workers <arg>             Number of workers
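
Since the jar location differs between builds, it can help to keep it in one variable and assemble the command line from the options listed above. The following is a minimal sketch: GIRAPH_JAR and pagerank_cmd are hypothetical names chosen for this example, and only flags shown in the help output are used.

```shell
#!/bin/sh
# GIRAPH_JAR and pagerank_cmd are illustrative names, not part of Giraph.
# Point GIRAPH_JAR at the jar produced by your own build.
GIRAPH_JAR="${GIRAPH_JAR:-giraph-0.1-jar-with-dependencies.jar}"

# pagerank_cmd EDGES_PER_VERTEX SUPERSTEPS VERTICES WORKERS
# Prints the hadoop command line for the benchmark; it does not run it.
pagerank_cmd() {
    echo "hadoop jar $GIRAPH_JAR org.apache.giraph.benchmark.PageRankBenchmark -e $1 -s $2 -v -V $3 -w $4"
}
```

For instance, pagerank_cmd 1 3 5000000 30 prints a command for 1 edge per vertex, 3 supersteps, 5 million vertices, and 30 workers; the printed line can be reviewed and then pasted into the shell.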

Example PageRank benchmark run with 5 million vertices, 1 edge per vertex, 3 supersteps, and 30 workers:

$ hadoop jar giraph-0.1-jar-with-dependencies.jar org.apache.giraph.benchmark.PageRankBenchmark -e 1 -s 3 -v -V 5000000 -w 30

11/08/01 20:40:35 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 3635750 for user
11/08/01 20:40:35 INFO security.TokenCache: Got dt for user …
11/08/01 20:40:35 WARN bsp.BspOutputFormat: checkOutputSpecs: ImmutableOutputCommiter will not check anything
11/08/01 20:40:38 INFO mapred.JobClient: Running job: job_201107180643_176350
11/08/01 20:40:39 INFO mapred.JobClient:  map 0% reduce 0%
11/08/01 20:41:06 INFO mapred.JobClient:  map 100% reduce 0%
11/08/01 20:41:38 INFO mapred.JobClient: Job complete: job_201107180643_176350
11/08/01 20:41:38 INFO mapred.JobClient: Counters: 30
11/08/01 20:41:38 INFO mapred.JobClient:   Job Counters 
11/08/01 20:41:38 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=1306584
11/08/01 20:41:38 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
11/08/01 20:41:38 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
11/08/01 20:41:38 INFO mapred.JobClient:     Launched map tasks=31
11/08/01 20:41:38 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
11/08/01 20:41:38 INFO mapred.JobClient:   Giraph Timers
11/08/01 20:41:38 INFO mapred.JobClient:     Total (milliseconds)=38320
11/08/01 20:41:38 INFO mapred.JobClient:     Superstep 3 (milliseconds)=8607
11/08/01 20:41:38 INFO mapred.JobClient:     Setup (milliseconds)=2190
11/08/01 20:41:38 INFO mapred.JobClient:     Shutdown (milliseconds)=230
11/08/01 20:41:38 INFO mapred.JobClient:     Superstep 0 (milliseconds)=2320
11/08/01 20:41:38 INFO mapred.JobClient:     Superstep 4 (milliseconds)=5664
11/08/01 20:41:38 INFO mapred.JobClient:     Superstep 5 (milliseconds)=3181
11/08/01 20:41:38 INFO mapred.JobClient:     Superstep 2 (milliseconds)=6108
11/08/01 20:41:38 INFO mapred.JobClient:     Superstep 1 (milliseconds)=10016
11/08/01 20:41:38 INFO mapred.JobClient:   Giraph Stats
11/08/01 20:41:38 INFO mapred.JobClient:     Aggregate edges=5000000
11/08/01 20:41:38 INFO mapred.JobClient:     Superstep=6
11/08/01 20:41:38 INFO mapred.JobClient:     Current workers=30
11/08/01 20:41:38 INFO mapred.JobClient:     Sent messages=0
11/08/01 20:41:38 INFO mapred.JobClient:     Aggregate finished vertices=5000000
11/08/01 20:41:38 INFO mapred.JobClient:     Aggregate vertices=5000000
11/08/01 20:41:38 INFO mapred.JobClient:   File Output Format Counters 
11/08/01 20:41:38 INFO mapred.JobClient:     Bytes Written=0
11/08/01 20:41:38 INFO mapred.JobClient:   FileSystemCounters
11/08/01 20:41:38 INFO mapred.JobClient:     FILE_BYTES_READ=7470
11/08/01 20:41:38 INFO mapred.JobClient:     HDFS_BYTES_READ=1364
11/08/01 20:41:38 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=948218
11/08/01 20:41:38 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=830011427
11/08/01 20:41:38 INFO mapred.JobClient:   File Input Format Counters 
11/08/01 20:41:38 INFO mapred.JobClient:     Bytes Read=0
11/08/01 20:41:38 INFO mapred.JobClient:   Map-Reduce Framework
11/08/01 20:41:38 INFO mapred.JobClient:     Map input records=31
11/08/01 20:41:38 INFO mapred.JobClient:     Spilled Records=0
11/08/01 20:41:38 INFO mapred.JobClient:     Map output records=0
11/08/01 20:41:38 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1364
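
To compare runs, the per-superstep values under "Giraph Timers" can be pulled out of the job output. The sketch below assumes the console output was saved to a file (job.log is a hypothetical name) and that the counter lines look exactly like those in the log above; superstep_times and superstep_total are illustrative helper names.

```shell
#!/bin/sh
# superstep_times FILE: print "SUPERSTEP MILLISECONDS" for every
# "Superstep N (milliseconds)=T" counter line, sorted by superstep number.
superstep_times() {
    sed -n 's/.*Superstep \([0-9][0-9]*\) (milliseconds)=\([0-9][0-9]*\).*/\1 \2/p' "$1" | sort -n
}

# superstep_total FILE: sum of all per-superstep times, in milliseconds.
superstep_total() {
    superstep_times "$1" | awk '{ sum += $2 } END { print sum + 0 }'
}
```

Usage would be superstep_times job.log or superstep_total job.log. For the run shown above, the six per-superstep values add up to 35896 ms; setup (2190 ms) and shutdown (230 ms) account for nearly all of the remainder of the 38320 ms total.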