...

The Hadoop profiling documentation outlines the methodology for Java MapReduce jobs.
The following is a sample command line for use with Pig:
java -Dmapred.task.profile.maps=0-0 -Dmapred.task.profile.reduces=0-0 -Dmapred.task.profile=true -Dmapred.task.profile.params=<-agent......> -cp <pig.jar pathname>:<directory containing hadoop-site.xml> org.apache.pig.Main <pig script>

The <-agent......> placeholder stands for the profiler-specific agent option you would otherwise supply on the java command line. For hprof, for example, it is -agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s (Hadoop replaces %s with the name of the profile output file).
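
As a rough illustration of how the pieces of this command fit together, the sketch below assembles it from its four variable parts. The function name pig_profile_cmd, the variable names, and all paths are hypothetical and not part of Pig or Hadoop. The function only prints the command so it can be inspected before it is run.

```shell
#!/bin/sh
# Hypothetical helper (not part of Pig or Hadoop): prints the java command
# line that profiles the first map task and the first reduce task of a Pig job.
#   $1 = profiler agent option (the value of mapred.task.profile.params)
#   $2 = path to pig.jar
#   $3 = directory containing hadoop-site.xml
#   $4 = Pig script to run
# The output is for inspection only; arguments containing spaces are not quoted.
pig_profile_cmd() {
    agent_opt=$1
    pig_jar=$2
    conf_dir=$3
    script=$4
    # printf reuses the '%s ' format for each argument, so a literal %s
    # inside the agent option is printed unchanged.
    printf '%s ' \
        java \
        -Dmapred.task.profile=true \
        -Dmapred.task.profile.maps=0-0 \
        -Dmapred.task.profile.reduces=0-0 \
        "-Dmapred.task.profile.params=${agent_opt}" \
        -cp "${pig_jar}:${conf_dir}" \
        org.apache.pig.Main
    printf '%s\n' "$script"
}

# Example with the hprof agent; every path below is a placeholder.
HPROF_OPT='-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s'
pig_profile_cmd "$HPROF_OPT" /path/to/pig.jar /path/to/conf myscript.pig
```

Swapping in a different profiler only changes the first argument; the task ranges (0-0) and the classpath layout stay the same.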

...

  • Follow the same steps as for CPU profiling, but modify the command line so that it does NOT disable memory allocation tracing:
    java -Dmapred.task.profile.maps=0-0 -Dmapred.task.profile.reduces=0-0 -Dmapred.task.profile=true -Dmapred.task.profile.params=-agentpath:CLUSTER_BASEDIR/libyjpagent.so=dir=/tmp/yourkit_snapnshot,sampling,disablej2ee -cp <pig.jar pathname>:<directory containing hadoop-site.xml> org.apache.pig.Main <pig script>
GUI
  • The GUI for viewing the profile output is at BASEDIR/yjp-7.0.7/bin/yjp.sh
  • Most YourKit users are not interested in exploring org.apache classes, so they are filtered out of the display by default. To see them, click Settings | Filters and uncheck org.apache.
  • On Mac, the GUI does not work with Java 1.6. If Java 1.6 is your default, run export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.5/Home/ to use Java 1.5 instead.