Giraph implementation of Nutch LinkRank Algorithm
Author
Renato Marroquin Mogrovejo - renatoj.marroquin at gmail dot com
Project Aim
- Provide a new implementation of website ranking for Apache Nutch, and let users extend the ranking algorithms through Apache Giraph.
Project Objectives
- Fully integrate the LinkRank algorithm developed within the Apache Giraph community into Apache Nutch, because the latest version of Nutch lacks ranking algorithms [1].
- Reproduce the example in [3], using the PageRank implementation in Giraph.
- Study different approaches to creating variations of the open source PageRank [2] as possible future ranking algorithms for Nutch.
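For orientation, the PageRank computation these objectives refer to can be sketched as a plain power iteration. This is a minimal single-machine sketch in Python, not Giraph's or Nutch's code: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the even redistribution of rank from pages without out-links are all assumptions made for illustration.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping page -> list of linked pages.

    Hypothetical helper for illustration only; the parameters are assumed
    defaults, not values taken from Giraph or Nutch.
    """
    # Collect every page, including ones that only appear as link targets.
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform distribution.
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly over all pages.
        dangling = sum(ranks[p] for p in pages if not out_links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_ranks = {p: base for p in pages}

        # Each page passes an equal share of its damped rank to its targets.
        for page, targets in out_links.items():
            if targets:
                share = damping * ranks[page] / len(targets)
                for target in targets:
                    new_ranks[target] += share
        ranks = new_ranks
    return ranks


if __name__ == "__main__":
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    for page, rank in sorted(pagerank(graph).items()):
        print(page, round(rank, 4))
```

In a vertex-centric system such as Giraph, the inner loop would instead run per vertex, with the rank shares exchanged as messages between supersteps; the sketch above only shows the arithmetic.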
Project Scope
- Integrate Apache Giraph's PageRank implementation with Apache Nutch 2.x
- Write a standard API on top of Apache Giraph so that users and developers can create and use new ranking algorithms built with Giraph.
References
[1] https://wiki.apache.org/nutch/NewScoring
[2] https://ilpubs.stanford.edu:8090/422/1/1999-66.pdf
[3] http://wiki.apache.org/nutch/NewScoringIndexingExample