Command Line Options of Nutch version 0.8.x
See each entry for details of the command arguments and options.
command |
function |
One-step crawler for intranets. |
|
Read / dump crawldb. |
|
Read / dump linkdb. |
|
Inject new urls into the crawldb. |
|
Generate new segments to fetch. |
|
Convert a crawldb from the pre-0.9 format. |
|
Fetch a segment's pages. |
|
Parse contents in one segment. |
|
Read data in an existing segment. |
|
Update the crawldb from a segment. |
|
Create or update a linkdb from a segment or segments. |
|
Run the indexer on a segment's fetcher output. |
|
Merge several segment indexes. |
|
Merge several crawldbs together. Can be used for filtering out specific content. |
|
Merge several linkdbs together. Can be used for filtering out specific content. |
|
Merge several input segments into one or more output segments. Can be used for filtering out specific content. |
|
Delete duplicate documents in a set of segment indexes. |
|
Load a plugin and run the main() method of one of its classes. |
|
Run a search server. |
|
Other useful commands are also available
See each entry for details of the command arguments and options.
command |
function |
Command-line interface for running searches. |
|
Utility for testing url filters. |
|
Lists the most frequent terms in an index. |
|