...

Configuration list


No Format

GET /config


The response contains the names of the available configurations.

...

Configuration parameters


No Format

GET /config/{configuration name}

Examples:
GET /config/default
GET /config/custom-config
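
The two GET endpoints above could be called from a script roughly as follows. This is a minimal sketch, not part of the Nutch distribution: the helper name `config_request` is hypothetical, and the base URL is assumed to be the `localhost:8081` address used in this page's curl examples. The code only builds the request; nothing is sent until it is passed to `urllib.request.urlopen`.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: server address taken from the curl examples on this page.
BASE_URL = "http://localhost:8081"


def config_request(name=None, base_url=BASE_URL):
    """Build a GET request for the configuration list (name=None)
    or for the parameters of one named configuration."""
    if name is None:
        path = "/config"
    else:
        # Percent-encode the name so it is safe as a single path segment.
        path = "/config/" + quote(name, safe="")
    return Request(base_url + path, method="GET")


list_req = config_request()            # GET /config
default_req = config_request("default")  # GET /config/default
```

With a running Nutch server, `urllib.request.urlopen(default_req).read()` would return the raw response body.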

...

Creates a new Nutch configuration with the given parameters.

No Format

POST /config/create

Examples:
POST /config/create
   {
      "configId":"new-config",
      "params":{"anchorIndexingFilter.deduplicate":"false",... }
   }

# curl
curl -X POST -H "Content-Type: application/json" http://localhost:8081/config/create -d '{"configId":"new-config", "params":{"anchorIndexingFilter.deduplicate":"false"}}' 
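
The same request can be assembled in Python. The sketch below mirrors the curl call above (same URL, header, and JSON fields); the helper name `create_config_request` is hypothetical, and the request is only built here, not sent.

```python
import json
from urllib.request import Request

# Assumption: server address taken from the curl example above.
BASE_URL = "http://localhost:8081"


def create_config_request(config_id, params, base_url=BASE_URL):
    """Build the POST /config/create request with a JSON body
    holding the configuration id and its parameters."""
    body = json.dumps({"configId": config_id, "params": params}).encode("utf-8")
    return Request(
        base_url + "/config/create",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = create_config_request(
    "new-config",
    {"anchorIndexingFilter.deduplicate": "false"},
)
```

Passing `req` to `urllib.request.urlopen` would submit it to a running server. Note that the parameter values are sent as strings, as in the curl example.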

...

No Format
    job-id-43243

Seed List creation

The /seed/create endpoint creates a seed list and returns the temporary path of the file it creates. Pass this path as the url_dir parameter of the INJECT job.

No Format

POST /seed/create

# curl
curl -X POST -H 'Content-Type: application/json' -i http://localhost:8081/seed/create --data '{"name":"test", "seedUrls":["https://nutch.apache.org",....]}'

The response is the relative path of the seed file directory. Note that this path is relative to the directory from which the Nutch server was started.

No Format

seedFiles/seed-1641959745623
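
As a rough illustration of how the returned path feeds into the INJECT job, the sketch below builds the seed request and then places the response text into a job body. Only the `name` and `seedUrls` fields, the `url_dir` parameter, and the `INJECT` job type come from this page; the helper names and the remaining job fields (`confId`, `crawlId`, `args`) and their values are assumptions and should be checked against the job section of this API.

```python
import json
from urllib.request import Request

# Assumption: server address taken from the curl example above.
BASE_URL = "http://localhost:8081"


def seed_create_request(name, seed_urls, base_url=BASE_URL):
    """Build the POST /seed/create request for a named seed list."""
    body = json.dumps({"name": name, "seedUrls": list(seed_urls)}).encode("utf-8")
    return Request(
        base_url + "/seed/create",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def inject_job_body(seed_path, conf_id="default", crawl_id="crawl01"):
    """Build a job body that passes the seed path as url_dir.
    Field names other than url_dir and the INJECT type are assumed."""
    return {
        "type": "INJECT",
        "confId": conf_id,      # assumed field name and value
        "crawlId": crawl_id,    # assumed field name and value
        "args": {"url_dir": seed_path.strip()},
    }


seed_req = seed_create_request("test", ["https://nutch.apache.org"])
job = inject_job_body("seedFiles/seed-1641959745623")
```

Because the returned path is relative to the directory where the server was started, the job has to be submitted to that same server instance for `url_dir` to resolve.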

Database

This endpoint provides access to information stored in the CrawlDb.

...