...
- Modularization – We've modularized tika-server:
  - tika-server-core includes all of the functionality of tika-server, but with no bundled parsers. Users might want this if they are only parsing a few file formats or want to use only their custom parsers.
  - tika-server-standard is what most people will want to use. As with the tika-parsers-standard module, this includes most of the common file format parsers. If needed, users may also add the tika-parser-scientific-package and tika-parser-sqlite3-package to the class path. In 1.x, the first was included in tika-server 1.x by default, and the second was included only if users added xerial's sqlite3 jar on the classpath.
- --spawnChild mode is now default. In Tika 1.x, users had to specify this on the commandline to force tika-server to fork a process that did the actual parsing. This option is far more robust against timeouts, OOMs, crashes and other mishaps; the forking process monitors the forked process and will restart it on timeouts, etc. NOTE: Client code needs to be able to handle the times when tika-server is restarting and is not available; this typically only takes a few seconds. To disable this mode, use --noFork on the commandline.
- Configuring tika-server in Tika 2.x. See below. We've moved most configuration options into tika-config.xml and dramatically limited the commandline options.
- The namespace has changed slightly for TikaServerCli to org.apache.tika.server.core.TikaServerCli. If adding optional jars to the class path in, say, a bin/ directory, start tika-server with:
  java -cp "bin/*" org.apache.tika.server.core.TikaServerCli -c tika-config.xml
- enableFileUrl -- We have removed this capability from tika-server in 2.x. We have replaced it with the FileSystemFetcher, which is available in tika-core. See FetchersInClassicServerEndpoints.
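The NOTE above says client code has to cope with the few seconds during which tika-server is restarting. One way to do that is a small retry wrapper around each request. The following is only a sketch: RetryingCaller, callWithRetry, maxAttempts and pause are hypothetical names that are not part of Tika, and it retries on any exception, whereas real client code would probably retry only on connection failures.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical client-side helper for illustration; not part of Tika. */
final class RetryingCaller {

    private RetryingCaller() {}

    /**
     * Runs the request and, if it throws, waits for the given pause and tries
     * again, up to maxAttempts attempts in total. Rethrows the last failure.
     */
    static <T> T callWithRetry(Callable<T> request, int maxAttempts, Duration pause)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.call();
            } catch (InterruptedException e) {
                // Do not retry when the calling thread was interrupted.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                // For example, a java.net.ConnectException while the forked
                // process is being restarted (assumption: any failure is retried).
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pause.toMillis());
                }
            }
        }
        throw last;
    }
}
```

A caller would wrap its own HTTP request in the Callable, for example callWithRetry(() -> sendToTika(file), 5, Duration.ofSeconds(2)), where sendToTika stands in for whatever client call the application already makes. A fixed pause is the simplest choice here, since the restart is described as taking only a few seconds.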
...
As with other components, in Tika 2.x, we moved configuration into tika-config.xml. We have left only a few commandline options available (to see the options: java -jar tika-server-standard-2.x.x.jar --help). Please note that all command-line option values will override their counterparts in the xml configuration file.
- -h, --host – hostname
- -p, --port – which port to bind to. Can specify ranges, e.g. 9990-9999, and Tika will launch 10 servers in forked processes, one on each of those ports. Can also specify a comma-delimited list, e.g. 9996,9998,9999.
- -?, --help
- -c, --config – specify the tika-config.xml file to use for this tika-server and its forked processes.
- -i, --id – specify the id for this server. This is used in logging and in the /status endpoint.
- --noFork – run tika-server in legacy mode without forking a process.
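To make the --port semantics concrete, here is a minimal sketch of how a range or a comma-delimited list expands into individual ports. It is not Tika's own option parser: PortSpec and expand are hypothetical names, and accepting a mix of ranges and single ports in one value is an assumption of this sketch rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not Tika's own option parser. */
final class PortSpec {

    private PortSpec() {}

    /** Expands a value such as "9990-9999" or "9996,9998,9999" into ports. */
    static List<Integer> expand(String spec) {
        List<Integer> ports = new ArrayList<>();
        for (String token : spec.split(",")) {
            String t = token.trim();
            int dash = t.indexOf('-');
            if (dash < 0) {
                // A single port number.
                ports.add(Integer.parseInt(t));
            } else {
                // An inclusive range written as first-last.
                int first = Integer.parseInt(t.substring(0, dash).trim());
                int last = Integer.parseInt(t.substring(dash + 1).trim());
                if (first > last) {
                    throw new IllegalArgumentException("bad range: " + t);
                }
                for (int p = first; p <= last; p++) {
                    ports.add(p);
                }
            }
        }
        return ports;
    }
}
```

Because the range is inclusive at both ends, PortSpec.expand("9990-9999") yields ten ports, which matches the ten servers mentioned for that example.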
...