...
Also, please be polite. This feature was added as a convenience. Please consider using a robust crawler (instead of our simple TikaInputStream.get(new URL(fileUrl))) that allows better configuration of redirects, timeouts, cookies, etc. A robust crawler will also respect robots.txt!
Transfer-Layer Compression
As of Tika 1.24.1, users can turn on gzip compression for either files on their way to tika-server or the output from tika-server.
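If you want to gzip your files before sending them to tika-server, add the Content-Encoding: gzip header to the request (the file being sent must already be gzip-compressed):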
    curl -T test_my_doc.pdf -H "Content-Encoding: gzip" http://localhost:9998/rmeta
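For clients that do not shell out to curl, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not part of Tika itself: the helper name build_gzip_put is hypothetical, the endpoint is assumed to be the one from the curl example, and the explicit application/octet-stream content type is an assumption made so that urllib does not add its form-encoded default.

```python
import gzip
import urllib.request

# Assumed endpoint, copied from the curl example; adjust for your server.
TIKA_RMETA_URL = "http://localhost:9998/rmeta"


def build_gzip_put(raw_bytes: bytes, url: str = TIKA_RMETA_URL) -> urllib.request.Request:
    """Build a PUT request whose body is the gzip-compressed document.

    Hypothetical helper: it compresses the payload in memory and declares
    that compression with the Content-Encoding header.
    """
    compressed = gzip.compress(raw_bytes)
    return urllib.request.Request(
        url,
        data=compressed,
        method="PUT",  # curl -T also issues a PUT
        headers={
            "Content-Encoding": "gzip",
            # Assumption: a neutral type, so urllib does not default to
            # application/x-www-form-urlencoded.
            "Content-Type": "application/octet-stream",
        },
    )
```

To send the request, read the file as bytes, pass them to build_gzip_put, and hand the result to urllib.request.urlopen. Nothing is sent over the network until urlopen is called.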
If you want tika-server to compress the output of the parse, add the Accept-Encoding: gzip header to the request:
    curl -T test_my_doc.pdf -H "Accept-Encoding: gzip" http://localhost:9998/rmeta
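A client that asks for compressed output also has to decompress it. The following Python sketch, using only the standard library, shows one way to do that. The helper names are hypothetical, the endpoint is assumed to be the one from the curl example, and the response is assumed to be UTF-8 text (as with the JSON returned by /rmeta). urllib does not decompress responses on its own, so the sketch checks the Content-Encoding response header before decoding.

```python
import gzip
import urllib.request
from typing import Optional

# Assumed endpoint, copied from the curl example; adjust for your server.
TIKA_RMETA_URL = "http://localhost:9998/rmeta"


def build_request_for_gzip_output(raw_bytes: bytes,
                                  url: str = TIKA_RMETA_URL) -> urllib.request.Request:
    """Build a PUT request that asks the server to gzip its response."""
    return urllib.request.Request(
        url,
        data=raw_bytes,
        method="PUT",  # curl -T also issues a PUT
        headers={
            "Accept-Encoding": "gzip",
            # Assumption: a neutral type, so urllib does not default to
            # application/x-www-form-urlencoded.
            "Content-Type": "application/octet-stream",
        },
    )


def decode_body(body: bytes, content_encoding: Optional[str]) -> str:
    """Decompress the body only if the server says it is gzip-encoded."""
    if content_encoding and content_encoding.strip().lower() == "gzip":
        body = gzip.decompress(body)
    # Assumption: the payload is UTF-8 text.
    return body.decode("utf-8")
```

After calling urllib.request.urlopen on the request, pass the bytes from resp.read() and the value of resp.headers.get("Content-Encoding") to decode_body. Checking the header rather than assuming compression keeps the client working if the server replies uncompressed.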
Making Tika Server Robust to OOMs, Infinite Loops and Memory Leaks
...