...
List all the available parsers, along with what mimetypes they support
Specifying a URL Instead of Putting Bytes in Tika 2.x
In Tika 2.x, use a FileSystemFetcher, a UrlFetcher or an HttpFetcher. See: tika-pipes (FetchersInClassicServerEndpoints)
We have entirely removed the -enableFileUrl capability that we had in 1.x.
Specifying a URL Instead of Putting Bytes in Tika 1.x
In Tika 1.10, we removed this capability because it posed a security vulnerability (CVE-2015-3271). Anyone with access to the service had the server's access rights; someone could request local files via file:/// or pages from an intranet that they might not otherwise have access to.
...
Also, please be polite. This feature was added as a convenience. Please consider using a robust crawler (instead of our simple TikaInputStream.get(new URL(fileUrl))) that will allow for better configuration of redirects, timeouts, cookies, etc.; and a robust crawler will respect robots.txt!
NOTE: In Tika 2.x, this capability has been replaced by a FileSystemFetcher. See: tika-pipes (FetchersInClassicServerEndpoints)
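As a rough illustration of fetching a document yourself and then putting the bytes to the server, here is a minimal sketch using the JDK's java.net.http client (Java 11+). It is not a crawler and does not check robots.txt; it only shows explicit timeout and redirect settings. The class name FetchThenParse, the helper method names, the User-Agent string and the timeout values are all hypothetical, and the endpoint http://localhost:9998/tika is an assumption about where your server is listening.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper: fetch the bytes ourselves, then PUT them to tika-server.
public class FetchThenParse {

    // Client with an explicit connect timeout and redirect policy.
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
    }

    // GET request for the remote document, with a per-request timeout.
    static HttpRequest fetchRequest(URI source) {
        return HttpRequest.newBuilder(source)
                .timeout(Duration.ofSeconds(30))
                .header("User-Agent", "example-fetcher/0.1") // illustrative value
                .GET()
                .build();
    }

    // PUT request carrying the already-fetched bytes to the server.
    static HttpRequest tikaPutRequest(URI tikaEndpoint, byte[] bytes) {
        return HttpRequest.newBuilder(tikaEndpoint)
                .timeout(Duration.ofSeconds(60))
                .header("Accept", "text/plain")
                .PUT(HttpRequest.BodyPublishers.ofByteArray(bytes))
                .build();
    }

    public static void main(String[] args) throws Exception {
        URI source = URI.create(args[0]);
        // Assumed server location; pass a second argument to override it.
        URI tika = URI.create(args.length > 1 ? args[1] : "http://localhost:9998/tika");

        HttpClient client = newClient();
        HttpResponse<byte[]> fetched =
                client.send(fetchRequest(source), HttpResponse.BodyHandlers.ofByteArray());
        if (fetched.statusCode() != 200) {
            throw new IllegalStateException("Fetch failed with status " + fetched.statusCode());
        }

        HttpResponse<String> parsed =
                client.send(tikaPutRequest(tika, fetched.body()), HttpResponse.BodyHandlers.ofString());
        System.out.println(parsed.body());
    }
}
```

Run it with the document URL as the first argument. A real crawler would add robots.txt handling, cookies, retries and rate limiting on top of this.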
Transfer-Layer Compression
...