1 If the parse exception occurs early in the parse, before streaming starts (as with an EncryptedDocumentException), you'll get an HTTP status 422 from both /tika (text) and /tika (html). With the /tika (text) option, if the parse exception happens after content has started streaming, the stream simply stops and you'll have no indication that there was a parse exception. With the /tika (html) option, you'll see truncated HTML in that case.
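
As a rough illustration of the behaviour described in note 1, a client could classify a response along the following lines. This is only a sketch: the function name `classify_tika_response` and its return labels are hypothetical, and the check for a closing `</html>` tag assumes that a complete HTML response ends with that tag, which you should verify against your own Tika version.

```python
def classify_tika_response(status: int, body: str, expect_html: bool) -> str:
    """Hypothetical helper: guess whether a /tika response is complete.

    status      -- HTTP status code returned by the server
    body        -- response body decoded as text
    expect_html -- True if the HTML output was requested, False for text
    """
    # A 422 means the parse exception happened before streaming started.
    if status == 422:
        return "parse_exception_before_streaming"
    if status != 200:
        return "other_error"
    if expect_html:
        # Assumption: a complete HTML document ends with </html>.
        # If the tag is missing, the stream probably stopped mid-parse.
        if body.rstrip().lower().endswith("</html>"):
            return "ok"
        return "possibly_truncated"
    # Plain text carries no end marker, so a mid-stream parse
    # exception cannot be detected from the body alone.
    return "ok_unverifiable"
```

With text output there is no equivalent end marker to test for, which is why the sketch can only report such a response as unverifiable.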

2 Tika tries to stream both while parsing the input and while writing the output. For some file formats, however, the parsers currently load the full document into memory and only then write the content. This row therefore covers only whether Tika streams the writing of the content, not whether it streams the read/parse of the file.