  1. Limit the total number of bytes written to a metadata object (prevents denial of service from files with large amounts of metadata)
  2. Limit the fields written to a metadata object (reduces the bytes held in memory during the parse, and the bytes sent over the wire or written to a file after the parse)

To configure the StandardWriteFilter, set the parameters of its factory inside the <autoDetectParserConfig> element of the tika-config.xml file:


StandardWriteFilter (XML):
<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <autoDetectParserConfig>
    <metadataWriteFilterFactory class="org.apache.tika.metadata.writefilter.StandardWriteFilterFactory">
      <params>
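        <!-- maximum size of a metadata key -->
        <maxKeySize>999</maxKeySize>
        <!-- maximum size of a single field -->
        <maxFieldSize>10001</maxFieldSize>
        <!-- cap on the estimated total bytes written to the metadata object -->
        <maxTotalEstimatedBytes>100000</maxTotalEstimatedBytes>
        <!-- maximum number of values kept for a multi-valued field -->
        <maxValuesPerField>100</maxValuesPerField>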
        <includeFields>
          <field>dc:creator</field>
          <field>dc:title</field>
        </includeFields>
      </params>
    </metadataWriteFilterFactory>
  </autoDetectParserConfig>
</properties>
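
The configuration above combines both goals. As a minimal sketch of the first goal alone (limiting total bytes without restricting which fields are written), the variant below drops the <includeFields> element. It assumes that omitting <includeFields> means no field allowlist is applied and that unspecified parameters fall back to the factory's defaults. Both assumptions should be checked against the Tika version in use. The limit shown is illustrative, not a recommended value.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <autoDetectParserConfig>
    <metadataWriteFilterFactory class="org.apache.tika.metadata.writefilter.StandardWriteFilterFactory">
      <params>
        <!-- illustrative value: cap the estimated total bytes written to one metadata object -->
        <maxTotalEstimatedBytes>100000</maxTotalEstimatedBytes>
        <!-- no includeFields element: assumed to leave every field eligible to be written -->
      </params>
    </metadataWriteFilterFactory>
  </autoDetectParserConfig>
</properties>
```

Adding an <includeFields> list back, as in the full example, would then address the second goal by restricting output to the named fields.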