...
To configure the StandardWriteFilter, set the properties on its factory inside the <autoDetectParserConfig> element of the tika-config.xml file:
    <?xml version="1.0" encoding="UTF-8"?>
    <properties>
      <autoDetectParserConfig>
        <metadataWriteFilterFactory class="org.apache.tika.metadata.writefilter.StandardWriteFilterFactory">
          <params>
            <!-- all measurements are in UTF-16 bytes. If any values are truncated,
                 TikaCoreProperties.TRUNCATED_METADATA is set to true in the metadata object -->
            <!-- the maximum size for a metadata key. Keys will be truncated to this length if > this value -->
            <maxKeySize>1000</maxKeySize>
            <!-- max total size for a field in UTF-16 bytes. If a field has multiple values,
                 their lengths are summed to calculate the field size. -->
            <maxFieldSize>10000</maxFieldSize>
            <!-- max total estimated bytes: the sum of the key sizes and the value sizes -->
            <maxTotalEstimatedBytes>100000</maxTotalEstimatedBytes>
            <!-- limit the count of values for multi-valued fields -->
            <maxValuesPerField>100</maxValuesPerField>
            <!-- include only these fields. NOTE, however, that there are several fields that are
                 important to the parse process, and these fields are always allowed in addition
                 (see ALWAYS_SET_FIELDS and ALWAYS_ADD_FIELDS in the StandardWriteFilter) -->
            <includeFields>
              <field>dc:creator</field>
              <field>dc:title</field>
            </includeFields>
          </params>
        </metadataWriteFilterFactory>
      </autoDetectParserConfig>
    </properties>
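The size limits above are expressed in UTF-16 bytes, and a multi-valued field is measured by summing its values. The following is a minimal sketch of that arithmetic only, under the assumption that each Java `char` counts as two bytes. `FieldSizeSketch` and its methods are hypothetical helpers written for illustration; they are not part of Tika and may not match how the StandardWriteFilter estimates sizes internally.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical illustration of the size accounting; not Tika code. */
public class FieldSizeSketch {

    /** Size of one string in UTF-16 bytes (big-endian encoding, so no byte-order mark is added). */
    public static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    /** Field size as described in the config comments: the sum over all of the field's values. */
    public static int fieldSize(List<String> values) {
        int total = 0;
        for (String value : values) {
            total += utf16Bytes(value);
        }
        return total;
    }

    public static void main(String[] args) {
        // Sample values for a multi-valued dc:creator field (made up for this example).
        List<String> creators = List.of("Ada Lovelace", "Alan Turing");
        int maxFieldSize = 10000; // same value as <maxFieldSize> in the config above

        int size = fieldSize(creators); // 12 chars + 11 chars = 23 chars = 46 bytes
        System.out.println("dc:creator uses " + size + " bytes; within limit: "
                + (size <= maxFieldSize));
    }
}
```

With these sample values the field occupies 46 bytes, far below the configured `maxFieldSize` of 10000, so under this reading nothing would be truncated.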
...