...
- Tika Website
- Download latest Tika Release
- Tika mailing lists: Sign-up
- Tika Support
- TikaResources - Articles, books, podcasts, etc. on using Tika
- Troubleshooting Tika
- 3rd party parser plugins
- DeveloperResources
- Using Tika Server - How to deploy Tika as a RESTful Service.
- API Bindings for Tika
- Logging
...
- UsingGit - Information on Tika's configuration management using Git.
- Release Process - Info on releasing Tika
- ThirdPartySonaType - A guide to staging and deploying third party jars on Sonatype OSSRH (OSS Repository Hosting) for subsequent use within Tika parser wrappers
- VirtualMachine - a virtual machine hosted by Rackspace that allows an instance of Tika Server to run for public testing. Set up by Tim Allison et al.
User Notes
- Using Tika Server - How to deploy Tika as a RESTful Service.
- ModifyingContentWithHandlersAndMetadataFilters - How to configure limits and modify parsed content with the AutoDetectParserConfig, custom ContentHandlers, metadata filters and metadata write filters.
- API Bindings for Tika - Using Tika from additional languages and frameworks.
- PostingManyFilesToExtractingRequestHandler - How to post many files to the Extracting Request Handler (Tika) in Solr.
- IntegratingTikaWithExtractingRequestHandler - Building the latest Tika and integrating it with the Extracting Request Handler (Tika) in Solr.
- Some stats using Tesseract OCR - Statistics from a contributing team (Hyperion Gray) on using Tesseract OCR (will be updated with Tika).
- Troubleshooting Tika
- Notes on configuring parsing via the ParseContext
- Notes on Specific Parsers
- Notes on configuring Tika to extract embedded vba and js
- Using the tika-eval Module
- When does Tika need/create a File rather than an InputStream?
- How to Test Your Framework's Handling of Tika Behaving Badly
...
- Statistical Machine Translation with Apache Joshua (Incubating) - A guide for leveraging Apache Joshua for language translation via the Tika.translate API.
- Neural Machine Translation powered by Reader Translator Generator toolkit - A guide for integrating RTG with the Tika.translate API.
Design
- MetadataDiscussion - discussions on the design of MIME type detection and parsing for recursive metadata formats and container formats such as Zip.
- RecursiveMetadata - proposals for dealing with recursive metadata, based on the MetadataDiscussion page.
- Tika JAX-RS Server - documentation on the recently contributed tika-server module.
- Metadata roadmap - Documentation and Discussion about the metadata roadmap for Tika
- Errors and Exceptions - What parsers should output or throw, and when, for empty, invalid, or unsupported files
- Composite Parsers discussion - How to give users sensible and clear control of multiple parsers for a given file type
- Tika 2.0 discussion - Roadmap for changes we would like to make for Tika 2.0
- Tika 2.0 Migration Guide - Guide for migrating to Tika 2.0 (once it is available)
Meetings and Tutorials
Regression Testing On the Rackspace VM
...