1) Hands-on tika-eval module workshop, Part 1

November 9, 2021, Tuesday 11am EST/4pm UTC

The dial-in information is available to those who register via Meetup.

This workshop is designed for hands-on tech folks who can run Tika from the command line or use curl against a local tika-server.

Stay tuned for prerequisites, resources and an agenda!

The following is all a work in progress.  Please check back right before the workshop!

Prerequisites:
  1. Java >= 8
  2. tika-eval app and tika-app jars: https://dlcdn.apache.org/tika/2.1.0/tika-eval-app-2.1.0.jar and https://dlcdn.apache.org/tika/2.1.0/tika-app-2.1.0.jar
  3. JSON editor/viewer (jq should be sufficient; I like Sublime with the PrettyJSON plugin: https://github.com/dzhibas/SublimePrettyJson)
  4. XLSX viewer (Excel or Open/LibreOffice)
Optional materials:
  1. tika-server-standard jar: https://dlcdn.apache.org/tika/2.1.0/tika-server-standard-2.1.0.jar
  2. tika-eval-core jar: https://repo1.maven.org/maven2/org/apache/tika/tika-eval-core/2.1.0/tika-eval-core-2.1.0.jar
  3. Some knowledge of SQL
Example docs, extracts and config files: tika-eval-workshop-20211109.tgz
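
If you are not sure whether your Java meets the first prerequisite, something like the following Python sketch can help. It assumes the usual layout of the `java -version` output (a quoted version string such as "1.8.0_301" or "11.0.12"); the function names are made up for this example and are not part of Tika.

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Return the major Java version found in `java -version` style output.

    Assumption: the output contains a quoted version string, either in the
    legacy scheme ("1.8.0_301" means Java 8) or the newer one ("11.0.12").
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("could not find a Java version string")
    first = int(match.group(1))
    second = match.group(2)
    # Legacy scheme: the real major version is the second number.
    if first == 1 and second is not None:
        return int(second)
    return first


def installed_java_major() -> int:
    """Run `java -version` and parse its output (java prints it to stderr)."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr)
```

Calling `installed_java_major() >= 8` should then tell you whether you are ready for the workshop, provided `java` is on your PATH.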

Before the class, you should extract tika-eval-workshop-20211109.tgz (tar -xzvf tika-eval-workshop-20211109.tgz) and run tika-app on the docs directory: java -jar tika-app-2.1.0.jar -J -t -i docs -o extracts/my_extracts
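
To sanity-check the extracts before the workshop, a rough Python sketch like the one below may be useful. It assumes that each file written by `-J -t` is a JSON array of metadata objects (the container file first, then any embedded files) and that the extracted text sits under the "X-TIKA:content" key. The function names and the summary fields are invented for this example.

```python
import json
from pathlib import Path

# Assumed key under which tika-app stores the extracted text.
CONTENT_KEY = "X-TIKA:content"


def summarize_extract(path):
    """Summarize one extract file: attachment count and total text length."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    # Assumption: first record is the container, the rest are embedded files.
    num_attachments = max(len(records) - 1, 0)
    content_chars = sum(len(r.get(CONTENT_KEY) or "") for r in records)
    return {
        "file": Path(path).name,
        "num_attachments": num_attachments,
        "content_chars": content_chars,
    }


def summarize_dir(extracts_dir):
    """Summarize every .json extract found under extracts_dir."""
    paths = sorted(Path(extracts_dir).rglob("*.json"))
    return [summarize_extract(p) for p in paths]


if __name__ == "__main__":
    for row in summarize_dir("extracts/my_extracts"):
        print(row)
```

Files that report zero characters of content are good candidates to look at more closely in your JSON viewer.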