...

  1. Java >= 8
  2. curl (or Postman or a similar HTTP client)
  3. Create a working directory, e.g. tika-pipes-tutorial
  4. In tika-pipes-tutorial/app-bin/:
    1. https://dlcdn.apache.org/tika/2.1.0/tika-app-2.1.0.jar
    2. https://repo1.maven.org/maven2/org/apache/tika/tika-emitter-fs/2.1.0/tika-emitter-fs-2.1.0.jar
    3. https://repo1.maven.org/maven2/org/apache/tika/tika-emitter-solr/2.1.0/tika-emitter-solr-2.1.0.jar OR https://repo1.maven.org/maven2/org/apache/tika/tika-emitter-opensearch/2.1.0/tika-emitter-opensearch-2.1.0.jar
    4. tika-core-2.1.1-SNAPSHOT-test-jar-with-dependencies.jar
  5. In tika-pipes-tutorial/server-bin/:
    1. tika-server-standard jar: https://dlcdn.apache.org/tika/2.1.0/tika-server-standard-2.1.0.jar
    2. https://repo1.maven.org/maven2/org/apache/tika/tika-emitter-fs/2.1.0/tika-emitter-fs-2.1.0.jar
    3. https://repo1.maven.org/maven2/org/apache/tika/tika-emitter-solr/2.1.0/tika-emitter-solr-2.1.0.jar OR https://repo1.maven.org/maven2/org/apache/tika/tika-emitter-opensearch/2.1.0/tika-emitter-opensearch-2.1.0.jar
    4. tika-core-2.1.1-SNAPSHOT-test-jar-with-dependencies.jar
  6. Unzip configs.zip (to be supplied later today) into tika-pipes-tutorial/configs
  7. Install Apache Solr (~8.9.x) and/or OpenSearch (~1.x) and/or Elasticsearch (7.x)
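The directory layout and downloads in items 3 to 5 could be scripted roughly as follows. This is a minimal sketch, not an official setup script: the EMITTER and DRY_RUN variables are assumptions introduced here for illustration, and the URLs are the ones listed above (older releases may since have moved off dlcdn.apache.org). The tika-core test jar has no download URL in the list, so it still has to be copied into both directories by hand.

```shell
#!/bin/sh
# Sketch: create the tutorial layout and fetch the released jars listed above.
set -eu

VERSION="2.1.0"
ROOT="tika-pipes-tutorial"
EMITTER="${EMITTER:-solr}"   # assumption: set to "opensearch" for the other emitter
DRY_RUN="${DRY_RUN:-1}"      # assumption: defaults to printing; set DRY_RUN=0 to download

MAVEN="https://repo1.maven.org/maven2/org/apache/tika"
DIST="https://dlcdn.apache.org/tika/${VERSION}"

mkdir -p "${ROOT}/app-bin" "${ROOT}/server-bin" "${ROOT}/configs"

# fetch URL DIR: download URL into DIR, or only print what would happen.
fetch() {
  if [ "${DRY_RUN}" = "1" ]; then
    echo "would download $1 -> $2/"
  else
    (cd "$2" && curl -fSLO "$1")
  fi
}

# Both directories need the file-system emitter plus one search emitter.
for dir in app-bin server-bin; do
  fetch "${MAVEN}/tika-emitter-fs/${VERSION}/tika-emitter-fs-${VERSION}.jar" "${ROOT}/${dir}"
  fetch "${MAVEN}/tika-emitter-${EMITTER}/${VERSION}/tika-emitter-${EMITTER}-${VERSION}.jar" "${ROOT}/${dir}"
done

fetch "${DIST}/tika-app-${VERSION}.jar" "${ROOT}/app-bin"
fetch "${DIST}/tika-server-standard-${VERSION}.jar" "${ROOT}/server-bin"
```

Running it once with the default DRY_RUN=1 lets you check the URLs before anything is downloaded.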

...

  1. Use a fetcher with the traditional /tika and /rmeta endpoints
    1. Start the server: java -cp "server-bin/*" org.apache.tika.server.core.TikaServerCli -c tika-config-basic.xml
    2. Send a request: curl -X PUT http://localhost:9998/rmeta -H "fetcherName:fsf" -H "fetchKey:testPDF.pdf" | jq --sort-keys
  2. Use the /pipes handler to read from and write to a local file share
  3. Configure a metadata handler and rerun step 2.
  4. Use the /async handler to go from file share to file share
  5. Configure the Solr/OpenSearch/Elasticsearch emitter and run the /pipes handler
  6. Run the async processor via tika-app
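For step 1, the request can be wrapped in two small shell helpers so that it is not sent before the server has finished starting. This is only a sketch: the function names wait_for_tika and rmeta_fetch and the TIKA_URL variable are hypothetical, and it assumes the server answers on its /version endpoint once it is up. The fetcher name fsf and the key testPDF.pdf come from the command in step 1.

```shell
# Base URL of the running tika-server (assumption: default port from step 1).
TIKA_URL="${TIKA_URL:-http://localhost:9998}"

# wait_for_tika [TRIES]: poll once per second until the server answers
# or the retry budget (default 30) is used up.
wait_for_tika() {
  tries="${1:-30}"
  while [ "${tries}" -gt 0 ]; do
    if curl -fs -o /dev/null "${TIKA_URL}/version"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  echo "tika-server did not respond at ${TIKA_URL}" >&2
  return 1
}

# rmeta_fetch FETCHER KEY: ask /rmeta to fetch and parse KEY via FETCHER.
rmeta_fetch() {
  curl -fsS -X PUT "${TIKA_URL}/rmeta" \
    -H "fetcherName:$1" \
    -H "fetchKey:$2"
}

# Usage, once the server from step 1 has been started:
#   wait_for_tika && rmeta_fetch fsf testPDF.pdf | jq --sort-keys
```

The helpers only define functions; nothing is sent until you call them as shown in the usage comment.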

...