...

  1. Use a fetcher with the traditional /tika and /rmeta endpoints
    1. update the <basePath> element in configs/tika-config-basic.xml to the full path to tika-pipes-tutorial-20221202/docs:

      Code Block (xml): FileSystemFetcher
        <fetcher class="org.apache.tika.pipes.fetcher.fs.FileSystemFetcher">
          <params>
            <name>fsf</name>
            <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20221202/docs</basePath>
          </params>
        </fetcher>


    2. start the server: java -cp "server-bin/*" org.apache.tika.server.core.TikaServerCli -c configs/tika-config-basic.xml
    3. curl -X PUT http://localhost:9998/rmeta -H "fetcherName:fsf" -H "fetchKey:testPDF.pdf" | jq --sort-keys

  2. Use the /pipes handler to read from and write to a local file share
    1. update the emitter's <basePath> element in configs/tika-config-basic.xml to the full path to tika-pipes-tutorial-20221202/extracts:

      Code Block (xml): FileSystemEmitter
        <emitters>
          <emitter class="org.apache.tika.pipes.emitter.fs.FileSystemEmitter">
            <params>
              <name>fse</name>
              <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20221202/extracts</basePath>
            </params>
          </emitter>
        </emitters>


    2. start the server: java -cp "server-bin/*" org.apache.tika.server.core.TikaServerCli -c configs/tika-config-basic.xml
    3. command line: TBD
  3. Configure a metadata handler and rerun step 2.
  4. Use the /async handler to go from file share to file share
  5. Configure a Solr/OpenSearch/Elasticsearch emitter and run the /pipes handler
  6. Run the async processor via tika-app
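For orientation, the fetcher and emitter fragments from steps 1 and 2 might sit together in configs/tika-config-basic.xml roughly as sketched below. This is a minimal sketch under stated assumptions, not the tutorial's actual file: the <properties> root, the <fetchers> wrapper (mirroring the <emitters> wrapper shown in step 2) and the <server> block with enableUnsecureFeatures are assumptions about what the rest of the file contains, and the /path/to prefix is a hypothetical placeholder for your own absolute path.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the element layout outside the <fetcher> and <emitter>
     blocks is an assumption, not copied from the tutorial's config. -->
<properties>
  <!-- Assumption: fetchers are grouped the same way emitters are. -->
  <fetchers>
    <fetcher class="org.apache.tika.pipes.fetcher.fs.FileSystemFetcher">
      <params>
        <!-- Name sent in the fetcherName header in step 1.3 -->
        <name>fsf</name>
        <!-- Hypothetical placeholder: replace /path/to with your own path.
             A fetchKey such as testPDF.pdf is resolved against this directory. -->
        <basePath>/path/to/tika-pipes-tutorial-20221202/docs</basePath>
      </params>
    </fetcher>
  </fetchers>
  <emitters>
    <emitter class="org.apache.tika.pipes.emitter.fs.FileSystemEmitter">
      <params>
        <name>fse</name>
        <!-- Hypothetical placeholder: directory where extracts are written. -->
        <basePath>/path/to/tika-pipes-tutorial-20221202/extracts</basePath>
      </params>
    </emitter>
  </emitters>
  <!-- Assumption: the server may need this switch before it accepts
       fetcher headers on /rmeta. Check the tutorial's config to confirm. -->
  <server>
    <params>
      <enableUnsecureFeatures>true</enableUnsecureFeatures>
    </params>
  </server>
</properties>
```

Keeping docs and extracts as separate base paths means the emitter never writes into the directory the fetcher reads from, which makes it easier to rerun the later steps against a clean input directory.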

...