...

  1. Start OpenSearch via Docker:
    1. docker pull opensearchproject/opensearch:1.2.4
    2. docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:1.2.4
  2. Curl the schema to OpenSearch:

    curl -k -T configs/opensearch/opensearch-parent-child-mappings.json -u admin:admin -H "Content-Type:application/json" https://localhost:9200/tika-test-parent-child

  3. Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/opensearch/tika-config-fs-to-opensearch-parent-child.xml

  4. java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/opensearch/tika-config-fs-to-opensearch-parent-child.xml
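Once the run in step 4 finishes, it can help to confirm that documents actually reached the index. The following is a minimal sketch, assuming the Docker container from step 1 is still listening on https://localhost:9200 with the same admin:admin credentials used in the curl command above. The function names are hypothetical helpers, not part of Tika or OpenSearch; `_count` is the standard count endpoint.

```shell
#!/usr/bin/env bash
# Sketch: count the documents that tika-pipes emitted into an OpenSearch index.
# Assumption: OpenSearch runs locally with the demo admin:admin credentials.

# Build the _count URL for an index (hypothetical helper name).
opensearch_count_url() {
  local index="$1"
  local host="${2:-https://localhost:9200}"
  printf '%s/%s/_count\n' "$host" "$index"
}

# Query the index; -k skips verification of the self-signed demo certificate.
opensearch_count_docs() {
  curl -sk -u admin:admin "$(opensearch_count_url "$1")"
}

# Example (requires a running OpenSearch instance):
#   opensearch_count_docs tika-test-parent-child
```

The response is a small JSON object whose `count` field should be greater than zero if the emitter wrote anything.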

...

B) Solr Parent-Child Example (fileshare to Solr)

  1. From the solr directory
    1. bin/solr start
    2. bin/solr create -c tika-example-parent-child && bin/solr config -c tika-example-parent-child -p 8983 -action set-user-property -property update.autoCreateFields -value false
  2. From the tika-pipes-tutorial directory

    1. Set the schema in Solr: curl -F 'data=@configs/solr/solr-parent-child-schema.json' http://localhost:8983/solr/tika-example-parent-child/schema

    2. Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/solr/tika-config-solr-parent-child.xml

          <fetcher class="org.apache.tika.pipes.fetcher.fs.FileSystemFetcher">
            <params>
              <name>fsf</name>
              <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
            </params>
          </fetcher>
      ...
        <pipesIterator class="org.apache.tika.pipes.pipesiterator.fs.FileSystemPipesIterator">
          <params>
            <fetcherName>fsf</fetcherName>
            <emitterName>solr1</emitterName>
            <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
          </params>
        </pipesIterator>


    3. java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/solr/tika-config-solr-parent-child.xml
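The basePath edit in step 2 has to be made in two places (the fetcher and the pipes iterator), so it is easy to update one and forget the other. The following is a minimal sketch of doing both at once with sed. It assumes each `<basePath>...</basePath>` element sits on a single line, as in the snippet above, and that the directory name contains no `|`, `&`, or backslash characters. The function name `set_base_path` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: point every <basePath> element in a tika-config file at one docs directory.
# Assumption: each <basePath>...</basePath> is on a single line, and the
# directory contains no '|', '&', or backslash characters (they are special to sed).

set_base_path() {
  local config="$1"
  local docs_dir="$2"
  # -i.bak edits in place and keeps a backup copy next to the original;
  # this spelling is accepted by both GNU and BSD sed.
  sed -i.bak "s|<basePath>[^<]*</basePath>|<basePath>${docs_dir}</basePath>|g" "$config"
}

# Example:
#   set_base_path configs/solr/tika-config-solr-parent-child.xml "$PWD/docs"
```

The original file is preserved with a `.bak` suffix, so the change can be reverted by copying the backup over the edited config.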

C) OpenSearch/Elasticsearch Individual Files Example (fileshare to OpenSearch/Elasticsearch)

  1. Start OpenSearch via Docker:
    1. docker pull opensearchproject/opensearch:1.2.4
    2. docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:1.2.4
  2. Curl the schema to OpenSearch:

    curl -k -T configs/opensearch/opensearch-indiv-files-mappings.json -u admin:admin -H "Content-Type:application/json" https://localhost:9200/tika-test-indiv-files

  3. Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/opensearch/tika-config-fs-to-opensearch-indiv-files.xml

  4. java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/opensearch/tika-config-fs-to-opensearch-indiv-files.xml

C) Solr Indiv Files Example (fileshare to Solr)

  1. From the solr directory
    1. bin/solr start
    2. bin/solr create -c tika-example-indiv-files && bin/solr config -c tika-example-indiv-files -p 8983 -action set-user-property -property update.autoCreateFields -value false
  2. From the tika-pipes-tutorial directory

    1. Set the schema in Solr: curl -F 'data=@configs/solr/solr-indiv-files-schema.json' http://localhost:8983/solr/tika-example-indiv-files/schema

    2. Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/solr/tika-config-solr-indiv-files.xml

          <fetcher class="org.apache.tika.pipes.fetcher.fs.FileSystemFetcher">
            <params>
              <name>fsf</name>
              <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
            </params>
          </fetcher>
      ...
        <pipesIterator class="org.apache.tika.pipes.pipesiterator.fs.FileSystemPipesIterator">
          <params>
            <fetcherName>fsf</fetcherName>
            <emitterName>solr1</emitterName>
            <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
          </params>
        </pipesIterator>


    3. java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/solr/tika-config-solr-indiv-files.xml
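After step 3 completes, a quick query against the collection shows whether the emitter wrote any documents. The following is a minimal sketch, assuming Solr is running locally on port 8983 as in the steps above. The function names are hypothetical; the `select` handler with `q=*:*&rows=0` is the usual way to get a match count without returning documents.

```shell
#!/usr/bin/env bash
# Sketch: count the documents that tika-pipes emitted into a Solr collection.
# Assumption: Solr runs locally on port 8983 without authentication.

# Build a select URL that matches every document but returns no rows,
# so the response carries only the numFound total (hypothetical helper name).
solr_count_url() {
  local collection="$1"
  local host="${2:-http://localhost:8983}"
  printf '%s/solr/%s/select?q=*:*&rows=0\n' "$host" "$collection"
}

# Run the query against a live Solr instance.
solr_count_docs() {
  curl -s "$(solr_count_url "$1")"
}

# Example (requires a running Solr instance):
#   solr_count_docs tika-example-indiv-files
```

The `numFound` value in the JSON response is the number of indexed documents.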

D) OpenSearch/Elasticsearch Legacy Example (fileshare to OpenSearch/Elasticsearch)

  1. Start OpenSearch via Docker:
    1. docker pull opensearchproject/opensearch:1.2.4
    2. docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:1.2.4
  2. Curl the schema to OpenSearch:

    curl -k -T configs/opensearch/opensearch-legacy-mappings.json -u admin:admin -H "Content-Type:application/json" https://localhost:9200/tika-test-legacy

  3. Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/opensearch/tika-config-fs-to-opensearch-legacy.xml

  4. java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/opensearch/tika-config-fs-to-opensearch-legacy.xml


D) Solr Legacy Example (fileshare to Solr)

  1. From the solr directory
    1. bin/solr start
    2. bin/solr create -c tika-example-legacy && bin/solr config -c tika-example-legacy -p 8983 -action set-user-property -property update.autoCreateFields -value false
  2. From the tika-pipes-tutorial directory

    1. Set the schema in Solr: curl -F 'data=@configs/solr/solr-legacy-schema.json' http://localhost:8983/solr/tika-example-legacy/schema

    2. Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/solr/tika-config-solr-legacy.xml

          <fetcher class="org.apache.tika.pipes.fetcher.fs.FileSystemFetcher">
            <params>
              <name>fsf</name>
              <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
            </params>
          </fetcher>
      ...
        <pipesIterator class="org.apache.tika.pipes.pipesiterator.fs.FileSystemPipesIterator">
          <params>
            <fetcherName>fsf</fetcherName>
            <emitterName>solr1</emitterName>
            <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
          </params>
        </pipesIterator>


    3. java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/solr/tika-config-solr-legacy.xml