...
- Start OpenSearch via Docker:
- docker pull opensearchproject/opensearch:1.2.4
- docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:1.2.4
- Send the mappings to OpenSearch with curl:
curl -k -T configs/opensearch/opensearch-parent-child-mappings.json -u admin:admin -H "Content-Type:application/json" https://localhost:9200/tika-test-parent-child
Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/opensearch/tika-config-fs-to-opensearch-parent-child.xml
java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/opensearch/tika-config-fs-to-opensearch-parent-child.xml
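The basePath edit described above can also be scripted. The following is a minimal sketch and not part of the tutorial: the function name set_base_path is hypothetical, and it assumes that every basePath element in the config file should point at the same directory and that the directory path contains no '|', '&' or '\' characters, which sed treats specially in the replacement.

```shell
# Hypothetical helper (not part of Tika): point every <basePath> element
# in a config file at the same documents directory.
#   $1: path to the Tika config XML
#   $2: documents directory (assumed to contain no '|', '&' or '\')
set_base_path() {
  # -i.bak edits the file in place and keeps a backup copy next to it;
  # this spelling is accepted by both GNU and BSD sed.
  sed -i.bak "s|<basePath>[^<]*</basePath>|<basePath>$2</basePath>|g" "$1"
}
```

For example, set_base_path configs/opensearch/tika-config-fs-to-opensearch-parent-child.xml /path/to/docs rewrites both the fetcher and the pipes iterator entries; the original file is kept with a .bak suffix.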
...
B) Solr Parent-Child Example (fileshare to Solr)
- From the solr directory
bin/solr start
bin/solr create -c tika-example-parent-child && bin/solr config -c tika-example-parent-child -p 8983 -action set-user-property -property update.autoCreateFields -value false
From the tika-pipes-tutorial directory
Set the schema in Solr:
curl -F 'data=@configs/solr/solr-parent-child-schema.json' http://localhost:8983/solr/tika-example-parent-child/schema
Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/solr/tika-config-solr.xml
<fetcher class="org.apache.tika.pipes.fetcher.fs.FileSystemFetcher">
  <params>
    <name>fsf</name>
    <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
  </params>
</fetcher>
...
<pipesIterator class="org.apache.tika.pipes.pipesiterator.fs.FileSystemPipesIterator">
  <params>
    <fetcherName>fsf</fetcherName>
    <emitterName>solr1</emitterName>
    <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
  </params>
</pipesIterator>
java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/solr/tika-config-solr-parent-child.xml
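Once the run above has finished, one way to check that documents reached Solr is a match-all query with rows=0, which returns only the response header and the numFound count. This is a sketch under assumptions: the helper names solr_count_url and solr_count are hypothetical, Solr is assumed to be listening on localhost:8983 as in the steps above, and the documents are assumed to have been committed already.

```shell
# Hypothetical helpers (not part of the tutorial).
# Build the select URL for a match-all query that returns no rows.
#   $1: Solr collection name
solr_count_url() {
  printf 'http://localhost:8983/solr/%s/select?q=*:*&rows=0\n' "$1"
}

# Run the query; numFound in the JSON response is the number of indexed documents.
solr_count() {
  curl -s "$(solr_count_url "$1")"
}
```

For this example the call would be solr_count tika-example-parent-child.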
C) OpenSearch/Elasticsearch Individual Files Example (fileshare to OpenSearch/ElasticSearch)
- Start OpenSearch via Docker:
- docker pull opensearchproject/opensearch:1.2.4
- docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:1.2.4
- Send the mappings to OpenSearch with curl:
curl -k -T configs/opensearch/opensearch-indiv-files-mappings.json -u admin:admin -H "Content-Type:application/json" https://localhost:9200/tika-test-indiv-files
Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/opensearch/tika-config-fs-to-opensearch-indiv-files.xml
java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/opensearch/tika-config-fs-to-opensearch-indiv-files.xml
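After the run above, the OpenSearch _count endpoint gives a quick check that documents were indexed. This is a sketch only: the helper names opensearch_count_url and opensearch_count are hypothetical, and the host, port and admin:admin credentials are simply those used in the curl step above.

```shell
# Hypothetical helpers (not part of the tutorial).
# Build the _count URL for an index.
#   $1: index name
opensearch_count_url() {
  printf 'https://localhost:9200/%s/_count\n' "$1"
}

# Query the count; -k skips TLS certificate verification, as in the
# mappings step, because the demo container uses a self-signed certificate.
opensearch_count() {
  curl -k -s -u admin:admin "$(opensearch_count_url "$1")"
}
```

For this example the call would be opensearch_count tika-test-indiv-files; the "count" field of the JSON response is the number of documents in the index.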
C) Solr Indiv Files Example (fileshare to Solr)
- From the solr directory
bin/solr start
bin/solr create -c tika-example-indiv-files && bin/solr config -c tika-example-indiv-files -p 8983 -action set-user-property -property update.autoCreateFields -value false
From the tika-pipes-tutorial directory
Set the schema in Solr:
curl -F 'data=@configs/solr/solr-indiv-files-schema.json' http://localhost:8983/solr/tika-example-indiv-files/schema
Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/solr/tika-config-solr-indiv-files.xml
<fetcher class="org.apache.tika.pipes.fetcher.fs.FileSystemFetcher">
  <params>
    <name>fsf</name>
    <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
  </params>
</fetcher>
...
<pipesIterator class="org.apache.tika.pipes.pipesiterator.fs.FileSystemPipesIterator">
  <params>
    <fetcherName>fsf</fetcherName>
    <emitterName>solr1</emitterName>
    <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
  </params>
</pipesIterator>
java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/solr/tika-config-solr-indiv-files.xml
D) OpenSearch/Elasticsearch Legacy Example (fileshare to OpenSearch/ElasticSearch)
- Start OpenSearch via Docker:
- docker pull opensearchproject/opensearch:1.2.4
- docker run -p 9200:9200 -p 9600:9600 -e "discovery.type=single-node" opensearchproject/opensearch:1.2.4
- Send the mappings to OpenSearch with curl:
curl -k -T configs/opensearch/opensearch-legacy-mappings.json -u admin:admin -H "Content-Type:application/json" https://localhost:9200/tika-test-legacy
Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/opensearch/tika-config-fs-to-opensearch-legacy.xml
java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/opensearch/tika-config-fs-to-opensearch-legacy.xml
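When these steps are run from a script rather than by hand, the container may not be accepting requests yet when the mappings are sent. A generic retry loop is one way to handle that. The sketch below is an assumption-laden illustration: wait_until and opensearch_ready are hypothetical names, and the readiness probe simply requests the root URL with the same credentials used in the steps above.

```shell
# Hypothetical helper (not part of the tutorial): retry a command until it
# succeeds or the attempts run out.
#   $1: maximum number of attempts
#   $2: seconds to sleep between attempts
#   remaining arguments: the command to run
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Success on any attempt ends the loop immediately.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Hypothetical readiness probe: -f makes curl exit non-zero on HTTP errors,
# -k skips TLS verification for the demo container's self-signed certificate.
opensearch_ready() {
  curl -k -s -f -u admin:admin https://localhost:9200
}
```

A script could then call wait_until 30 2 opensearch_ready before the curl step that uploads the mappings, and stop if it returns non-zero.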
D) Solr Legacy Example (fileshare to Solr)
- From the solr directory
bin/solr start
bin/solr create -c tika-example-legacy && bin/solr config -c tika-example-legacy -p 8983 -action set-user-property -property update.autoCreateFields -value false
From the tika-pipes-tutorial directory
Set the schema in Solr:
curl -F 'data=@configs/solr/solr-legacy-schema.json' http://localhost:8983/solr/tika-example-legacy/schema
Configure the basePath element in FileSystemPipesIterator and FileSystemFetcher in configs/solr/tika-config-solr-legacy.xml
<fetcher class="org.apache.tika.pipes.fetcher.fs.FileSystemFetcher">
  <params>
    <name>fsf</name>
    <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
  </params>
</fetcher>
...
<pipesIterator class="org.apache.tika.pipes.pipesiterator.fs.FileSystemPipesIterator">
  <params>
    <fetcherName>fsf</fetcherName>
    <emitterName>solr1</emitterName>
    <basePath>/Users/allison/Desktop/tika-pipes-tutorial-20220124/docs</basePath>
  </params>
</pipesIterator>
java -cp "app-bin/*" org.apache.tika.cli.TikaCLI -a --config=configs/solr/tika-config-solr-legacy.xml