...
Here are the commands to install the cas-filemgr target from a tar file. You will need to fill in the "..." with the appropriate content.
No Format | ||
---|---|---|
| ||
$ mkdir -p /usr/local/oodt/
$ tar xzvf .../filemgr/target/cas-filemgr-0.4-SNAPSHOT-dist.tar.gz -C /usr/local/oodt/
$ cd /usr/local/oodt/
$ ln -s cas-filemgr-0.4-SNAPSHOT/ cas-filemgr |
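The same steps can be wrapped in a small shell function so that they are repeatable. This is only a sketch: `install_filemgr` and its two arguments are names made up here for illustration, and you still have to supply the tarball path yourself, since it depends on where you built OODT.

```shell
# Hypothetical helper (not part of OODT): unpack a cas-filemgr dist tarball
# into a prefix directory and point a stable "cas-filemgr" symlink at it.
install_filemgr() {
  tarball=$1   # path to cas-filemgr-<version>-dist.tar.gz (you supply this)
  prefix=$2    # install prefix, e.g. /usr/local/oodt

  [ -f "$tarball" ] || { echo "no such tarball: $tarball" >&2; return 1; }

  # Assumption: the tarball is named "<dir>-dist.tar.gz" and unpacks to "<dir>/",
  # as in the commands above.
  unpacked=$(basename "$tarball" -dist.tar.gz)

  mkdir -p "$prefix" &&
  tar xzf "$tarball" -C "$prefix" &&
  ln -sfn "$unpacked" "$prefix/cas-filemgr"
}
```

It would be called as `install_filemgr .../filemgr/target/cas-filemgr-0.4-SNAPSHOT-dist.tar.gz /usr/local/oodt`, with the "..." filled in as before. Using `ln -sfn` lets you re-run it after an upgrade without removing the old link first.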
The decompressed tar file creates a directory structure that looks as follows:
No Format | ||
---|---|---|
| ||
.
├── bin
│ ├── convert_map
│ ├── filemgr
│ ├── filemgr-client
│ ├── migrate_xml_policy
│ └── query_tool
├── etc
│ ├── filemgr.properties
│ ├── logging.properties
│ └── mime-types.xml
├── lib
│ └── *.jar
├── logs
│ ├── REMOVE.log
└── policy
    ├── core
    │ ├── elements.xml
    │ ├── product-type-element-map.xml
    │ └── product-types.xml
    ├── geo
    │ ├── elements.xml
    │ ├── product-type-element-map.xml
    │ └── product-types.xml
    │ (additional policy subdirectories) |
Please note, if you are using version 0.3 of OODT or earlier, the policy directory will look like this (with no subdirectories):
No Format | ||
---|---|---|
|
...
└── policy
├── elements.xml
├── product-type-element-map.xml
└── product-types.xml
|
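If you are not sure which of the two layouts your installation has, a quick check is to look for `product-types.xml` directly under `policy` versus one level down. The function below is a sketch of that check; `policy_layout` is a hypothetical name, not an OODT tool.

```shell
# Hypothetical helper: report whether a policy directory uses the flat
# layout (OODT 0.3 and earlier) or the one-subdirectory-per-policy layout.
policy_layout() {
  policy_dir=$1
  if [ -f "$policy_dir/product-types.xml" ]; then
    echo flat
  elif ls "$policy_dir"/*/product-types.xml >/dev/null 2>&1; then
    echo nested
  else
    echo unknown
    return 1
  fi
}

# Example (path taken from the install steps above):
#   policy_layout /usr/local/oodt/cas-filemgr/policy
```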
Here is a brief description of each directory that you see listed:
...
filemgr
: file manager (startup/shutdown) script
filemgr-client
: file manager client interface script
query_tool
: catalog query tool
convert_map
: ???
migrate_xml_policy
: ???
...
You're now ready to run the file manager!
No Format | ||
---|---|---|
| ||
$ cd /usr/local/oodt/cas-filemgr/bin
$ ./filemgr --help
Usage: ./filemgr {start|stop|restart|status}
$ ./filemgr start |
What's going to happen?
The filemgr should be up and running; however, some WARNING messages will appear, complaining about the configuration.
...
Code Block | ||
---|---|---|
| ||
org.apache.oodt.cas.filemgr.catalog.lucene.idxPath=/usr/local/oodt/cas-filemgr/catalog
org.apache.oodt.cas.filemgr.repositorymgr.dirs=file:///usr/local/oodt/cas-filemgr/policy/core
org.apache.oodt.cas.filemgr.validation.dirs=file:///usr/local/oodt/cas-filemgr/policy/core
org.apache.oodt.cas.filemgr.mime.type.repository=/usr/local/oodt/cas-filemgr/etc/mime-types.xml
|
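If you would rather script these settings than edit the file by hand, something along the following lines should work. It is a sketch that assumes the properties live in `etc/filemgr.properties` as plain `key=value` lines; `set_property` is a hypothetical helper, not part of OODT.

```shell
# Hypothetical helper: set key=value in a Java-style properties file,
# replacing an existing "key=" line or appending one if the key is absent.
# Note: awk -v interprets backslash escapes, so values containing
# backslashes would need extra care.
set_property() {
  file=$1 key=$2 value=$3
  tmp=$(mktemp) || return 1
  awk -v k="$key" -v v="$value" '
    index($0, k "=") == 1 { print k "=" v; found = 1; next }  # replace existing line
    { print }                                                 # keep everything else
    END { if (!found) print k "=" v }                         # append if missing
  ' "$file" > "$tmp" &&
  cat "$tmp" > "$file"      # overwrite in place, keeping the file permissions
  status=$?
  rm -f "$tmp"
  return $status
}

# Example, using one of the values shown above:
#   set_property /usr/local/oodt/cas-filemgr/etc/filemgr.properties \
#     org.apache.oodt.cas.filemgr.catalog.lucene.idxPath \
#     /usr/local/oodt/cas-filemgr/catalog
```

Matching on a literal `key=` prefix rather than a regular expression avoids having to escape the dots in the property names.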
...
Code Block | ||
---|---|---|
| ||
<repository path="file:///var/archive/data"/>
|
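A repository path like the one above only helps if the directory actually exists. As a rough sanity check, the sketch below pulls every `file://` repository path out of a policy file so you can verify or create the directories. `repo_dirs` is a hypothetical helper, and it assumes each `<repository>` element sits on its own line in a `product-types.xml` file.

```shell
# Hypothetical helper: print the local directory of every file:// repository
# declared in a policy XML file, one per line.
repo_dirs() {
  sed -n 's|.*<repository path="file://\([^"]*\)".*|\1|p' "$1"
}

# Example: report repository directories that do not exist yet.
#   repo_dirs /usr/local/oodt/cas-filemgr/policy/core/product-types.xml |
#   while read -r dir; do
#     [ -d "$dir" ] || echo "missing repository directory: $dir"
#   done
```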
...
If you're feeling curious, check out the other XML files in the /usr/local/oodt/cas-filemgr/policy subdirectories to get a better feel for how we define product types and elements. For a discussion of best practices for File Manager policy, see Everything you want to know about File Manager Policy.
A brief overview of filemgr-client and query_tool
These commands are found in /usr/local/oodt/cas-filemgr/bin.
...
In order to trigger a file ingestion we're going to use the filemgr-client. This is by no means the most automated way to ingest data into a repository; however, it's a really easy and intuitive way to trigger a file ingestion. The filemgr-client is a wrapper script, making it easier to invoke a Java executable from the command line.
No Format | ||
---|---|---|
| ||
$ cd /usr/local/oodt/cas-filemgr/bin
$ ./filemgr-client --help
filemgr-client --url <url to xml rpc service> --operation [<operation> [params]]
operations:
--addProductType --typeName <name> --typeDesc <description>
--repository <path> --versionClass <classname of versioning impl>
--ingestProduct --productName <name> --productStructure <Hierarchical|Flat>
--productTypeName <name of product type> --metadataFile <file>
[--clientTransfer --dataTransfer <java class name of data transfer factory>]
--refs <ref1>...<refn>
--hasProduct --productName <name>
--getProductTypeByName --productTypeName <name>
--getNumProducts --productTypeName <name>
--getFirstPage --productTypeName <name>
--getNextPage --productTypeName <name> --currentPageNum <number>
--getPrevPage --productTypeName <name> --currentPageNum <number>
--getLastPage --productTypeName <name>
--getCurrentTransfer
--getCurrentTransfers
--getProductPctTransferred --productId <id> --productTypeName <name>
--getFilePctTransferred --origRef <uri>
|
As you can see, there are a number of different ways this command can be executed.
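Every invocation starts with the same `--url ... --operation` prefix, so a thin wrapper can save some typing. The sketch below is not part of OODT: `fm_client`, `FILEMGR_URL` and `FILEMGR_CLIENT` are names invented here, and the default URL assumes the file manager is listening on http://localhost:9000, as in the query examples in this tutorial.

```shell
# Assumed defaults; override them in your environment if your setup differs.
FILEMGR_URL=${FILEMGR_URL:-http://localhost:9000}
FILEMGR_CLIENT=${FILEMGR_CLIENT:-/usr/local/oodt/cas-filemgr/bin/filemgr-client}

# Hypothetical wrapper: prepend the common --url/--operation prefix and pass
# the operation and its parameters through unchanged.
fm_client() {
  "$FILEMGR_CLIENT" --url "$FILEMGR_URL" --operation "$@"
}

# Examples, using operations from the usage text above:
#   fm_client --getNumProducts --productTypeName GenericFile
#   fm_client --hasProduct --productName blah.txt
```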
...
However, before we take a look at the --operation --ingestProduct, I would first like to shed a bit more light on the query_tool command.
Command: query_tool
This is a very useful wrapper script to query the content of your repository.
No Format | ||
---|---|---|
| ||
$ cd /usr/local/oodt/cas-filemgr/bin
$ ./query_tool
Must specify a query and filemgr url!
Usage: QueryTool [options]
options:
--url <fm url>
  Lucene like query options:
    --lucene
         -query <query>
  SQL like query options:
    --sql
         -query <query>
         -sortBy <metadata-key>
         -outputFormat <output-format-string> |
We see that we need to set some command line arguments to get anything useful out of the query tool. Try the next command:
$ ./query_tool --url http://localhost:9000 --sql -query 'SELECT * FROM GenericFile'
...
Code Block | ||
---|---|---|
| ||
<cas:metadata xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
</cas:metadata>
|
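An empty metadata document like the one above can be written straight from the shell with a here-document. This is just a convenience sketch; `write_empty_met` is a hypothetical helper, and the file name in the example is an assumption.

```shell
# Hypothetical helper: write an empty CAS metadata document to the given path.
write_empty_met() {
  cat > "$1" <<'EOF'
<cas:metadata xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
</cas:metadata>
EOF
}

# Example (file name chosen for illustration):
#   write_empty_met /tmp/blah.txt.met
```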
...
To complete the process, let's see if we can retrieve the metadata. Run the query command again:
$ cd /usr/local/oodt/cas-filemgr/bin
$ ./query_tool --url http://localhost:9000 --sql -query 'SELECT * FROM GenericFile'
...
At the time of writing this tutorial, composing queries using query_tool is not entirely straightforward, but it is entirely usable. Formatting of these queries is critical: small deviations from the syntax can result in the query returning an unexpected value or throwing an exception.
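Because the syntax is so sensitive, it can help to build the query string in one place instead of retyping it. The sketch below does only that; `build_query` is a hypothetical helper, and it follows the SELECT ... FROM ... WHERE Filename='...' shape used in this tutorial without validating anything.

```shell
# Hypothetical helper: print a SQL-like query that selects the given
# comma-separated fields for one filename within a product type.
# No escaping is done, so a filename containing a single quote will not work.
build_query() {
  fields=$1 product_type=$2 filename=$3
  printf "SELECT %s FROM %s WHERE Filename='%s'\n" "$fields" "$product_type" "$filename"
}

# Example:
#   ./query_tool --url http://localhost:9000 --sql \
#     -query "$(build_query CAS.ProductName,Filename GenericFile blah.txt)"
```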
...
Here is a somewhat verbose example that uses all the SQL-like syntax that I am currently aware of (apologies for all the line breaks).
No Format | ||
---|---|---|
| ||
$ cd /usr/local/oodt/cas-filemgr/bin
$ ./query_tool --url http://localhost:9000 --sql \
-query "SELECT CAS.ProductReceivedTime,CAS.ProductName,CAS.ProductId,ProductType,\
ProductStructure,Filename,FileLocation,MimeType \
FROM GenericFile WHERE Filename='blah.txt'" -sortBy 'CAS.ProductReceivedTime' \
-outputFormat '$CAS.ProductReceivedTime,$CAS.ProductName,$CAS.ProductId,$ProductType,\
$ProductStructure,$Filename,$FileLocation,$MimeType' |
The output should look like:
2011-10-07T10:59:12.031+02:00,blah.txt,a00616c6-f0c2-11e0-baf4-65c684787732,
GenericFile,Flat,blah.txt,/var/kat/archive/data/blah.txt,text/plain
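Since the output format above is comma-separated, each result is easy to pick apart in a script. The following is a sketch under two assumptions: each product comes back as a single line, and none of the values contain a comma. `parse_result` is a hypothetical helper.

```shell
# Hypothetical helper: split one comma-separated result line into named
# variables and print a short summary. The variable order must match the
# field order of the -outputFormat string used for the query.
parse_result() {
  IFS=, read -r received name id type structure filename location mime <<EOF
$1
EOF
  printf '%s (%s) stored at %s\n' "$name" "$mime" "$location"
}

# Example, using the sample output above joined into one line:
#   parse_result '2011-10-07T10:59:12.031+02:00,blah.txt,a00616c6-f0c2-11e0-baf4-65c684787732,GenericFile,Flat,blah.txt,/var/kat/archive/data/blah.txt,text/plain'
```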
...