1. write your Python met extractor and get it producing the CAS met that you want
2. write up file manager policy by adding your elements to the elements.xml
file and then mapping them to the HDF5 product type in product-type-element-map.xml
(the product type itself is defined in product-types.xml)
3. then start working on writing an extern-config.xml
file that will wrap your python extractor as a CAS met extractor
once you get through 1-3, you can ingest
the best way to do that is
4. call the CmdLineIngester program
(if you are doing it interactively)
file by file from the command line to test
to do it automatically,
you would simply call the MetExtractorProductCrawler
(which just uses CmdLineIngester underneath)
for #1, it's best to check out http://oodt.apache.org/components/maven/metadata/user/basic.html
(that will help you with #3 too)
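a minimal sketch of what the extractor in #1 could look like, in case it helps. assumptions, none of which come from the notes above: the met file is a cas:metadata document made of keyval entries, the wrapper from #3 picks up a file named after the input with .met appended, and the keys, the "HDF5" product type name, and the function names (extract, to_cas_met, write_met) are placeholders. check the guide linked above for the format your version expects.

```python
import os
import sys
import xml.etree.ElementTree as ET

# Namespace assumed for CAS metadata XML; verify against the metadata guide.
CAS_NS = "http://oodt.jpl.nasa.gov/1.0/cas"


def extract(path):
    """Return a dict of metadata key -> list of values for one file.

    Placeholder logic: a real extractor would open the HDF5 file here
    and pull out whatever fields the file manager policy defines.
    """
    return {
        "Filename": [os.path.basename(path)],
        "FileLocation": [os.path.dirname(os.path.abspath(path))],
        "ProductType": ["HDF5"],  # hypothetical product type name
    }


def to_cas_met(metadata):
    """Serialize the dict as a cas:metadata document with keyval entries."""
    ET.register_namespace("cas", CAS_NS)
    root = ET.Element("{%s}metadata" % CAS_NS)
    for key, values in metadata.items():
        keyval = ET.SubElement(root, "keyval")
        ET.SubElement(keyval, "key").text = key
        for value in values:
            ET.SubElement(keyval, "val").text = value
    return ET.tostring(root, encoding="unicode")


def write_met(path):
    """Write <path>.met next to the input file and return the met file name."""
    met_path = path + ".met"
    with open(met_path, "w") as handle:
        handle.write(to_cas_met(extract(path)))
    return met_path


if __name__ == "__main__":
    # Only act when a file name is passed on the command line.
    if len(sys.argv) > 1:
        print(write_met(sys.argv[1]))
```

running it by hand on one file and reading the resulting .met file is a quick way to confirm step 1 before moving on to policy.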
for #2, the best example would be to look at the existing sample elements.xml and product-type-element-map.xml
that come with filemgr in src/main/resources/examples (here: http://svn.apache.org/repos/asf/oodt/trunk/filemgr/src/main/resources/examples)
to see how to add more elements and how to map them to a product type
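one easy mistake in #2 is mapping an element id to the product type without defining it in elements.xml. a rough sketch of a check for that is below. the XML fragments are hypothetical (the urn:example ids and names are made up) and their layout is only assumed to resemble the sample files, so compare against the examples directory before relying on it. the function names are illustrative too.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy fragments; ids and names are invented, and the layout is
# assumed to resemble the sample files shipped with filemgr.
ELEMENTS_XML = """<cas:elements xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
  <element id="urn:example:Instrument" name="Instrument">
    <description>Instrument that produced the file</description>
  </element>
  <element id="urn:example:StartDateTime" name="StartDateTime">
    <description>Start of the data coverage</description>
  </element>
</cas:elements>"""

MAP_XML = """<cas:producttypemap xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
  <type id="urn:example:HDF5">
    <element id="urn:example:Instrument"/>
    <element id="urn:example:StartDateTime"/>
    <element id="urn:example:Missing"/>
  </type>
</cas:producttypemap>"""


def defined_element_ids(elements_xml):
    """Collect the id of every element declared in the elements document."""
    root = ET.fromstring(elements_xml)
    return {el.get("id") for el in root.iter("element")}


def undefined_mappings(elements_xml, map_xml):
    """Map each product type id to the element ids it references but that are not declared."""
    defined = defined_element_ids(elements_xml)
    problems = {}
    for ptype in ET.fromstring(map_xml).iter("type"):
        missing = [
            el.get("id")
            for el in ptype.iter("element")
            if el.get("id") not in defined
        ]
        if missing:
            problems[ptype.get("id")] = missing
    return problems


if __name__ == "__main__":
    print(undefined_mappings(ELEMENTS_XML, MAP_XML))
```

to use it on real policy, read the two files from your policy directory and pass their contents in place of the inline strings.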
for #4, we have a crawler guide here, http://oodt.apache.org/components/maven/crawler/user/index.html
(and for the CmdLineIngester, the javadoc is the best we have at this point, check it out here: http://oodt.apache.org/components/maven/apidocs/org/apache/oodt/cas/filemgr/ingest/CmdLineIngester.html
and for StdIngester (which it subclasses): http://oodt.apache.org/components/maven/apidocs/org/apache/oodt/cas/filemgr/ingest/StdIngester.html)
