Given the following scenario:

  • FileManager is running on machineA.
  • Workflow/PGE tasks should run on machineB.

There are two strategies for operating on the remote files from machineB:

  1. Use NFS to simulate a local filesystem on machineB
  2. Use fmprod to download the product to machineB from a web URL

When your Workflow/PGE task is done, you can ingest the output files into the remote FileManager using org.apache.oodt.cas.filemgr.datatransfer.RemoteDataTransferFactory as the clientTransferer.
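
As a rough sketch of that ingest step, the helper below only prints a filemgr-client command line for review; it does not run it. The flag names (--ingestProduct, --clientTransfer, --dataTransfer and so on) are recalled from the File Manager command-line client and may differ between OODT versions, so treat them as assumptions. The function name, the FileManager URL, and the file paths are hypothetical placeholders.

```shell
#!/bin/sh
# Print (do not run) a filemgr-client ingest command that uses the remote
# data transfer factory named above. Flag names are assumptions; check them
# against the filemgr-client shipped with your OODT version.
ingest_command() (
  fm_url=$1        # FileManager URL, e.g. http://machineA:9000 (hypothetical)
  product_file=$2  # absolute path of the output file on machineB
  product_type=$3  # product type name, e.g. MyProdTypeName
  met_file=$4      # absolute path of the product's metadata file

  printf '%s\n' "./filemgr-client --url $fm_url --operation --ingestProduct \
--productName $(basename "$product_file") --productStructure Flat \
--productTypeName $product_type --metadataFile file://$met_file \
--refs file://$product_file --clientTransfer \
--dataTransfer org.apache.oodt.cas.filemgr.datatransfer.RemoteDataTransferFactory"
)
```

For example, ingest_command http://machineA:9000 /tmp/out.dat MyProdTypeName /tmp/out.dat.met prints a single command line that can be inspected and then run from the FileManager client's bin directory on machineB.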

Use NFS

The idea here is to use NFS to mount the data archive at the same root path on both machines. Every file path the FileManager returns then resolves as a local path on machineB as well.

  1. Edit /etc/exports to allow machineB to see the data archive on machineA (for more information, type "man exports")
    /etc/exports
    /Users/me/filemgr/data/archive machineB(rw,sync)
    
  2. Mount the remote file system on both machineA and machineB (for more information, type "man mount")
    mkdir -p /net/machineA
    mount -t nfs machineA:/Users/me/filemgr/data/archive /net/machineA
    
  3. Edit product-types.xml to use the NFS mount
    product-types.xml
    ...
    <type id="urn:MyProdTypeId" name="MyProdTypeName">
      <repository path="file:///net/machineA"/>
    ...
    
  4. The FileManager should now return file paths that are reachable by machineB.
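
One way to confirm step 4 from machineB is to take a file reference returned by the FileManager and test that it is readable through the mount. The following is a minimal sketch; the function names are made up for illustration, and it assumes references come back as file: URIs such as file:///net/machineA/... or file:/net/machineA/...

```shell
#!/bin/sh
# Convert a file: URI to a local path. Handles both the file:///path and
# the file:/path spellings by dropping the scheme and any "//" authority marker.
uri_to_path() (
  path=${1#file:}
  case $path in
    //*) path=${path#//} ;;
  esac
  printf '%s\n' "$path"
)

# Succeed only if the URI points at a regular, readable file on this machine.
is_reachable() (
  path=$(uri_to_path "$1")
  [ -f "$path" ] && [ -r "$path" ]
)
```

Running is_reachable on a reference and checking its exit status on machineB shows whether the NFS mount and the repository path line up.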

If your products were ingested before this change, there are several ways to update the catalog with the new NFS file location:

  • Re-ingest, making sure the repository path in product-types.xml uses the NFS mount.
  • Re-ingest using a different Versioner that places the final file location under the NFS mount.
  • Use MetadataBasedProductMover. Run the following locally on machineA:
    java -Djava.ext.dirs=../lib org.apache.oodt.cas.filemgr.tools.MetadataBasedProductMover \
        --fileManagerUrl http://localhost:9000 \
        --typeName MyProdTypeName \
        --pathSpec /net/machineA/[Filename]
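
To illustrate what a --pathSpec template such as /net/machineA/[Filename] means, the sketch below replaces each [Key] placeholder with a metadata value. It is an approximation for intuition only, not the mover's actual implementation, which is assumed here to resolve the keys from each product's catalog metadata. The function name and the KEY=VALUE argument style are hypothetical.

```shell
#!/bin/sh
# Expand [Key] placeholders in a path spec using KEY=VALUE arguments.
expand_path_spec() (
  spec=$1
  shift
  for pair in "$@"; do
    key=${pair%%=*}     # text before the first "="
    value=${pair#*=}    # text after the first "="
    result=
    rest=$spec
    # Replace every literal "[key]" occurrence, scanning left to right.
    while :; do
      case $rest in
        *"[$key]"*)
          result=$result${rest%%"[$key]"*}$value
          rest=${rest#*"[$key]"}
          ;;
        *) break ;;
      esac
    done
    spec=$result$rest
  done
  printf '%s\n' "$spec"
)
```

For instance, expand_path_spec '/net/machineA/[Filename]' Filename=data.dat yields the path a product with that Filename would be moved to.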
    

Use fmprod

The idea here is to deploy a Product Server to allow anyone to download products with an HTTP request.

  1. Build fmprod (assumes you have OODT sources checked out)
    cd webapp/fmprod
    mvn clean package
    
  2. Deploy fmprod (cas-product-VERSION.war)
  3. Download the product with an HTTP request (GET or POST)
    • To get a single product, use the "productID" query parameter.
      curl 'http://webappMachine/fmprod?productID=< your product id >'
      
    • To get a dataset (products of a certain type) as a zip file, use the "typeID" query parameter.
      curl 'http://webappMachine/fmprod?typeID=< your product type id >'
      
  4. The previous download step can be wrapped in a "FileStager" task. Subsequent tasks can then operate on the downloaded local file.
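
The download in step 3 could be wrapped in a small script along these lines, which a "FileStager" task might then call. This is a sketch under stated assumptions: the function names and the product/dataset keywords are hypothetical, and the base URL is whatever your fmprod deployment exposes. Only the productID and typeID query parameters come from the steps above.

```shell
#!/bin/sh
# Map what we want to fetch onto the query parameter fmprod expects.
fmprod_query_param() (
  case $1 in
    product) printf '%s\n' productID ;;
    dataset) printf '%s\n' typeID ;;
    *) echo "unknown kind: $1" >&2; exit 1 ;;
  esac
)

# Download one product, or one dataset zip, into a local file.
stage_from_fmprod() (
  base_url=$1   # e.g. http://webappMachine/fmprod
  kind=$2       # "product" or "dataset"
  id=$3         # product id or product type id
  out_file=$4   # where to write the download on machineB
  param=$(fmprod_query_param "$kind") || exit 1
  # -G sends the URL-encoded data as a GET query string;
  # -f makes curl exit non-zero on HTTP errors so the task can fail cleanly.
  curl -fsS -G --data-urlencode "$param=$id" -o "$out_file" "$base_url"
)
```

Letting curl URL-encode the identifier avoids quoting problems when ids contain characters such as colons or spaces, and the non-zero exit status gives the workflow a simple way to detect a failed download.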

The disadvantage of this approach is that Workflow/PGE tasks are not directly connected to the FileManager: you cannot access a product's metadata without explicitly downloading it first, and you cannot use features such as PGE SQL-like queries.
