Introduction

The role of a metadata extractor is to extract metadata from one or more product types. To do so, the extractor must understand the product type's format, parse the product, and return the metadata to be associated with it. CAS-Curator, for example, uses metadata extractors to generate metadata for products in its staging area, both as a preview for the curator and during data ingestion.

Java API

The CAS-Metadata project contains an interface, org.apache.oodt.cas.metadata.MetExtractor. It defines two primary methods, extractMetadata and setConfigFile, each with several overloaded signatures:

public interface MetExtractor {

    public Metadata extractMetadata(File f) 
            throws MetExtractionException;

    public Metadata extractMetadata(String filePath)
            throws MetExtractionException;

    public Metadata extractMetadata(URL fileUrl) 
            throws MetExtractionException;

    public Metadata extractMetadata(File f, File configFile) 
            throws MetExtractionException;

    public Metadata extractMetadata(File f, String configFilePath) 
            throws MetExtractionException;

    public Metadata extractMetadata(File f, 
            MetExtractorConfig config) 
            throws MetExtractionException;

    public Metadata extractMetadata(URL fileUrl, 
            MetExtractorConfig config) 
            throws MetExtractionException;
            
    public void setConfigFile(File f) 
            throws MetExtractionException;

    public void setConfigFile(String filePath) 
            throws MetExtractionException;

    public void setConfigFile(MetExtractorConfig config);
}

To create a new extractor, a developer may either implement the MetExtractor interface in Java or develop an extractor in another language of choice that adheres to this interface.
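
As a rough illustration of the Java route, the sketch below shows the general shape of an extractor: one overload does the real work, and the others delegate to it. To stay self-contained it does not use the real OODT classes. SimpleMetadata, ExtractionException and SimpleExtractor are hypothetical, simplified stand-ins for Metadata, MetExtractionException and MetExtractor, and the metadata keys "Filename" and "FileSize" are invented for this example. A real extractor would implement the full interface shown above.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.LinkedHashMap;
import java.util.Map;

public class FileStatExtractorSketch {

    // Hypothetical stand-in for the OODT Metadata class: a simple key/value map.
    static class SimpleMetadata {
        private final Map<String, String> values = new LinkedHashMap<>();

        void add(String key, String value) { values.put(key, value); }

        String get(String key) { return values.get(key); }

        @Override
        public String toString() { return values.toString(); }
    }

    // Hypothetical stand-in for MetExtractionException.
    static class ExtractionException extends Exception {
        ExtractionException(String message) { super(message); }
    }

    // Reduced stand-in for MetExtractor: only two of the extractMetadata overloads.
    interface SimpleExtractor {
        SimpleMetadata extractMetadata(File f) throws ExtractionException;

        SimpleMetadata extractMetadata(String filePath) throws ExtractionException;
    }

    // Example extractor that "parses" a product by reading basic file attributes.
    static class FileStatExtractor implements SimpleExtractor {

        @Override
        public SimpleMetadata extractMetadata(File f) throws ExtractionException {
            if (f == null || !f.isFile()) {
                throw new ExtractionException("Not a regular product file: " + f);
            }
            SimpleMetadata met = new SimpleMetadata();
            met.add("Filename", f.getName());               // assumed key name
            met.add("FileSize", Long.toString(f.length())); // assumed key name
            return met;
        }

        @Override
        public SimpleMetadata extractMetadata(String filePath) throws ExtractionException {
            if (filePath == null) {
                throw new ExtractionException("No product path given");
            }
            // Delegate to the File-based overload so the logic lives in one place.
            return extractMetadata(new File(filePath));
        }
    }

    public static void main(String[] args) throws Exception {
        // Create a throwaway "product" so the sketch can run anywhere.
        File product = File.createTempFile("product", ".dat");
        product.deleteOnExit();
        Files.write(product.toPath(), "hello".getBytes(StandardCharsets.US_ASCII));

        SimpleMetadata met = new FileStatExtractor().extractMetadata(product.getPath());
        System.out.println(met);
    }
}
```

Keeping the parsing logic in a single overload and delegating from the others is one way to handle the many signatures in the interface. Other designs are equally possible.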

...