...
The role of a metadata extractor is to extract metadata from one or more product types. To extract metadata, the extractor must understand the product type format, parse the product, and return the metadata to be associated with the product. CAS-Curator, for example, uses metadata extractors to generate metadata for products in its staging area, both as a preview for the curator and during data ingestion.
The CAS-Metadata project contains an interface class, org.apache.oodt.cas.metadata.MetExtractor. This API consists of two primary methods (with multiple method signatures each). This API can be seen below:
org.apache.oodt.cas.metadata.MetExtractor.java
/**
* @author mattmann
* @version $Revision$
*
* <p>
* An interface for {@link Metadata} extraction. This interface expects the
* definition of the following two parameters:
*
* <ul>
* <li><b>file</b> - the file to extract {@link Metadata} from.</li>
* <li><b>config file</b> - a pointer to the config file for this MetExtractor</li>
* </ul>
* </p>
*
*/
public interface MetExtractor {
/**
* Extracts {@link Metadata} from a given {@link File}.
*
* @param f
* File object to extract Metadata from.
* @return Extracted {@link Metadata} from the given {@link File}.
* @throws MetExtractionException
* If any error occurs.
*/
Metadata extractMetadata(File f) throws MetExtractionException;
/**
* Extracts {@link Metadata} from a given <code>/path/to/some/file</code>.
*
* @param filePath
* Path to a given file to extract Metadata from.
* @return Extracted {@link Metadata} from the given <code>filePath</code>.
* @throws MetExtractionException
* If any error occurs.
*/
Metadata extractMetadata(String filePath)
throws MetExtractionException;
/**
* Extracts {@link Metadata} from a given {@link URL} pointer to a
* {@link File}.
*
* @param fileUrl
* The URL pointer to a File.
* @return Extracted {@link Metadata} from the given File {@link URL}.
* @throws MetExtractionException
* If any error occurs.
*/
Metadata extractMetadata(URL fileUrl) throws MetExtractionException;
/**
* Sets the config file for this MetExtractor to the specified {@link File}
* <code>f</code>.
*
* @param f
* The config file for this MetExtractor.
* @throws MetExtractionException
*/
void setConfigFile(File f) throws MetExtractionException;
/**
* Sets the config file for this MetExtractor to the specified {@link File}
* identified by <code>filePath</code>.
*
* @param filePath
* The config file path for this MetExtractor.
* @throws MetExtractionException
*/
void setConfigFile(String filePath) throws MetExtractionException;
/**
* Sets the MetExtractorConfig for the MetExtractor
*
* @param config
* The MetExtractorConfig
*/
void setConfigFile(MetExtractorConfig config);
/**
* Extracts {@link Metadata} from the given {@link File} using the specified
* config file.
*
* @param f
* The File to extract Metadata from.
* @param configFile
* The config file for this MetExtractor.
* @return Extracted {@link Metadata} from the given {@link File} using the
* specified config file.
* @throws MetExtractionException
* If any error occurs.
*/
Metadata extractMetadata(File f, File configFile)
throws MetExtractionException;
/**
* Extracts {@link Metadata} from the given {@link File} using the specified
* config file path.
*
* @param f
* The File to extract Metadata from.
* @param configFilePath
* The path to the config file for this MetExtractor.
* @return Extracted {@link Metadata} from the given {@link File} using the
* specified config file path.
* @throws MetExtractionException
* If any error occurs.
*/
Metadata extractMetadata(File f, String configFilePath)
throws MetExtractionException;
/**
* Extracts {@link Metadata} from the given {@link File} using the specified
* {@link MetExtractorConfig}.
*
* @param f
* The {@link File} from which {@link Metadata} will be extracted
* @param config
* The config file for the extractor
* @return {@link Metadata} extracted from the {@link File}
* @throws MetExtractionException
*/
Metadata extractMetadata(File f, MetExtractorConfig config)
throws MetExtractionException;
/**
* Extracts {@link Metadata} from the given {@link URL} using the specified
* {@link MetExtractorConfig}.
*
* @param fileUrl
* The {@link URL} from which {@link Metadata} will be extracted
* @param config
* The config file for the extractor
* @return {@link Metadata} extracted from the {@link URL}
* @throws MetExtractionException
*/
Metadata extractMetadata(URL fileUrl, MetExtractorConfig config)
throws MetExtractionException;
}
In order to implement a new extractor, a developer may implement the MetExtractor interface, or develop a metadata extractor that adheres to this interface in the development language of choice.
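As a rough illustration of the Java route, the sketch below shows what a very small extractor might look like. To compile without the OODT jars it uses simplified stand-ins (SketchMetadata, SketchMetExtractionException, SketchMetExtractor) instead of the real Metadata, MetExtractionException, and MetExtractor types, and it covers only two of the extractMetadata overloads. The file naming convention (mission_yyyymmdd.ext), the metadata keys, and the sample path are assumptions made up for this example. A real extractor would implement every method of the interface shown above.

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in for the real Metadata class (assumption: one String value per key;
// the method names here are illustrative, check the real class before use).
class SketchMetadata {
    private final Map<String, String> values = new LinkedHashMap<>();

    void addMetadata(String key, String value) {
        values.put(key, value);
    }

    String getMetadata(String key) {
        return values.get(key);
    }
}

// Stand-in for the real MetExtractionException.
class SketchMetExtractionException extends Exception {
    SketchMetExtractionException(String message) {
        super(message);
    }
}

// Reduced stand-in for MetExtractor: two of the extractMetadata overloads only.
interface SketchMetExtractor {
    SketchMetadata extractMetadata(File f) throws SketchMetExtractionException;

    SketchMetadata extractMetadata(String filePath) throws SketchMetExtractionException;
}

// Hypothetical extractor for products named <mission>_<yyyymmdd>.<ext>.
class FilenameMetExtractor implements SketchMetExtractor {
    @Override
    public SketchMetadata extractMetadata(File f) throws SketchMetExtractionException {
        String name = f.getName();
        int dot = name.lastIndexOf('.');
        String stem = dot > 0 ? name.substring(0, dot) : name;

        // "Understand the product type format": here, just the file name pattern.
        String[] parts = stem.split("_");
        if (parts.length != 2 || !parts[1].matches("\\d{8}")) {
            throw new SketchMetExtractionException("Unrecognized product name: " + name);
        }

        // "Return metadata to be associated with the product."
        SketchMetadata met = new SketchMetadata();
        met.addMetadata("Filename", name);
        met.addMetadata("Mission", parts[0]);
        met.addMetadata("AcquisitionDate", parts[1]);
        return met;
    }

    @Override
    public SketchMetadata extractMetadata(String filePath) throws SketchMetExtractionException {
        // The path-based signature delegates to the File-based one.
        return extractMetadata(new File(filePath));
    }
}

class FilenameMetExtractorDemo {
    public static void main(String[] args) throws Exception {
        SketchMetExtractor extractor = new FilenameMetExtractor();
        SketchMetadata met = extractor.extractMetadata("/staging/oco_20090224.h5");
        // prints "oco 20090224"
        System.out.println(met.getMetadata("Mission") + " " + met.getMetadata("AcquisitionDate"));
    }
}
```

Having the String overload delegate to the File overload keeps the parsing logic in one place, which is one plausible way to handle the multiple signatures per method. The config-file variants are left out here because this example extractor needs no configuration.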