This page explains what CAS-PGE is and when you should consider using it.

Special Reserved Metadata Keys within CAS-PGE

SUBVERSION REPO PATH

http://svn.apache.org/repos/asf/oodt/trunk/pge/src/main/java/org/apache/oodt/cas/pge/metadata/PgeTaskMetadataKeys.java

Code Block: PgeTaskMetadataKeys.java

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */


package org.apache.oodt.cas.pge.metadata;

/**
 * 
 * @author bfoster
 * @version $Revision$
 *
 * <p>Describe your class here</p>.
 */
public interface PgeTaskMetadataKeys {

    public static final String NAME = "PGETask_Name";

    public static final String SCI_EXE_PATH = "PGETask_SciExe_Path";

    public static final String SCI_EXE_VERSION = "PGETask_SciExe_Version";

    public static final String PRODUCT_PATH = "PGETask_ProductPath";

    public static final String CONFIG_FILE_PATH = "PGETask_ConfigFilePath";
    
    public static final String LOG_FILE_PATTERN = "PGETask_LogFilePattern";

    public static final String PROPERTY_ADDER_CLASSPATH = "PGETask_PropertyAdderClasspath";

    public static final String PGE_RUNTIME = "PGETask_Runtime";
    
    /* PGE task statuses */
    public static final String STAGING_INPUT = "PGETask_Staging_Input";

    public static final String CONF_FILE_BUILD = "PGETask_Building_Config_File";

    public static final String RUNNING_PGE = "PGETask_Running";

    public static final String CRAWLING = "PGETask_Crawling";

}
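
Since the key strings above are reserved, it can be useful to check your own metadata key names against them before adding custom keys. The following is a minimal sketch, not part of CAS-PGE: the class ReservedPgeKeys and its isReserved method are hypothetical, and the key strings are copied by hand from the interface above rather than read from the OODT library.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not part of OODT): flags key names that collide
// with the reserved strings listed in PgeTaskMetadataKeys.
class ReservedPgeKeys {

    // Key strings copied from the interface above.
    static final Set<String> RESERVED = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList(
            "PGETask_Name",
            "PGETask_SciExe_Path",
            "PGETask_SciExe_Version",
            "PGETask_ProductPath",
            "PGETask_ConfigFilePath",
            "PGETask_LogFilePattern",
            "PGETask_PropertyAdderClasspath",
            "PGETask_Runtime",
            // PGE task statuses
            "PGETask_Staging_Input",
            "PGETask_Building_Config_File",
            "PGETask_Running",
            "PGETask_Crawling")));

    // Returns true if the given key name is one of the reserved strings.
    static boolean isReserved(String key) {
        return RESERVED.contains(key);
    }

    public static void main(String[] args) {
        // A reserved key taken from the interface.
        System.out.println(isReserved("PGETask_Name"));
        // A made-up custom key, used here only for illustration.
        System.out.println(isReserved("MyProject_Name"));
    }
}
```

In a real project you would more likely reference the constants on PgeTaskMetadataKeys directly; the copy here only keeps the sketch self-contained.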

Questions

List of questions I (cgoodale) have about the PGE module and how to use it.

...

A. The tasks.xml file in the Workflow configuration contains a property called 'PCS_ActionsIds'. To add a single action, set the property as follows:

Code Block

<property name="PCS_ActionsIds" value="MyCrawlerActionId"/>

...

  • Here, MyCrawlerActionId is the ID of the crawler action that you'd like to run in the PGE.
  • NOTE: make sure you also have a reference to PCS_ActionRepoFile within your tasks.xml PGE entry, which points to your crawler's config file. The crawler must support the action ID you specified.
Code Block
<property name="PCS_ActionRepoFile" value="file:[YOUR_OODT_HOME]/crawler/policy/crawler-config.xml" envReplace="true"/>
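
The envReplace="true" attribute suggests that bracketed names such as [YOUR_OODT_HOME] are substituted with environment values before the path is used. The sketch below illustrates that idea only; it is an assumption about the behaviour, not OODT's actual implementation, and the class EnvReplaceSketch, the method expandPlaceholders, and the path /usr/local/oodt are all hypothetical.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch (not OODT code): expands [NAME] placeholders
// using values from a supplied map, e.g. System.getenv().
class EnvReplaceSketch {

    // Matches a bracketed identifier such as [YOUR_OODT_HOME].
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\[([A-Za-z_][A-Za-z0-9_]*)\\]");

    static String expandPlaceholders(String value, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = env.get(m.group(1));
            if (replacement == null) {
                // Assumption of this sketch: unknown names are left untouched.
                replacement = m.group(0);
            }
            // quoteReplacement guards against '$' and '\' in the value.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Example value only; in practice the map could be System.getenv().
        Map<String, String> env =
            Collections.singletonMap("YOUR_OODT_HOME", "/usr/local/oodt");
        String raw = "file:[YOUR_OODT_HOME]/crawler/policy/crawler-config.xml";
        System.out.println(expandPlaceholders(raw, env));
    }
}
```

If the substitution works this way, the variable named in brackets must be set in the environment of the process that loads tasks.xml.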

To add multiple crawler actions, do the following:

  • Add a property to the tasks.xml file and set your desired crawler actions as its value. The property name can be anything you like; this example uses ActionsIds:
Code Block

<property name="ActionsIds" value="MyCrawlerActionId1,MyCrawlerActionId2"/>
  • Note that the specified crawler action IDs must be comma-separated, with no spaces in between.
  • In the PGE configuration file, add a PCS_ActionsIds key under the customMetadata tag and reference the property name that you just set in the tasks.xml file (ActionsIds in this case):
Code Block

<customMetadata>
...
   <metadata key="PCS_ActionsIds" val="[ActionsIds]"/>
...