Formerly named OODT Easy Install: the effort was renamed to RADiX to help distinguish it from the core distribution of OODT. RADiX will be both easy and awesome.
Overview
The high-level goal of this effort is to build a distribution of OODT that sets up, installs, and runs within five commands. While five commands may seem arbitrary, the number serves to push this effort toward the simplest possible setup and configuration needed to get going. This distribution of OODT will include both a deployment structure and a source structure for managing the evolution of your OODT installation.
Maven offers an improved way to export, configure, and build OODT called Archetypes. Archetypes, simply put, are a way to define templates for projects. Within these project templates we will include packaging instructions that conform to the guidelines below, to increase the similarity among deployments of OODT. Moreover, we will build higher-level scripts and configuration to tie the pieces together at the system level. Finally, we will leverage the CAS Install Maven Plugin to take us from our source structure to our deployment structure.
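As a rough illustration of what generating a project from an archetype looks like, the sketch below only assembles and prints a `mvn archetype:generate` command line; it does not run Maven. The archetype coordinates (`org.apache.oodt`, `radix-archetype`), the default version, and the helper name `radix_generate_cmd` are assumptions made for illustration, not names confirmed by this page.

```shell
#!/bin/sh
# Print (do not run) a Maven archetype invocation for a new data system.
# ASSUMPTION: the archetype coordinates and version below are placeholders.
radix_generate_cmd() {
    data_system_name=$1            # used as the Maven artifactId
    package_name=$2                # used as the Maven groupId
    archetype_version=${3:-0.4}    # hypothetical default version

    printf '%s\n' "mvn archetype:generate -DarchetypeGroupId=org.apache.oodt -DarchetypeArtifactId=radix-archetype -DarchetypeVersion=${archetype_version} -DgroupId=${package_name} -DartifactId=${data_system_name}"
}

radix_generate_cmd my-pipeline com.example.pipeline
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without Maven or network access.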
This wiki will be used to capture thoughts, ideas and plans for the first archetypes we develop for OODT. To keep things simple we are going to initially focus on a small number of modules that are typically deployed and configured together. Finally, our goal is to build an 80% solution that works in most cases to get people out of the gates and running with a full OODT solution. We believe this effort will help increase adoption and conformity amongst installations of an already great system.
Assumptions:
- The initial archetype will export RELEASED versions of OODT
- The initial archetype will export Crawler, FileManager and Workflow Manager ONLY (they will be bundled together and configured to work together)
  - Other modules will be added in the future
- FileManager Policy will be read recursively from the components/filemanager/policy directory. This will remove the requirement to make properties updates when additional policy files are added in sub-directories.
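The recursive policy read mentioned in the assumptions can be pictured with a small shell sketch. It assumes policy files end in `.xml` and uses a hypothetical helper named `list_policy_files`; the actual FileManager loading logic is not shown on this page.

```shell
#!/bin/sh
# List every policy XML file under a policy root, including sub-directories.
# ASSUMPTION: policy files end in .xml; the default root is the path named above.
list_policy_files() {
    policy_root=${1:-components/filemanager/policy}
    find "$policy_root" -type f -name '*.xml' | sort
}

# Demo against a throwaway directory tree.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/policy/core" "$demo_root/policy/mission/level1"
touch "$demo_root/policy/core/product-types.xml" \
      "$demo_root/policy/mission/level1/elements.xml" \
      "$demo_root/policy/README.txt"
list_policy_files "$demo_root/policy"
rm -rf "$demo_root"
```

Because the search walks the whole tree, dropping a new policy file into any sub-directory makes it visible without touching a properties file.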
Constraints:
- The archetype will support only a single version for all components; versions of individual components cannot be mixed
- The Maven Archetype process will be completed using five commands or fewer
Prerequisites:
- Submit INFRA ticket to create a place to put all Maven Central artifacts
- Load/Install artifacts to Maven Central
The 5 Commands
Requires Maven 2.x and Java 1.5+
prompt> wget http://www.apache.org/dist/oodt/radix-0.4.tgz
prompt> tar -xzvf radix-0.4.tgz
prompt> export PATH=${PATH}:<downloadDirectory>/radix
prompt> oodt-radix <dataSystemName> <packageName>
prompt> ./<dataSystemName>/deploy/bin/oodt_pcs start
1. Get OODT RADiX Distribution
2. Unpackage OODT RADiX
3. Add OODT RADiX Commands
4. OODT Create
5. OODT Start
Version Control
If you want to manage your OODT RADiX distribution with Subversion, import the source tree:
prompt> svn import <dataSystemName>/source http://your_repo_path/my-pipeline/trunk -m "Initial OODT Import"
Default Deployment Structure
For the easy installation to work properly, we need to settle on a default deployment layout. Below is our plan for how to lay out the deployment when the project is built. First we give an overview, then we detail each path and the files that will be saved into it.
/$DEPLOYMENT_BASE_DIR
    /bin
    /components
        /crawler
            /bin
            /etc
            /policy
            /lib
        /filemgr
            /bin
            /etc
            /policy
            /lib
        /workflow
            /bin
            /etc
            /policy
            /lib
        /extensions
            /bin
            /etc
            /lib
    /webapps
        /fmprod
        /fmbrowser
        /wmonitor
    /tomcat5
    /etc
    /data
        /archive
        /staging
        /work
        /met
        /failure
        /catalog
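A minimal sketch of creating the component and data portions of this layout by hand is shown below. `create_layout` is a hypothetical helper written for illustration, not an OODT script, and it deliberately leaves out the web application and Tomcat directories.

```shell
#!/bin/sh
# Create the bin, components, and data portions of the deployment layout.
# ASSUMPTION: create_layout is a hypothetical helper, not part of OODT.
create_layout() {
    base=$1
    mkdir -p "$base/bin"
    # The three bundled components share the same four sub-directories.
    for component in crawler filemgr workflow; do
        for sub in bin etc policy lib; do
            mkdir -p "$base/components/$component/$sub"
        done
    done
    # The extensions sandbox has no policy directory.
    for sub in bin etc lib; do
        mkdir -p "$base/components/extensions/$sub"
    done
    for sub in archive staging work met failure catalog; do
        mkdir -p "$base/data/$sub"
    done
}

DEPLOYMENT_BASE_DIR=$(mktemp -d)
create_layout "$DEPLOYMENT_BASE_DIR"
find "$DEPLOYMENT_BASE_DIR" -type d | sort
```

In the planned workflow the layout is produced by the build rather than by hand; the sketch is only meant to make the directory nesting concrete.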
Path Descriptions
| Path | Description |
|---|---|
| /data/archive | This is the root of where the filemgr will store its archived products |
| /data/staging | This directory will be monitored by the crawler. Products to be ingested should be placed here |
| /data/catalog | In a configuration that uses Lucene as a back end this directory holds the contents of that index |
| /data/work | ... |
| /data/failure | Any products that have failed ingestion will be placed here along with any metadata files. |
| /bin | Contains system level scripts to start, stop, restart the OODT infrastructure |
| /components | The guts of what make the data management system work |
| /components/crawler | The crawler deployment for your data management system (i.e. policy, scripts, and configuration). This component is responsible for monitoring the staging area |
| /components/filemgr | The filemgr deployment for your data management system (i.e. policy, scripts, and configuration). This component catalogs and archives products into the archive area. |
| /components/workflow | The workflow deployment for your data management system (i.e. policy, scripts, and configuration). This component orchestrates any processing that may need to be done on your products |
| /components/extensions | This is a sandbox area to test out metadata extractors, versioners, actions, etc. that you have developed to extend the functionality of the existing OODT framework. |
| /conf | System wide configuration |
Deployment Path Details
/$DEPLOYMENT_BASE_DIR/bin - This will contain scripts that manipulate the underlying components. For example, all 3 components can be started, stopped, and restarted from this directory. You can also manipulate a single component at a time from this directory.
Manipulate all components (DEFAULT BEHAVIOR)
./oodt [start, stop, restart]
Manipulate a single component
./oodt [start, stop, restart] [crawler OR filemanager OR workflowmanager]
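The two invocation forms above could be handled by a dispatcher along the following lines. This is a sketch only: it prints what it would do instead of launching anything, and the function name `oodt_dispatch` and the ordering of components are assumptions, since the real script is not shown on this page.

```shell
#!/bin/sh
# Dispatch an action to one component or, by default, to all three.
# ASSUMPTION: this only prints the planned actions; the component order is a guess.
ALL_COMPONENTS="filemanager workflowmanager crawler"

oodt_dispatch() {
    action=$1
    component=${2:-all}

    case "$action" in
        start|stop|restart) ;;
        *) echo "usage: oodt [start|stop|restart] [crawler|filemanager|workflowmanager]" >&2
           return 1 ;;
    esac

    case "$component" in
        all) targets=$ALL_COMPONENTS ;;
        crawler|filemanager|workflowmanager) targets=$component ;;
        *) echo "unknown component: $component" >&2
           return 1 ;;
    esac

    # Word splitting on $targets is intentional: one iteration per component.
    for target in $targets; do
        echo "$action $target"
    done
}

oodt_dispatch start
oodt_dispatch restart crawler
```

Omitting the second argument falls through to the default behavior of acting on every component, which matches the usage described above.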
/$DEPLOYMENT_BASE_DIR/components - This will contain a single folder for each component. Initially this will only contain the 3 components we have selected to start this process, but as more components are supported they will be added here.
/$DEPLOYMENT_BASE_DIR/conf - This will contain configuration and properties files that apply to several components. Like the bin directory, this should give users a single directory they can go into to configure the associated components.
Parameters that can be managed within the conf directory
oodt.properties
crawler_port=9020
filemanager_port=9000
workflowmanager_port=9001
resmgr_port=9002
batchstub_port=2001
JAVA_HOME
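One way a startup script might pull these shared settings into its environment is sketched below. The helper `get_prop` is hypothetical, and the sketch assumes one `key=value` pair per line with no spaces around the equals sign.

```shell
#!/bin/sh
# Look up a key in a simple key=value properties file.
# ASSUMPTION: one key=value pair per line, no spaces around "=".
get_prop() {
    props_file=$1
    key=$2
    # Print the value of the last line that starts with "key=", if any.
    sed -n "s/^${key}=//p" "$props_file" | tail -n 1
}

# Demo against a temporary copy of the properties listed above.
props=$(mktemp)
cat > "$props" <<'EOF'
crawler_port=9020
filemanager_port=9000
workflowmanager_port=9001
resmgr_port=9002
batchstub_port=2001
EOF

FILEMGR_PORT=$(get_prop "$props" filemanager_port)
WFMGR_PORT=$(get_prop "$props" workflowmanager_port)
export FILEMGR_PORT WFMGR_PORT
echo "filemgr on $FILEMGR_PORT, workflow on $WFMGR_PORT"
```

Keeping the ports in one file and exporting them per component means a port change is made in a single place.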
Component settings we plan to default
crawler
port
filemanager
FILEMGR_PORT=9000
export FILEMGR_PORT
workflow
WFMGR_PORT=9001
export WFMGR_PORT
Default Source Structure
Source Path Details
Future Work
Once the above is complete our thoughts are that the next items to be incorporated are as follows:
- Tomcat Distribution
- OODT Services (Health Monitor, ?)
- OODT Web Apps (Curator, ?)
- CAS PGE
- Expand OODT Easy Commands
  - upgrade - to allow for upgrades in OODT components
  - status - to print out the version of OODT running and component status
  - add_product_type - to configure all components with a new product type
Maven Archetype Information
Requirements for getting artifacts synced with Maven Central:
https://docs.sonatype.org/display/Repository/Central+Sync+Requirements