The CAS File Manager is a highly flexible data archive tool, but that flexibility can cause confusion because a single task (such as defining metadata elements) can often be done in several ways. This page captures the 'Best Practices' people have found on their projects when creating policy.
To keep the confusion to a minimum, we start with a Taxonomy that defines some key terms, then move on to some Operational Scenarios. All of the Operational Scenarios are built around cataloging and archiving MEDIA (audio, video, images), since almost everyone can relate to these items. The simple examples focus on a single media format, while the more complex examples show how to deal with an ever-changing media library.
Taxonomy
File Manager Policy - All of the *.xml files that the File Manager uses to define metadata.
Data Set -
Product Type - A logical grouping of products that has a MetExtractor and a Versioner. It is defined in product-types.xml and product-type-element-map.xml.
Virtual Product Type - Used to group metadata elements together, but has no MetExtractor or Versioner. It is defined only within the product-type-element-map.xml file.
Metadata Elements - Data elements about a product that the File Manager will catalog. Metadata Elements must be listed in both elements.xml and product-type-element-map.xml.
Operational Scenarios
Simple File Manager Policy
- One set of File Manager Policy
- All of the Data Sets are homogeneous
Example: You want to catalog and archive music files. They are logically grouped by album, but every song you archive has the same metadata elements. In this case we have a single Data Set called Music, which is mapped to the default Product Type: GenericFile.
Sample Policy Overview
NOTE: Items that are bold/italic are default policy that comes pre-installed with the File Manager and does not need to be edited.
product-types.xml
- GenericFile
elements.xml
- CAS.ProductId
- CAS.ProductName
- CAS.ProductReceivedTime
- Filename
- FileLocation
- ProductType
- ProductStructure
- MimeType
- Album
- Artist
- Track_Number
- Year
- Title
product-type-element-map.xml
- type=GenericFile
- +CAS.ProductId
- +CAS.ProductName
- +CAS.ProductReceivedTime
- +Filename
- +FileLocation
- +ProductType
- +ProductStructure
- +MimeType
- +Album
- +Artist
- +Track_Number
- +Year
- +Title
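The rule from the Taxonomy, that every Metadata Element must be listed in both elements.xml and product-type-element-map.xml, can be checked mechanically. The following is a minimal Python sketch that models the sample policy above as plain data structures instead of parsing the real XML files. The names ELEMENTS, ELEMENT_MAP, undeclared_elements, and unmapped_elements are hypothetical and are not part of the File Manager.

```python
# Hypothetical in-memory model of the sample policy above. The real
# File Manager reads this information from its XML policy files.

# elements.xml: every metadata element that has been declared.
ELEMENTS = {
    "CAS.ProductId", "CAS.ProductName", "CAS.ProductReceivedTime",
    "Filename", "FileLocation", "ProductType", "ProductStructure",
    "MimeType", "Album", "Artist", "Track_Number", "Year", "Title",
}

# product-type-element-map.xml: product type -> elements mapped to it.
ELEMENT_MAP = {
    "GenericFile": [
        "CAS.ProductId", "CAS.ProductName", "CAS.ProductReceivedTime",
        "Filename", "FileLocation", "ProductType", "ProductStructure",
        "MimeType", "Album", "Artist", "Track_Number", "Year", "Title",
    ],
}


def undeclared_elements(elements, element_map):
    """Return {product type: sorted elements that are mapped but not declared}."""
    problems = {}
    for product_type, mapped in element_map.items():
        missing = sorted(set(mapped) - set(elements))
        if missing:
            problems[product_type] = missing
    return problems


def unmapped_elements(elements, element_map):
    """Return the sorted declared elements that no product type maps."""
    mapped = {name for names in element_map.values() for name in names}
    return sorted(set(elements) - mapped)


if __name__ == "__main__":
    print("undeclared:", undeclared_elements(ELEMENTS, ELEMENT_MAP))
    print("unmapped:", unmapped_elements(ELEMENTS, ELEMENT_MAP))
```

For the sample policy both checks come back empty. Mapping an element such as a hypothetical Genre without declaring it would be reported under the product type that maps it.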
File Manager Policy with Inheritance
- One set of File Manager Policy
- There are some standard elements common to ALL Data Sets
- Data Sets are heterogeneous
Example: You want to catalog and archive music AND video files. Both kinds of file can be grouped under the more generic title of MEDIA, and they share some metadata elements such as 'Title' and 'Year', but they diverge on format-specific terms (e.g. 'sample rate' vs. 'resolution').
We DO NOT want to repeat the elements that both Data Sets have in common, so we introduce the idea of PARENT product types within the product-type-element-map.xml file. Whatever elements a PARENT product type contains are inherited by its children. In the Sample Policy below, Title and Year are declared ONCE, on GenericFile, but the two child product types (Music and Video) can both use those elements during ingestion.
Sample Policy Overview
NOTE: Items that are default policy and listed in the first example have been replaced with CAS.DEFAULTS to save space.
product-types.xml
- GenericFile
- Music
- Video
elements.xml
- CAS.DEFAULTS
- Album
- Artist
- Track_Number
- Year
- Title
- Sample_Rate
- Resolution
- Chapters
- Aspect_Ratio
- Director
product-type-element-map.xml
- type=GenericFile
- +CAS.DEFAULTS
- +Year
- +Title
- type=Music parent=GenericFile
- +Album
- +Artist
- +Track_Number
- +Sample_Rate
- type=Video parent=GenericFile
- +Resolution
- +Chapters
- +Aspect_Ratio
- +Director
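To make the inheritance concrete, here is a minimal Python sketch that computes the full set of elements available to each product type in the sample policy above. It models only the behaviour described on this page (children inherit whatever their PARENT contains) and is not the File Manager's actual implementation. The names CAS_DEFAULTS, ELEMENT_MAP, and effective_elements are hypothetical, and CAS_DEFAULTS is assumed to stand for the default elements listed in the first example.

```python
# Assumption: CAS.DEFAULTS expands to the default elements from the
# first example on this page.
CAS_DEFAULTS = [
    "CAS.ProductId", "CAS.ProductName", "CAS.ProductReceivedTime",
    "Filename", "FileLocation", "ProductType", "ProductStructure",
    "MimeType",
]

# Hypothetical model of product-type-element-map.xml:
# product type -> (parent type or None, elements declared on the type).
ELEMENT_MAP = {
    "GenericFile": (None, CAS_DEFAULTS + ["Year", "Title"]),
    "Music": ("GenericFile",
              ["Album", "Artist", "Track_Number", "Sample_Rate"]),
    "Video": ("GenericFile",
              ["Resolution", "Chapters", "Aspect_Ratio", "Director"]),
}


def effective_elements(product_type, element_map):
    """Return the type's elements, inherited ones first, without duplicates."""
    # Walk up the parent chain, guarding against accidental cycles.
    chain, seen = [], set()
    current = product_type
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in parent chain at {current!r}")
        seen.add(current)
        chain.append(current)
        current = element_map[current][0]

    # Collect elements from the root ancestor down to the type itself.
    result = []
    for name in reversed(chain):
        for element in element_map[name][1]:
            if element not in result:
                result.append(element)
    return result


if __name__ == "__main__":
    for name in ELEMENT_MAP:
        print(name, "->", effective_elements(name, ELEMENT_MAP))
```

In this model Music and Video both pick up Year and Title from GenericFile, while Sample_Rate stays specific to Music and Resolution stays specific to Video.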