...
The DBConsumer maps UIMA annotations to a relational database using one table per UIMA annotation class. Primitive annotation attributes are mapped directly to table columns. Our strategy for mapping annotations to the database is a 1-to-1 mapping: what you see in the database should correspond exactly to what you see in the UIMA CAS viewer.
Note that you must perform the additional YTEX installation tasks to use this component; this involves setting up a database (MySQL/Oracle/SQL Server).
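The table-per-class idea can be illustrated with a minimal sketch. This is not the YTEX implementation: the class ExampleEntityAnno, its attributes, and the table name anno_example_entity are hypothetical, and an in-memory SQLite database stands in for MySQL/Oracle/SQL Server. It only shows how each primitive attribute of an annotation class could become one column, and each annotation instance one row.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class ExampleEntityAnno:
    """Hypothetical annotation class with primitive attributes only."""
    anno_base_id: int
    certainty: int
    confidence: float
    segment_id: str


# Python primitive type -> SQL column type (SQLite type names).
SQL_TYPES = {int: "INTEGER", float: "REAL", str: "TEXT", bool: "INTEGER"}


def create_table_sql(cls, table_name):
    """Build a CREATE TABLE statement with one column per primitive attribute."""
    columns = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(cls))
    return f"CREATE TABLE {table_name} ({columns})"


def insert_annotation(conn, table_name, anno):
    """Store one annotation instance as one row."""
    placeholders = ", ".join("?" for _ in fields(anno))
    conn.execute(f"INSERT INTO {table_name} VALUES ({placeholders})", astuple(anno))


# In-memory SQLite is only a stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql(ExampleEntityAnno, "anno_example_entity"))
insert_annotation(conn, "anno_example_entity", ExampleEntityAnno(1, 1, 0.9, "S1"))
row = conn.execute("SELECT * FROM anno_example_entity").fetchone()
```

Because the mapping is 1-to-1, the row read back carries the same attribute values as the annotation object.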
DBConsumer Component Configuration
The DBConsumer UIMA Analysis Engine accepts the following configuration properties:
- Analysis Batch
Typically, you will want to annotate different document collections, or annotate the same document collection with different pipelines/configurations. The analysis_batch identifies a document annotation run or a document collection. It is stored in the document.analysis_batch column.
- Store CAS
Should the UIMA XML representation of document annotations be stored in the database? The gzipped UIMA XML is stored in the document.cas column. Set to false to speed up the DBConsumer.
- Store Doc Text
Should the document text be stored in the database? The document text is stored in the document.doc_text column. Set to false to speed up the DBConsumer.
- XMI Output Directory
Directory where UIMA XML representations of document annotations should be stored. If empty, they will not be stored in the file system.
- Types to Ignore
UIMA annotations that should not be stored in the database. Add annotations that you are not interested in to this list to speed up the DBConsumer. Take a look at the ref_uima_type table for a list of types stored in the database. The class name should give you an idea of what each annotation represents.
- Insert Annotation Containment Links
Should anno_contain entries be created? Set to false to speed up the DBConsumer. See anno_contain (below) for information on what this is.
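As a rough illustration of how the analysis_batch, cas, and doc_text columns might be used after a run, the sketch below selects the documents of one batch and decompresses the stored UIMA XML. The column names analysis_batch, cas, and doc_text come from the description above; everything else is an assumption: the key column document_id, the batch names, UTF-8 encoding of the XML, NULL values when Store CAS or Store Doc Text is false, and the in-memory SQLite database used as a stand-in.

```python
import gzip
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the document table; document_id is an assumed key name.
conn.execute(
    "CREATE TABLE document ("
    "document_id INTEGER PRIMARY KEY, analysis_batch TEXT, cas BLOB, doc_text TEXT)"
)

xmi = "<xmi:XMI>placeholder</xmi:XMI>"  # placeholder, not real CAS output
# Run with Store CAS and Store Doc Text enabled.
conn.execute(
    "INSERT INTO document (analysis_batch, cas, doc_text) VALUES (?, ?, ?)",
    ("batch-a", gzip.compress(xmi.encode("utf-8")), "Patient denies chest pain."),
)
# Run with both options disabled (assumed here to leave the columns NULL).
conn.execute(
    "INSERT INTO document (analysis_batch, cas, doc_text) VALUES (?, ?, ?)",
    ("batch-b", None, None),
)


def load_batch(conn, analysis_batch):
    """Return (document_id, uima_xml_or_None, doc_text_or_None) per document."""
    rows = conn.execute(
        "SELECT document_id, cas, doc_text FROM document "
        "WHERE analysis_batch = ? ORDER BY document_id",
        (analysis_batch,),
    ).fetchall()
    return [
        (doc_id, gzip.decompress(cas).decode("utf-8") if cas is not None else None, text)
        for doc_id, cas, text in rows
    ]


batch_a = load_batch(conn, "batch-a")
batch_b = load_batch(conn, "batch-b")
```

Filtering on analysis_batch keeps runs with different pipelines or configurations apart, even when they annotate the same documents.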
Annotation Tables
document
The document table represents a single note/document. The columns are:
...
- parent_anno_base_id - foreign key to anno_base table. Represents the parent or containing concept.
- parent_uima_type_id - foreign key to ref_uima_type, the class of the parent annotation.
- child_anno_base_id - foreign key to anno_base table. Represents the child or contained concept.
- child_uima_type_id - foreign key to ref_uima_type, the class of the child annotation.
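The containment columns listed above can be queried as in the following sketch, which looks up the annotations of a given class contained in a parent annotation. The four anno_contain columns are taken from the list above. The rest is assumed for illustration: the columns of anno_base and ref_uima_type (anno_base_id, uima_type_id, uima_type_name), the type names Sentence and NamedEntity, the sample rows, and the in-memory SQLite stand-in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-ins; only the anno_contain columns are from the documentation.
conn.executescript("""
CREATE TABLE ref_uima_type (uima_type_id INTEGER PRIMARY KEY, uima_type_name TEXT);
CREATE TABLE anno_base (anno_base_id INTEGER PRIMARY KEY, uima_type_id INTEGER);
CREATE TABLE anno_contain (
    parent_anno_base_id INTEGER,
    parent_uima_type_id INTEGER,
    child_anno_base_id INTEGER,
    child_uima_type_id INTEGER
);
""")
conn.executemany(
    "INSERT INTO ref_uima_type VALUES (?, ?)",
    [(1, "Sentence"), (2, "NamedEntity")],  # hypothetical type names
)
conn.executemany(
    "INSERT INTO anno_base VALUES (?, ?)",
    [(10, 1), (11, 2), (12, 1), (13, 2)],
)
# Sentence 10 contains entity 11; sentence 12 contains entity 13.
conn.executemany(
    "INSERT INTO anno_contain VALUES (?, ?, ?, ?)",
    [(10, 1, 11, 2), (12, 1, 13, 2)],
)


def contained_annotations(conn, parent_anno_base_id, child_type_name):
    """Return anno_base_ids of children of the given class inside one parent."""
    rows = conn.execute(
        """
        SELECT c.child_anno_base_id
        FROM anno_contain c
        JOIN ref_uima_type t ON t.uima_type_id = c.child_uima_type_id
        WHERE c.parent_anno_base_id = ? AND t.uima_type_name = ?
        ORDER BY c.child_anno_base_id
        """,
        (parent_anno_base_id, child_type_name),
    ).fetchall()
    return [child_id for (child_id,) in rows]


entities_in_sentence_10 = contained_annotations(conn, 10, "NamedEntity")
```

Since the link rows carry the type IDs of both parent and child, a query like this can filter by annotation class without joining anno_base.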
Mapping Configuration
Annotation mapping is driven purely by configuration. To map a new annotation, do the following:
...