The DBConsumer is a UIMA Annotation Engine that stores CAS annotations and the XML CAS representation in the database.
YTEX maps UIMA annotations to a relational database using a table per annotation class: one table exists for each UIMA annotation class, and primitive annotation attributes are mapped directly to table columns. Our strategy was to perform a 1-to-1 mapping, so that what you see in the database corresponds exactly to what you see in the UIMA CAS viewer.
Annotation Tables
document
The document table represents a single note/document. The columns are
...
Annotation subclasses may have additional attributes; these attributes are stored in additional tables prefixed with anno_. E.g. additional attributes of the Sentence annotation are stored in the anno_sentence table. The primary key of these annotation subclass tables corresponds to the primary key of the anno_base table (i.e. it is also a foreign key).
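The shared-key layout described above can be sketched with Python's built-in sqlite3 module. Only the table names anno_base and anno_sentence and the primary-key/foreign-key relationship come from the description; the columns uima_type, span_begin, span_end and sentence_number are hypothetical stand-ins, and the real YTEX schema may differ.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a base table and one subclass table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.executescript("""
        CREATE TABLE anno_base (
            anno_base_id INTEGER PRIMARY KEY,
            uima_type    TEXT,      -- hypothetical column
            span_begin   INTEGER,   -- hypothetical column
            span_end     INTEGER    -- hypothetical column
        );
        -- The subclass table's primary key is also a foreign key to anno_base.
        CREATE TABLE anno_sentence (
            anno_base_id    INTEGER PRIMARY KEY
                            REFERENCES anno_base(anno_base_id),
            sentence_number INTEGER -- hypothetical column
        );
    """)
    return conn

def sentences_with_spans(conn):
    """Join each subclass row back to its base row via the shared key."""
    return conn.execute("""
        SELECT b.anno_base_id, b.span_begin, b.span_end, s.sentence_number
        FROM anno_base b
        JOIN anno_sentence s ON s.anno_base_id = b.anno_base_id
        ORDER BY b.anno_base_id
    """).fetchall()

conn = build_demo_db()
conn.execute("INSERT INTO anno_base VALUES (1, 'Sentence', 0, 42)")
conn.execute("INSERT INTO anno_sentence VALUES (1, 0)")
rows = sentences_with_spans(conn)
```

Because the subclass key doubles as a foreign key, a row in anno_sentence cannot exist without its anno_base row, which is what keeps the mapping 1-to-1.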
anno_token
This is mapped to the edu.mayo.bmi.uima.core.type.NumToken, edu.mayo.bmi.uima.core.type.WordToken, and ytex.uima.types.WordToken annotations.
- anno_base_id - foreign key to anno_base, also primary key for this table
- tokenNumber - from BaseToken
- normalizedForm - from BaseToken
- partOfSpeech - from BaseToken
- coveredText - the text spanned by this token
- capitalization - number of capital letters in the word: 0 = none, 1 = one, 2 = two, 3 = three or more
- numPosition - 1st position of number within word
- canonicalForm - uninflected lower case word form, set by LVGAnnotator
- negated - 1 - word is negated, 0 - word is not negated (based on negex)
- possible - 1 - possible (from negex)
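As an illustration of the columns listed above, the sketch below codes the capitalization value as described (the count of capital letters, capped at 3) and selects negated tokens. The function name capitalization_code, the column types, and the sample rows are assumptions made for this example; this is not YTEX code, and only a subset of the columns is shown.

```python
import sqlite3

def capitalization_code(word):
    """Return the number of capital letters in the word, capped at 3."""
    return min(sum(1 for ch in word if ch.isupper()), 3)

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE anno_token (
        anno_base_id   INTEGER PRIMARY KEY,
        coveredText    TEXT,
        capitalization INTEGER,
        negated        INTEGER
    )
""")

# Made-up sample tokens: (anno_base_id, coveredText, negated)
samples = [(1, "No", 0), (2, "Fever", 1), (3, "MRI", 0)]
conn.executemany(
    "INSERT INTO anno_token VALUES (?, ?, ?, ?)",
    [(i, text, capitalization_code(text), neg) for i, text, neg in samples],
)

# Tokens flagged as negated (negated = 1)
negated_tokens = [
    row[0]
    for row in conn.execute(
        "SELECT coveredText FROM anno_token WHERE negated = 1 ORDER BY anno_base_id"
    )
]
```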
anno_med_event
This is mapped to the cTAKES Medicationevent annotation.
...
Feature Structure Tables
In addition to Annotations, UIMA defines FeatureStructures; these are typically not 'free standing' annotations, but usually sit 'inside' an Annotation. E.g. the Medicationevent and EntityMention annotations have arrays of OntologyConcepts. FeatureStructures are also mapped to anno_[subclass] tables, e.g. OntologyConcepts are mapped to the anno_ontology_concept table, and have a foreign key to the annotation 'within which' they reside (one-to-many relationship).
anno_ontology_concept
This is mapped to the cTAKES OntologyConceptArr of the Medicationevent or EntityMention annotation; these are the concepts (CUIs) of a Named Entity:
- anno_ontology_concept_id - unique system generated id
- anno_base_id - foreign key to named_entity
- code - CUI
- disambiguated - used by SenseDisambiguatorAnnotator. Set to 1 if this concept is the best sense or only sense for the given named entity. Set to 0 (default) otherwise, or if the annotator is not used.
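The one-to-many relationship between an annotation and its concepts can be sketched as follows. The column names are the ones listed above; the column types, the helper concepts_for, and the CUI values are placeholders invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE anno_ontology_concept (
        anno_ontology_concept_id INTEGER PRIMARY KEY AUTOINCREMENT,
        anno_base_id  INTEGER NOT NULL,   -- the annotation this concept belongs to
        code          TEXT,               -- CUI
        disambiguated INTEGER DEFAULT 0
    )
""")

# Annotation 10 has two candidate concepts, one marked as the best sense;
# annotation 11 has a single concept and no disambiguation was run.
conn.executemany(
    "INSERT INTO anno_ontology_concept (anno_base_id, code, disambiguated) "
    "VALUES (?, ?, ?)",
    [(10, "C0000001", 1), (10, "C0000002", 0), (11, "C0000003", 0)],
)

def concepts_for(conn, anno_base_id, best_only=False):
    """Return the concept codes attached to one annotation."""
    sql = "SELECT code FROM anno_ontology_concept WHERE anno_base_id = ?"
    if best_only:
        sql += " AND disambiguated = 1"
    sql += " ORDER BY anno_ontology_concept_id"
    return [row[0] for row in conn.execute(sql, (anno_base_id,))]

all_codes = concepts_for(conn, 10)
best_codes = concepts_for(conn, 10, best_only=True)
```

Filtering on disambiguated = 1 returns nothing for annotations that were never processed by the SenseDisambiguatorAnnotator, since the column defaults to 0 in that case.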
anno_mm_candidate
MetaMap Candidate annotations are mapped to this table.
...
Annotation Relationship Modeling
UIMA annotations can also hold references to other UIMA annotations. For example, the TreeBankNode annotation represents a node in a parse tree and has references to its parent and child TreeBankNode annotations. Rows in the anno_link table represent these annotation links.
...
This is a Spring bean configuration file that allows further mapping customization, e.g. mapping attributes to columns with different names.
Data Model
Below is an entity-relationship diagram.
...