The DBConsumer is a UIMA Annotation Engine that stores CAS annotations and the XML CAS representation in the database.

The DBConsumer maps UIMA annotations to a relational database using one table per annotation class. Primitive annotation attributes are mapped directly to table columns. Our strategy for mapping annotations to the database was to perform a 1-to-1 mapping: what you see in the database should correspond exactly to what you see in the UIMA CAS viewer.

 

Annotation Tables

document

The document table represents a single note/document. The columns are

...

Annotation subclasses may have additional attributes; these attributes are stored in additional tables prefixed with anno_. E.g. additional attributes of the Sentence annotation are stored in the anno_sentence table. The primary key of these annotation subclass tables corresponds to the primary key of the anno_base table (i.e. it is also a foreign key).
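The shared-key layout described above can be sketched with an in-memory SQLite database. This is only an illustration under stated assumptions: the anno_base table is reduced to its key (its other columns are not reproduced here), sentence_number is a hypothetical attribute, and the helper function names are made up for this example. The actual YTEX schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    -- Base table: one row per annotation. Only the key is shown;
    -- the real table has further columns that are omitted here.
    CREATE TABLE anno_base (
        anno_base_id INTEGER PRIMARY KEY
    );
    -- Subclass table: its primary key is also a foreign key to anno_base.
    -- sentence_number is a hypothetical attribute used for illustration.
    CREATE TABLE anno_sentence (
        anno_base_id INTEGER PRIMARY KEY REFERENCES anno_base(anno_base_id),
        sentence_number INTEGER
    );
""")
conn.execute("INSERT INTO anno_base (anno_base_id) VALUES (1)")
conn.execute("INSERT INTO anno_sentence (anno_base_id, sentence_number) VALUES (1, 0)")


def sentence_rows(connection):
    """Join the base table with the subclass table on the shared key."""
    return connection.execute(
        "SELECT b.anno_base_id, s.sentence_number "
        "FROM anno_base b JOIN anno_sentence s ON s.anno_base_id = b.anno_base_id"
    ).fetchall()


def orphan_is_rejected(connection):
    """Return True if a subclass row without a matching base row is refused."""
    try:
        connection.execute(
            "INSERT INTO anno_sentence (anno_base_id, sentence_number) VALUES (99, 1)"
        )
        return False
    except sqlite3.IntegrityError:
        return True
```

Because the subclass key doubles as a foreign key, every anno_sentence row has exactly one anno_base row, and a plain join on anno_base_id recovers the full annotation.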

anno_token

This is mapped to the edu.mayo.bmi.uima.core.type.NumToken, edu.mayo.bmi.uima.core.type.WordToken, and ytex.uima.types.WordToken annotations.

  • anno_base_id - foreign key to anno_base, also primary key for this table
  • tokenNumber - from BaseToken
  • normalizedForm - from BaseToken
  • partOfSpeech - from BaseToken
  • coveredText - the text spanned by this token
  • capitalization - number of capital letters in the word: 0 - none, 1 - one, 2 - two, 3 - three or more
  • numPosition - first position of a number within the word
  • canonicalForm - uninflected lowercase word form, set by LVGAnnotator
  • negated - 1 - word is negated, 0 - word is not negated (based on NegEx)
  • possible - 1 - possible (from NegEx)
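To make the capitalization codes concrete, the following minimal sketch reproduces the encoding as it is described in the list above. The function name is hypothetical, and this is not the code that cTAKES or YTEX uses to populate the column; it only illustrates how the stored values read.

```python
def capitalization_code(word: str) -> int:
    """Return 0, 1, 2 or 3 for zero, one, two, or three-or-more capital letters."""
    capitals = sum(1 for ch in word if ch.isupper())
    # Counts above three collapse into the same code, as in the column description.
    return min(capitals, 3)
```

Under this reading, "aspirin" maps to 0, "Aspirin" to 1, "McCoy" to 2, and "COPD" to 3.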

 

anno_med_event

This is mapped to the cTAKES Medicationevent annotation.

Feature

...

Structure Tables

In addition to Annotations, UIMA defines FeatureStructs; these are typically not 'free-standing' annotations but usually sit 'inside' an Annotation. For example, the Medicationevent and EntityMention annotations have arrays of OntologyConcepts. FeatureStructs are also mapped to anno_[subclass] tables: OntologyConcepts are mapped to the anno_ontology_concept table and have a foreign key to the annotation within which they reside (a one-to-many relationship).

anno_ontology_concept

This is mapped to the cTAKES OntologyConceptArr of the Medicationevent or EntityMention annotation; these are the concepts (CUIs) of a Named Entity:

  • anno_ontology_concept_id - unique system generated id
  • anno_base_id - foreign key to named_entity
  • code - CUI
  • disambiguated - used by SenseDisambiguatorAnnotator. Set to 1 if this concept is the best sense or only sense for the given named entity. Set to 0 (default) otherwise, or if the annotator is not used.
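The one-to-many relationship between an annotation and its concepts can be sketched as follows, using the four columns listed above in an in-memory SQLite table. The column types, the CUI values, and the helper function are assumptions made for illustration; they are not taken from the YTEX schema scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE anno_ontology_concept (
        anno_ontology_concept_id INTEGER PRIMARY KEY AUTOINCREMENT,
        anno_base_id INTEGER NOT NULL,              -- the containing annotation
        code TEXT NOT NULL,                         -- CUI
        disambiguated INTEGER NOT NULL DEFAULT 0    -- 1 = best or only sense
    );
""")
# Placeholder CUIs: annotation 10 has two candidate concepts, annotation 11 has one.
conn.executemany(
    "INSERT INTO anno_ontology_concept (anno_base_id, code, disambiguated) "
    "VALUES (?, ?, ?)",
    [(10, "C0000001", 1), (10, "C0000002", 0), (11, "C0000003", 0)],
)


def concepts_for(connection, anno_base_id, best_only=False):
    """Return the CUIs attached to one annotation, optionally only the best sense."""
    sql = "SELECT code FROM anno_ontology_concept WHERE anno_base_id = ?"
    if best_only:
        sql += " AND disambiguated = 1"
    sql += " ORDER BY code"
    return [row[0] for row in connection.execute(sql, (anno_base_id,))]
```

Filtering on disambiguated = 1 returns nothing for annotations that were never processed by SenseDisambiguatorAnnotator, since the column defaults to 0 in that case.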

 

anno_mm_candidate

Metamap Candidate annotations are mapped to this table.

Annotation

...

Relationship Modeling

UIMA annotations can also hold references to other UIMA annotations. For example, the TreeBankNode annotation represents a node in a parse tree and has references to its parent and child TreeBankNode annotations. Rows in the anno_link table represent these annotation links.

...

This is a spring bean configuration file that allows more mapping customization, e.g. mapping attributes to columns with different names.

 

Data Model

Below is an entity-relationship diagram.

...