...

The DBConsumer maps UIMA annotations to a relational database using one table per UIMA annotation class. Primitive annotation attributes are mapped directly to table columns. The mapping is deliberately 1-to-1: what you see in the database should correspond exactly to what you see in the UIMA CAS viewer.
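As a rough illustration of the table-per-class idea, the sketch below builds a toy schema in an in-memory SQLite database. Only the table names document, anno_base and ref_uima_type and the columns analysis_batch and doc_text come from this page. The anno_example_token table, every other column name, the sample data, and the use of SQLite are assumptions made for the example and should not be read as the actual YTEX schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document (
    document_id INTEGER PRIMARY KEY,   -- assumed key column
    analysis_batch TEXT,
    doc_text TEXT
);
CREATE TABLE ref_uima_type (
    uima_type_id INTEGER PRIMARY KEY,  -- assumed key column
    uima_type_name TEXT                -- assumed: annotation class name
);
CREATE TABLE anno_base (
    anno_base_id INTEGER PRIMARY KEY,  -- assumed key column
    document_id INTEGER REFERENCES document(document_id),
    uima_type_id INTEGER REFERENCES ref_uima_type(uima_type_id),
    span_begin INTEGER,                -- assumed: character offsets
    span_end INTEGER
);
-- Hypothetical per-class table: one row per annotation of that class,
-- one column per primitive attribute of the annotation.
CREATE TABLE anno_example_token (
    anno_base_id INTEGER PRIMARY KEY REFERENCES anno_base(anno_base_id),
    part_of_speech TEXT
);
""")

# One document containing one annotation of the hypothetical class.
conn.execute("INSERT INTO document VALUES (1, 'demo-batch', 'Aspirin helps.')")
conn.execute("INSERT INTO ref_uima_type VALUES (1, 'example.ExampleToken')")
conn.execute("INSERT INTO anno_base VALUES (10, 1, 1, 0, 7)")
conn.execute("INSERT INTO anno_example_token VALUES (10, 'NN')")

# Reassemble what a CAS viewer would show: class, covered text, attribute.
row = conn.execute("""
    SELECT t.uima_type_name,
           substr(d.doc_text, b.span_begin + 1, b.span_end - b.span_begin),
           k.part_of_speech
    FROM anno_example_token k
    JOIN anno_base b ON b.anno_base_id = k.anno_base_id
    JOIN ref_uima_type t ON t.uima_type_id = b.uima_type_id
    JOIN document d ON d.document_id = b.document_id
""").fetchone()
print(row)
```

The per-class table holds only the class-specific attribute, while the shared fields (document, type, span) live in anno_base. That split is one plausible way to realize a 1-to-1 mapping and is shown here only as an assumption.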

Note that you must perform the additional YTEX installation tasks to use this component; this involves setting up a database (MySQL/Oracle/SQL Server).

DBConsumer Component Configuration

Add the DBConsumer to the end of your pipeline, or add it to your CPE descriptor; the annotator configuration file is YTEX_HOME\desc\ctakes-ytex-uima\desc\analysis_engine\DBConsumer.xml

The DBConsumer UIMA Analysis Engine accepts the following configuration properties:

  • Analysis Batch
    Typically, you will want to annotate different document collections, or you may want to annotate the same document collection with different pipelines/configurations. The analysis_batch value identifies a document annotation run or document collection. It is stored in the document.analysis_batch column.
  • Store CAS
    Should the UIMA XML representation of document annotations be stored in the database? The gzipped UIMA XML is stored in the document.cas column. Set to false to speed up the DBConsumer. Use the DBAnnotationViewer to view the CAS directly from the database.
  • Store Doc Text
    Should the document text be stored in the database? The document text is stored in the document.doc_text column. Set to false to speed up the DBConsumer.
  • XMI Output Directory
    Directory where UIMA XML representations of document annotations should be stored; if empty they will not be stored in the file system.
  • Types to Ignore
    UIMA annotations that should not be stored in the database. Add annotations that you are not interested in to this list to speed up the DBConsumer. Take a look at the ref_uima_type table for a list of types stored in the database. The class name should give you an idea of what each annotation represents.
  • Insert Annotation Containment Links
    Should anno_contain entries be created? Set to false to speed up the DBConsumer. See anno_contain (below) for information on what this is.
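To make the Store CAS option more concrete, here is a minimal sketch of reading a gzipped XML value back out of the document.cas column. An in-memory SQLite table stands in for the real database. The document_id column, the sample XML string, and the helper name read_cas_xml are assumptions for illustration, and the exact compression details used by the DBConsumer may differ.

```python
import gzip
import sqlite3


def read_cas_xml(conn, document_id):
    """Return the decompressed CAS XML for a document, or None if absent."""
    row = conn.execute(
        "SELECT cas FROM document WHERE document_id = ?", (document_id,)
    ).fetchone()
    if row is None or row[0] is None:
        # Unknown document, or the CAS was not stored for it.
        return None
    return gzip.decompress(row[0]).decode("utf-8")


# Stand-in table; only analysis_batch and cas are column names from this page.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document ("
    "document_id INTEGER PRIMARY KEY, analysis_batch TEXT, cas BLOB)"
)

sample_xml = '<?xml version="1.0" encoding="UTF-8"?><xmi:XMI xmlns:xmi="http://www.omg.org/XMI"/>'
conn.execute(
    "INSERT INTO document VALUES (?, ?, ?)",
    (1, "demo-batch", gzip.compress(sample_xml.encode("utf-8"))),
)
# A document processed with CAS storage switched off has no CAS value.
conn.execute("INSERT INTO document VALUES (?, ?, ?)", (2, "demo-batch", None))

print(read_cas_xml(conn, 1))
print(read_cas_xml(conn, 2))
```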


 

Using YTEX DBAnnotationViewer

 

For a graphical representation of document annotations, use the DBAnnotationViewer. This modified viewer retrieves the document CAS from the database (as opposed to the plain-vanilla AnnotationViewer, which retrieves the CAS from the file system). To run it, open a command prompt or shell and run the following commands.

Windows:

cd CTAKES_HOME
bin\setenv.bat
java -cp lib/*;desc;resources org.apache.ctakes.ytex.tools.DBAnnotationViewerMain

Linux:

cd CTAKES_HOME
. bin/ctakes.profile
# Quote the classpath so the shell does not expand lib/*; entries are separated by ':' on Linux
java -cp "lib/*:desc:resources" org.apache.ctakes.ytex.tools.DBAnnotationViewerMain
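
Java's classpath separator is platform-specific (';' on Windows, ':' on Linux and other Unix-like systems). As an optional convenience, the sketch below builds the viewer command with the separator of the current platform. The helper names viewer_command and launch_viewer are hypothetical, and the sketch assumes that the environment script has already been run and that java is on the PATH.

```python
import os
import subprocess

VIEWER_MAIN = "org.apache.ctakes.ytex.tools.DBAnnotationViewerMain"


def viewer_command(entries=("lib/*", "desc", "resources")):
    """Build the java command line using the platform's classpath separator."""
    classpath = os.pathsep.join(entries)
    return ["java", "-cp", classpath, VIEWER_MAIN]


def launch_viewer(ctakes_home):
    """Start the viewer from the given installation directory."""
    # Passing a list bypasses the shell, so "lib/*" reaches java unexpanded.
    return subprocess.run(viewer_command(), cwd=ctakes_home, check=False)


cmd = viewer_command()
print(" ".join(cmd))
```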


Annotation Tables

document

...

  • parent_anno_base_id - foreign key to anno_base table. Represents the parent or containing concept.
  • parent_uima_type_id - foreign key to ref_uima_type, the class of the parent annotation.
  • child_anno_base_id - foreign key to anno_base table. Represents the child or contained concept.
  • child_uima_type_id - foreign key to ref_uima_type, the class of the child annotation.
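
As a sketch of how these four columns might be queried, the example below lists the annotations of one class that are contained in a given parent annotation. An in-memory SQLite database stands in for the real one. The four anno_contain column names are taken from the list above; the anno_base and ref_uima_type columns, the class names, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ref_uima_type (
    uima_type_id INTEGER PRIMARY KEY,  -- assumed key column
    uima_type_name TEXT                -- assumed: annotation class name
);
CREATE TABLE anno_base (
    anno_base_id INTEGER PRIMARY KEY,  -- assumed key column
    uima_type_id INTEGER,
    span_begin INTEGER,                -- assumed: character offsets
    span_end INTEGER
);
CREATE TABLE anno_contain (
    parent_anno_base_id INTEGER,
    parent_uima_type_id INTEGER,
    child_anno_base_id INTEGER,
    child_uima_type_id INTEGER
);
-- Hypothetical data: one sentence-like annotation containing two tokens.
INSERT INTO ref_uima_type VALUES (1, 'example.Sentence'), (2, 'example.Token');
INSERT INTO anno_base VALUES (100, 1, 0, 14), (101, 2, 0, 7), (102, 2, 8, 13);
INSERT INTO anno_contain VALUES (100, 1, 101, 2), (100, 1, 102, 2);
""")


def children_of(conn, parent_anno_base_id, child_type_name):
    """Return (anno_base_id, span_begin, span_end) of contained annotations."""
    return conn.execute(
        """
        SELECT c.anno_base_id, c.span_begin, c.span_end
        FROM anno_contain ac
        JOIN anno_base c ON c.anno_base_id = ac.child_anno_base_id
        JOIN ref_uima_type t ON t.uima_type_id = ac.child_uima_type_id
        WHERE ac.parent_anno_base_id = ? AND t.uima_type_name = ?
        ORDER BY c.span_begin
        """,
        (parent_anno_base_id, child_type_name),
    ).fetchall()


tokens = children_of(conn, 100, "example.Token")
print(tokens)
```

Because the link row carries the type of both parent and child, a query can filter by annotation class on anno_contain itself; that is presumably why the type ids are repeated there, though this page does not say so explicitly.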

 

Mapping Configuration

Annotation mapping is driven purely by configuration. To map a new annotation, do the following:

...