...
The DBConsumer maps UIMA annotations to a relational database using one table per UIMA annotation class. Primitive annotation attributes are mapped directly to table columns. The mapping is intended to be 1-to-1: what you see in the database should correspond exactly to what you see in the UIMA CAS viewer.
Note that you must perform the additional YTEX installation tasks to use this component; this involves setting up a database (MySQL/Oracle/SQL Server).
DBConsumer Component Configuration
Add the DBConsumer to the end of your pipeline, or add it to your CPE descriptor; the annotator configuration file is YTEX_HOME\desc\ctakes-ytex-uima\desc\analysis_engine\DBConsumer.xml
The DBConsumer UIMA analysis engine accepts the following configuration properties:
- Analysis Batch
Typically, you will want to annotate different document collections, or you may want to annotate the same document collection with different pipelines/configurations. The analysis_batch is a way to identify document annotation runs or document collections. It is stored in the document.analysis_batch column.
- Store CAS
Should the UIMA XML representation of document annotations be stored in the database? The gzipped UIMA XML is stored in the document.cas column. Set to false to speed up the DBConsumer. Use the DBAnnotationViewer to view the CAS directly from the database.
- Store Doc Text
Should the document text be stored in the database? The document text is stored in the document.doc_text column. Set to false to speed up the DBConsumer.
- XMI Output Directory
Directory where UIMA XML representations of document annotations should be stored; if empty they will not be stored in the file system.
- Types to Ignore
UIMA annotations that should not be stored in the database. Add annotations that you are not interested in to this list to speed up the DBConsumer. Take a look at the ref_uima_type table for a list of types stored in the database. The class name should give you an idea of what each annotation represents.
- Insert Annotation Containment Links
Should anno_contain entries be created? Set to false to speed up the DBConsumer. See anno_contain (below) for information on what this is.
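The document-level properties above determine what can later be read back from the document table. As a rough sketch only, the Python snippet below uses an in-memory SQLite database as a stand-in for the real MySQL/Oracle/SQL Server schema. The column names analysis_batch, cas, and doc_text come from the descriptions above; the document_id column, the batch name, and the sample content are hypothetical. It selects the documents of one analysis batch and decompresses the stored CAS.

```python
import gzip
import sqlite3

# In-memory SQLite stands in for the real YTEX database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document ("
    " document_id INTEGER PRIMARY KEY,"  # hypothetical key column
    " analysis_batch TEXT,"              # set via the Analysis Batch property
    " cas BLOB,"                         # gzipped UIMA XML (Store CAS)
    " doc_text TEXT)"                    # document text (Store Doc Text)
)

# Hypothetical sample row; real rows are written by the DBConsumer.
sample_xml = "<xmi:XMI>sample</xmi:XMI>"
conn.execute(
    "INSERT INTO document (analysis_batch, cas, doc_text) VALUES (?, ?, ?)",
    ("batch-2013-01", gzip.compress(sample_xml.encode("utf-8")), "Patient denies pain."),
)


def load_batch(connection, batch):
    """Return (document_id, doc_text, cas_xml) for every document in a batch."""
    rows = connection.execute(
        "SELECT document_id, doc_text, cas FROM document WHERE analysis_batch = ?",
        (batch,),
    ).fetchall()
    result = []
    for doc_id, text, cas in rows:
        # Assumption: an empty (NULL) cas means the CAS was not stored.
        xml = gzip.decompress(cas).decode("utf-8") if cas is not None else None
        result.append((doc_id, text, xml))
    return result


docs = load_batch(conn, "batch-2013-01")
```

Filtering on analysis_batch keeps runs made with different pipelines or configurations apart, which is the purpose of that property.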
Using YTEX DBAnnotationViewer
For a graphical representation of document annotations, use the DBAnnotationViewer. This modified viewer retrieves the document CAS from the database (as opposed to the plain-vanilla AnnotationViewer, which retrieves the CAS from the file system). To run it, open a command prompt or shell and run the following commands.
Windows:
    cd CTAKES_HOME
    bin\setenv.bat
    java -cp lib/*;desc;resources org.apache.ctakes.ytex.tools.DBAnnotationViewerMain
Linux:
    cd CTAKES_HOME
    . bin/ctakes.profile
    java -cp "lib/*:desc:resources" org.apache.ctakes.ytex.tools.DBAnnotationViewerMain
Annotation Tables
document
...
- parent_anno_base_id - foreign key to anno_base table. Represents the parent or containing concept.
- parent_uima_type_id - foreign key to ref_uima_type, the class of the parent annotation.
- child_anno_base_id - foreign key to anno_base table. Represents the child or contained concept.
- child_uima_type_id - foreign key to ref_uima_type, the class of the child annotation.
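To illustrate how these four columns can be used, here is a minimal sketch that finds every annotation contained in a given parent annotation. It uses an in-memory SQLite database as a stand-in for the real schema. The anno_contain columns are the ones listed above, whereas the ref_uima_type columns (uima_type_id, uima_type_name), the type names, and the sample ids are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the real YTEX database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ref_uima_type (
    uima_type_id INTEGER PRIMARY KEY,   -- assumed column name
    uima_type_name TEXT                 -- assumed column name
);
CREATE TABLE anno_contain (
    parent_anno_base_id INTEGER,
    parent_uima_type_id INTEGER,
    child_anno_base_id INTEGER,
    child_uima_type_id INTEGER
);
-- hypothetical type names and annotation ids
INSERT INTO ref_uima_type VALUES (1, 'Sentence'), (2, 'NamedEntity'), (3, 'WordToken');
-- annotation 100 (a sentence) contains a named entity and two word tokens
INSERT INTO anno_contain VALUES (100, 1, 101, 2), (100, 1, 102, 3), (100, 1, 103, 3);
""")


def children_of(connection, parent_id, child_type=None):
    """List (child_anno_base_id, type name) pairs contained in a parent annotation."""
    sql = (
        "SELECT ac.child_anno_base_id, t.uima_type_name "
        "FROM anno_contain ac "
        "JOIN ref_uima_type t ON t.uima_type_id = ac.child_uima_type_id "
        "WHERE ac.parent_anno_base_id = ?"
    )
    params = [parent_id]
    if child_type is not None:
        # Optionally restrict the result to one child annotation class.
        sql += " AND t.uima_type_name = ?"
        params.append(child_type)
    sql += " ORDER BY ac.child_anno_base_id"
    return connection.execute(sql, params).fetchall()


all_children = children_of(conn, 100)
entities = children_of(conn, 100, "NamedEntity")
```

Because the type ids of both parent and child are stored on each row, a query like this can filter by annotation class without first joining back to anno_base.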
Mapping Configuration
Annotation mapping is driven entirely by configuration. To map a new annotation, do the following:
...