Overview of Clinical Documents Pipeline
This is the top-level project for processing a clinical document through the entire cTAKES pipeline, including sentence detection, part-of-speech (POS) tagging, chunking, named entity recognition, context detection, and negation detection.
...
- plain text files
- Clinical Document Architecture (CDA) XML files that conform to the DTD provided
Analysis engines (annotators)
AggregateCdaProcessor.xml for CDA documents conforming to the provided DTD
The file desc/analysis_engine/AggregateCdaProcessor.xml is the aggregate analysis engine that runs the entire pipeline on CDA documents. It includes the CdaCasInitializer analysis engine, which reads CDA documents that conform to the provided DTD and creates Segment annotations based on the sections within the CDA document.
...
ChunkerCreatorClass - the full class name of an implementation of the interface edu.mayo.bmi.uima.chunker.ChunkerCreator
AggregatePlaintextProcessor.xml for plain text documents
The file desc/analysis_engine/AggregatePlaintextProcessor.xml is the aggregate analysis engine that runs the entire pipeline on plain text documents. It includes the SimpleSegmentAnnotator analysis engine, which creates a Segment annotation that wraps the entire plain text document. This matters because the other annotators in the pipeline require at least one Segment annotation.
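The choice between the two aggregate descriptors, and the idea of a Segment that wraps a whole plain text document, can be sketched as follows. This is only an illustration and not cTAKES code: the function names, the `Segment` class, and the rule that `.xml` files are CDA documents are all assumptions made for the example. Only the two descriptor paths come from this page.

```python
from dataclasses import dataclass
from pathlib import Path

# Descriptor paths as named on this page.
CDA_DESCRIPTOR = "desc/analysis_engine/AggregateCdaProcessor.xml"
PLAINTEXT_DESCRIPTOR = "desc/analysis_engine/AggregatePlaintextProcessor.xml"


def choose_descriptor(document_path: str) -> str:
    """Pick an aggregate descriptor for a document (hypothetical helper)."""
    # Assumption: CDA documents are stored with an .xml extension and
    # everything else is treated as plain text.
    if Path(document_path).suffix.lower() == ".xml":
        return CDA_DESCRIPTOR
    return PLAINTEXT_DESCRIPTOR


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a Segment annotation: a character span."""
    begin: int
    end: int


def whole_document_segment(text: str) -> Segment:
    """Mimic the idea behind SimpleSegmentAnnotator: one span over all the text."""
    return Segment(begin=0, end=len(text))
```

A CDA document would instead yield one Segment per section, so a plain text document is effectively treated as a single section that spans offsets 0 through the length of the text.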
...