Comment: Migration of unmigrated content due to installation of a new plugin

...

...

Overview of Smoking status

...

SimulatedProdSmokingTAE_CDA.xml is also provided for processing CDA documents. Its aggregate flow uses the ExternalBaseAggregateTAE_CDA.xml variant of the annotator, which processes the document as a Clinical Document Architecture (CDA) file.

...

  • TokenizerAnnotator (core project)
  • KuRuleBasedClassifierAnnotator

...

...

This annotator is not part of the aggregate flow; it is introduced through the resource settings of the ClassifiableEntriesAnnotator (see the initialize() method of that class). UIMAFramework.produceAnalysisEngine(taeSpecifierStep1, ResMgr, null) instantiates the AE, and CasCreationUtils.createCas(taeStep1.getAnalysisEngineMetaData()).getJCas() creates the CAS it works on.

...

ProductionPostSentenceAggregate_step2_libsvm.xml

...

  • PcsClassifierAnnotator_libsvm,
  • ArtificialSentenceAnnotator,
  • SentenceAdjuster,
  • SmokingStatusDictionaryLookupAnnotator,
  • NegationAnnotator.

...

...

This annotator is not part of the aggregate flow; it is introduced through the resource settings of the ClassifiableEntriesAnnotator (see the initialize() method of that class). UIMAFramework.produceAnalysisEngine(taeSpecifierStep2, ResMgr, null) instantiates the AE, and the process method of the ClassifiableEntriesAnnotator runs it only if the smoking status is known.
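The gating described above can be sketched in plain Java. This is a minimal illustration, not the cTAKES implementation: the class TwoStepSketch, the Step1Result labels, and the two function parameters are hypothetical stand-ins for the step 1 and step 2 analysis engines.

```java
import java.util.function.Function;

/** Hypothetical sketch of a gated two-step classification; not the cTAKES API. */
final class TwoStepSketch {
    // Illustrative labels only; the real pipeline's label set may differ.
    enum Step1Result { KNOWN, UNKNOWN }

    static String classify(String sentence,
                           Function<String, Step1Result> step1,
                           Function<String, String> step2) {
        // Step 2 is consulted only when step 1 reports a known status.
        if (step1.apply(sentence) == Step1Result.UNKNOWN) {
            return "UNKNOWN";
        }
        return step2.apply(sentence);
    }
}
```

In this sketch, a sentence that step 1 rejects never reaches the second classifier, which mirrors the conditional processing described for the ClassifiableEntriesAnnotator.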

...

ExternalBaseAggregateTAE.xml

...

  • SimpleSegmentAnnotator,
  • TokenizerAnnotator (core project),
  • SentDetectorAnnotator (core project),
  • LvgAnnotation (LVG project).

...

...

ExternalBaseAggregateTAE_CDA.xml is also provided for processing CDA documents. Its aggregate flow contains the specialized class CdaCasInitializer (replacing the SimpleSegmentAnnotator used by the flat-file, non-CDA version), which processes the document as a Clinical Document Architecture (CDA) file. This annotator is contained in the SimulatedProdSmokingTAE_CDA aggregate. Red text indicates components shipped with this annotator.

...

SentenceAdjuster.xml

The file desc/analysis_engine/SentenceAdjuster.xml configures the Java class edu.mayo.bmi.smoking.ae.SentenceAdjuster, an annotator that applies patterns, and rules about those patterns, to adjust certain annotations. It was extended to handle sentence boundaries for smoking status classification.

...

UimaDescriptorStep2
(Default Value = '$main_root/desc/analysis_engine/ProductionPostSentenceAggregate_step2_libsvm.xml')
Annotator responsible for second classification step.

...

...

UimaDescriptorStep1 and UimaDescriptorStep2 are introduced as resources by the ClassifiableEntriesAnnotator during initialization. This allows the specified aggregates to be instantiated, and their analysis to run, on a separate asynchronous thread. Overall performance improves because the output of the ProductionPostSentenceAggregates is prepared for the process method without requiring a synchronized data flow (i.e. an explicit aggregate flow defined in the component descriptor).
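The idea of preparing a nested pipeline once and running it off the caller's thread can be sketched with the standard java.util.concurrent classes. This is only an illustration under stated assumptions: AsyncStepSketch and its nestedPipeline function are hypothetical, and the sketch does not show how the ClassifiableEntriesAnnotator actually schedules its work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Hypothetical sketch: a nested pipeline set up once and run on a worker thread. */
final class AsyncStepSketch implements AutoCloseable {
    // Created once, analogous to instantiating the aggregate during initialization.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Function<String, String> nestedPipeline;

    AsyncStepSketch(Function<String, String> nestedPipeline) {
        this.nestedPipeline = nestedPipeline;
    }

    /** Runs the nested pipeline on the worker thread and returns a handle to its result. */
    CompletableFuture<String> submit(String text) {
        return CompletableFuture.supplyAsync(() -> nestedPipeline.apply(text), worker);
    }

    @Override
    public void close() {
        worker.shutdown();
    }
}
```

A caller would invoke submit(...) and call join() on the returned future only at the point where the result is needed, so the nested analysis does not have to be a step in the outer aggregate flow.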

...

KuRuleBasedClassifierAnnotator.xml

...

Resources
BoundaryData
(Default Value = 'file:ss/data/context/boundaryData.txt')
Resource file that provides terms used as sentence boundaries, e.g. '"nevertheless" "how" ";" "."'.
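As a rough illustration of how such boundary terms might limit a context window, the following sketch collects the tokens to the left of a focus token and stops at the nearest boundary term. The class BoundarySketch and the method leftContext are hypothetical, the terms are passed in directly rather than read from boundaryData.txt (whose exact file format is not shown here), and the lowercase comparison is an assumption.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of boundary-limited context; not the cTAKES implementation. */
final class BoundarySketch {
    /**
     * Returns the tokens immediately left of focusIndex, stopping at the
     * nearest boundary term (the boundary term itself is excluded).
     */
    static List<String> leftContext(List<String> tokens, int focusIndex, Set<String> boundaries) {
        int start = focusIndex;
        // Walk left until the start of the list or a boundary term is reached.
        while (start > 0
                && !boundaries.contains(tokens.get(start - 1).toLowerCase(Locale.ROOT))) {
            start--;
        }
        return tokens.subList(start, focusIndex);
    }
}
```

With the boundary terms from the example above, the context of "cigars" in the token sequence "he quit ; never smoked cigars" would be limited to "never smoked", because the scan stops at ";".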

...

...

The parameters behave the same way as those of the core version of 'NegationAnnotator', but because the boundary stop words differ for the smoking status pipeline, a separate implementation was necessary. However, the current release of 'NegationAnnotator' does not use this resource.

...

CAS consumers - RecordResolutionCasConsumer.xml

...