...
Step | Example |
---|---|
1. If you do not have a UMLS username and password, you may request one at UMLS Terminology Services. | No example |
2. Edit the following files. Find the line in each script that runs java and add the ctakes.umlsuser and ctakes.umlspw parameters to the java command with your credentials. Make sure you substitute your actual ID and password if you cut and paste the example.
Linux:
|
For example, if your username and password were literally myusername and mypassword, you could insert them before the -cp option so the start of the java command would look like this:
|
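The edit described in step 2 can be sketched as a small shell helper. This is a minimal illustration, not the contents of the cTAKES scripts: it assumes the two parameters are passed as Java `-D` system properties, and the function name `add_umls_credentials`, the class path, and the main class are hypothetical placeholders.

```shell
# Hypothetical helper: insert the UMLS credentials before the -cp option.
# Assumes ctakes.umlsuser and ctakes.umlspw are given as -D system properties.
# $1 = original java command line, $2 = UMLS username, $3 = UMLS password
# Caveat: a username or password containing '|', '&', or '\' would need
# escaping before being used in the sed replacement.
add_umls_credentials() {
  printf '%s\n' "$1" |
    sed "s| -cp | -Dctakes.umlsuser=$2 -Dctakes.umlspw=$3 -cp |"
}

# Placeholder command line; the real scripts use their own class path and class.
add_umls_credentials 'java -cp lib/example.jar ExampleMain' myusername mypassword
# prints: java -Dctakes.umlsuser=myusername -Dctakes.umlspw=mypassword -cp lib/example.jar ExampleMain
```

In practice you would make the same change by hand in each script, replacing `myusername` and `mypassword` with your own credentials.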
Process documents using cTAKES
...
Step | Example |
---|---|
1. Open a command prompt and change to the cTAKES_HOME directory.
|
Linux:
|
2. Start the CAS Visual Debugger by running this command: | Windows:
Linux:
|
3. Copy the example text from the next cell in this table and paste the contents into the Text section of CVD, replacing the text that is already there. |
|
4. An analysis engine (AE) needs to be loaded in order to process text.
Use the Run -> Load AE menu bar command. Navigate to the file
Click Open.
|
5. From the menu bar, click Run -> Run AggregatePlaintextFastUMLSProcessor.
Note: To test some simple annotators and confirm that cTAKES works without UMLS, you can instead load: /desc/ctakes-core/desc/analysis_engine/SentencesAndTokensAggregate.xml |
6. You'll get a list of all the annotations for this clinical document in the Analysis Results frame. Annotations produced by the pipeline, such as named entities and sentence divisions, are viewable. To see one, in the Analysis Results frame, click on the key in front of:
This will show an AnnotationIndex in the lower frame. Select any annotation in that lower frame and you will see the text discovered in
Now select items in the lower frame to see the text being annotated. |
...
Annotator | Description | Example Aggregate Analysis Engine (AE) | Example Collection Processing Engine (CPE) |
---|---|---|---|
Clinical Document Pipeline | The complete cTAKES pipeline to obtain the majority of cTAKES annotations | <cTAKES_HOME>/desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml | <cTAKES_HOME>/desc/ctakes-clinical-pipeline/desc/collection_processing_engine/test1.xml |
Chunker | Obtain cTAKES chunk annotations | NA | NA |
Dependency Parser | Obtain dependency parsing tree | <cTAKES_HOME>/desc/ctakes-dependency-parser/desc/analysis_engine/ClearParserSRLTokenizedInfPosAggregate.xml | <cTAKES_HOME>/desc/ctakes-dependency-parser/desc/collection_processing_engine/ClearParserTestCPE.xml |
Drug NER | The annotator to obtain drug annotations | <cTAKES_HOME>/desc/ctakes-drug-ner/desc/analysis_engine/DrugAggregatePlaintextUMLSProcesor.xml | <cTAKES_HOME>/desc/ctakes-drug-ner/desc/collection_processing_engine/DrugNER_PlainText_CPE.xml |
Dictionary Lookup | Mapping cTAKES annotations to dictionaries (e.g., SNOMED_CT or RxNorm) | <cTAKES_HOME>/desc/ctakes-dictionary-lookup/desc/analysis_engine/TestAggregateTAE.xml | NA |
PAD Term Spotter | Identifying terms related to PAD | <cTAKES_HOME>/desc/ctakes-pad-term-spotter/desc/analysis_engine/Radiology_TermSpotterAnnotatorTAE.xml | <cTAKES_HOME>/desc/ctakes-pad-term-spotter/desc/collection_processing_engine/Radiology_Sample.xml |
Relation Extractor | Annotate certain relations between certain Event, Entity, and Modifier annotations | <cTAKES_HOME>/desc/ctakes-relation-extractor/desc/analysis_engine/RelationExtractorAggregate.xml | NA |
Smoking Status | The annotator to obtain document or patient-level smoking status | <cTAKES_HOME>/desc/ctakes-smoking-status/desc/analysis_engine/SimulatedProdSmokingTAE.xml | <cTAKES_HOME>/desc/ctakes-smoking-status/desc/collection_processing_engine/Sample_SmokingStatus_output_flatfile.xml |
Side Effect | The annotator to find side effect mentions and sentences from clinical documents | <cTAKES_HOME>/desc/ctakes-side-effect/desc/analysis_engine/SideEffectAggregateTAE_UMLS.xml | <cTAKES_HOME>/desc/ctakes-side-effect/desc/collection_processing_engine/SideEffectCPE.xml |
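
Before loading one of the descriptors listed in the table, it can help to confirm that the file is actually present in your installation. The following is a minimal sketch under stated assumptions: the function name `check_descriptors` is hypothetical and not part of cTAKES, and the paths you pass in are whichever table entries you care about, given relative to your cTAKES_HOME directory.

```shell
# Hypothetical helper: report which descriptor files exist under a cTAKES install.
# $1 is the cTAKES_HOME directory; the remaining arguments are paths relative to it.
# Returns 0 if every file was found, 1 if at least one is missing.
check_descriptors() {
  ctakes_home="$1"
  shift
  any_missing=0
  for rel in "$@"; do
    if [ -f "$ctakes_home/$rel" ]; then
      echo "found:   $rel"
    else
      echo "missing: $rel"
      any_missing=1
    fi
  done
  return "$any_missing"
}
```

For example, `check_descriptors "$CTAKES_HOME" desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml` prints one `found:` or `missing:` line per path, assuming `CTAKES_HOME` is set to your installation directory.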
...
Before you go on to process text in production, consider which dictionaries and models you will use. If you have not yet obtained the rights to the UMLS resources and models, do so first. Be aware that the models were trained on data that may not match your data well enough to be effective. In some cases you may want to modify the dictionaries and train models on your own data.