...
Once you have finished installing cTAKES and its separately-bundled resources, you will be able to see what cTAKES is capable of.
Prerequisites
Step
| Example | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1. Make sure you have Java 1.7 or higher.
| Windows:
|
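On Linux or Mac OS X, the Java requirement can be checked from a shell. The following is a minimal sketch and not part of cTAKES: the helper name java_major_version is hypothetical, and it assumes the usual version strings printed by java -version, where the legacy scheme reports Java 7 as 1.7.x and newer releases put the major number first.

```shell
# Extract the major Java version from a version string such as
# "1.7.0_80" (legacy scheme, major = 7) or "11.0.2" (major = 11).
# java_major_version is a hypothetical helper, not a cTAKES command.
java_major_version() {
  version=$1
  case $version in
    1.*) version=${version#1.} ;;   # legacy "1.x" scheme: drop the leading "1."
  esac
  echo "${version%%[!0-9]*}"        # keep only the leading digits
}

# java -version prints to stderr, so redirect it before parsing.
if command -v java >/dev/null 2>&1; then
  raw=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  major=$(java_major_version "$raw")
  if [ "${major:-0}" -ge 7 ]; then
    echo "Java $raw meets the Java 1.7 requirement"
  else
    echo "Java $raw is older than Java 1.7"
  fi
else
  echo "java was not found on the PATH"
fi
```

If the version string cannot be parsed, the sketch treats the major version as 0 and reports the installation as too old, which is the safer default.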
Install cTAKES
Step
| Example | |||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1. On the cTAKES downloads page, download the User Installation package.
|
| |||||||||||||||||
2. (Recommended) Verify the downloaded files against a signature to ensure you have the proper and complete file. From the following directory, download the signature file that corresponds to your download from step 1: https://www.apache.org/dist/ctakes/ctakes-4.0.0/ Please do not download any of the files that end with .zip or .gz directly from apache.org/dist. If you need to download cTAKES itself, use the downloads page listed in step 1 so that a mirror can be used. | No example | |||||||||||||||||
3. Unzip the file you downloaded into the directory that you want to be the cTAKES install location. The compressed files contain a single directory at the top level. We will call this folder <cTAKES_HOME>. You will need to refer to this directory later.
Linux:
| Windows:
| |||||||||||||||||
4. Download the cTAKES resources ZIP file with a matching version from the ctakesresources project (More information on cTAKES models). These resources are required to operate cTAKES.
| Windows:
| |||||||||||||||||
5. Copy (or move) the resources to cTAKES_HOME.
| Windows:
Linux:
Mac OSX:
|
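Step 5 can be scripted on Linux or Mac OS X. The sketch below is an illustration only: merge_resources is a hypothetical helper, the paths in the usage comment are placeholders, and it assumes that the unpacked resources archive contains a top-level resources directory whose contents belong in <cTAKES_HOME>/resources. Check the layout of your download before relying on it.

```shell
# Copy an unpacked resources tree into <cTAKES_HOME>/resources.
# Assumption: the resources ZIP unpacks to a directory that holds a
# top-level "resources" folder. merge_resources is a hypothetical helper.
merge_resources() {
  src=$1          # directory where the resources ZIP was unpacked
  ctakes_home=$2  # the <cTAKES_HOME> directory from step 3
  if [ ! -d "$src/resources" ]; then
    echo "no resources/ directory under $src" >&2
    return 1
  fi
  mkdir -p "$ctakes_home/resources"
  # "src/." copies the directory contents, so files that already exist in
  # the destination are kept unless a copied file has the same name.
  cp -R "$src/resources/." "$ctakes_home/resources/"
}

# Usage (both paths are placeholders, substitute your own):
#   merge_resources /tmp/ctakes-resources /usr/local/ctakes
```

Copying rather than moving keeps the unpacked archive available in case the merge has to be repeated.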
...
Note |
---|
In the initial setup, cTAKES will recognize only a few sample concepts in text. If you wish to perform named entity recognition or concept identification for anything other than these few words, you will need to 1) obtain the rights to use UMLS resources, 2) add those credentials to cTAKES, and 3) use a cTAKES pipeline that makes use of those UMLS resources. If you don't, cTAKES will work but won't recognize much. |
Step
| Example | |||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1. If you do not have a UMLS username and password, you may request one at UMLS Terminology Services. | No example | |||||||||||||||||||
2. Once you have your UMLS username and password, edit the following files. In each script, find the lines that run java and add the ctakes.umlsuser and ctakes.umlspw parameters to the java command with your credentials. If you cut and paste the example, make sure you substitute your actual ID and password.
Linux:
|
For example, if your username and password were literally myusername and mypassword, you could insert them before the -cp option so the start of the java command would look like this:
Windows: If your UMLS user name or password contains special characters, you can place them in double quotes:
Linux: If your UMLS user name or password contains special characters, you may need to escape them. |
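The placement of the credentials before the -cp option can be sketched in shell. This is an illustration under stated assumptions, not the contents of the shipped scripts: it assumes the two parameters are passed as JVM system properties with the standard -D option, add_umls_opts is a hypothetical helper, and SomeMainClass stands in for whatever class the real script runs.

```shell
# Insert the UMLS credentials before the first " -cp " in a java command line.
# Assumption: ctakes.umlsuser and ctakes.umlspw are given as -D system
# properties. add_umls_opts is a hypothetical helper.
add_umls_opts() {
  user=$1
  pw=$2
  line=$3
  before=${line%% -cp *}   # everything before the first " -cp "
  after=${line#* -cp }     # everything after the first " -cp "
  if [ "$before" = "$line" ]; then
    echo "$line"           # no -cp option found: leave the line unchanged
    return 1
  fi
  printf '%s -Dctakes.umlsuser=%s -Dctakes.umlspw=%s -cp %s\n' \
    "$before" "$user" "$pw" "$after"
}

# SomeMainClass is a placeholder for the class the real script runs.
add_umls_opts myusername mypassword 'java -cp "lib/*" SomeMainClass'
# prints: java -Dctakes.umlsuser=myusername -Dctakes.umlspw=mypassword -cp "lib/*" SomeMainClass
```

The helper does not quote or escape the values it inserts. On Linux, a password that contains characters the shell treats specially would still need to be escaped or single-quoted in the edited script.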
...
CAS Visual Debugger (CVD)
Step
| Example | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1. Open a command prompt and change to the cTAKES_HOME directory, which is the directory that contains subdirectories like bin, desc, resources, lib. Depending on how you extracted the files,
|
Linux:
| ||||||||||||
2. Start the CAS Visual Debugger and load the AggregatePlaintextFastUMLSProcessor pipeline by running this command:
The GUI opens and then loads the AggregatePlaintextFastUMLSProcessor pipeline. If it appears to be hung, look at the window where you entered the command and you will see what is happening. Once the analysis engine has successfully loaded you should see a tree in the Analysis Results frame:
| Windows:
Linux:
| ||||||||||||
4. Copy the example text from the next cell in this table and paste the contents into the Text section of CVD, replacing the text that is already there. |
| ||||||||||||
5. From the menu bar, click Run -> Run AggregatePlaintextFastUMLSProcessor.
Note: If you would like to test some simple annotators to ensure cTAKES is working without UMLS, you can instead load: /desc/ctakes-core/desc/analysis_engine/SentencesAndTokensAggregate.xml | |||||||||||||
6. You'll get a list of all the annotations for this clinical document in the Analysis Results frame. Annotations produced by the pipeline, such as named entities and sentence boundaries, are viewable. To see one, in the Analysis Results frame, click on the key in front of:
This will show an AnnotationIndex in the lower frame. Select any annotation in that lower frame and you will see the text discovered in
Now select items in the lower frame to see the text being annotated. |
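Step 1 relies on the command prompt being in the right directory. As a minimal sketch for Linux or Mac OS X, the check below looks for the four subdirectories named in step 1; is_ctakes_home is a hypothetical helper and not something shipped with cTAKES.

```shell
# Return success only if the given directory looks like cTAKES_HOME, that is,
# it contains the bin, desc, resources and lib subdirectories.
# is_ctakes_home is a hypothetical helper.
is_ctakes_home() {
  for sub in bin desc resources lib; do
    if [ ! -d "$1/$sub" ]; then
      echo "missing directory: $1/$sub" >&2
      return 1
    fi
  done
  return 0
}

# Usage from the directory you changed to in step 1:
#   is_ctakes_home "$PWD" && echo "this looks like cTAKES_HOME"
```

A missing resources directory usually means the separately downloaded resources have not been copied into place yet.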
Collection Processing Engine (CPE)
Step
| Example | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1. Open a command prompt and change to the cTAKES_HOME directory:
|
Linux:
| ||||||||||||
2. Create a directory for some test data. |
| ||||||||||||
3. Download this sample file and place it into the testdata directory. | No example | ||||||||||||
4. Start the collection processing engine by running this command: | Windows:
Linux:
| ||||||||||||
5. This will bring up the Collection Processing Engine Configurator. In the menu bar, click File -> Open CPE Descriptor. | |||||||||||||
6. Navigate to the following file, which uses the AggregateCdaProcessor
| No example | ||||||||||||
7. Change the Collection Reader input directory to testdata, which contains the CDA file(s). Within the CAS Consumers pane of the same window, change the output directory to testdata/output | |||||||||||||
8. Click the Play button (green/blue play arrow near the bottom).
| |||||||||||||
9. You should see that one document was processed. You have now processed a collection of documents. In this case the collection contained only one document, to show how it is done. Close the results window.
| |||||||||||||
10. Close the CPE application. You may be prompted to save changes. Since this was just a test you may click the No button. | No example |
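The directory setup in steps 2 and 7 and the result check in step 9 can be sketched for Linux or Mac OS X as follows. Both helper names are hypothetical, and the count assumes that the CAS consumer writes one file per processed document into testdata/output, which this guide does not state explicitly.

```shell
# Create the testdata directory and the output directory used in step 7.
# prepare_testdata and count_outputs are hypothetical helpers.
prepare_testdata() {
  mkdir -p "$1/testdata/output"
}

# Count the files under testdata/output.
# Assumption: one output file is written per processed document.
count_outputs() {
  find "$1/testdata/output" -type f | wc -l | tr -d ' '
}

# Usage from cTAKES_HOME:
#   prepare_testdata .     # before starting the CPE
#   count_outputs .        # after the run
```

With the single sample file from step 3, a count of 1 after the run would match the one processed document reported in step 9.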
...
The cTAKES GUI can be launched using the bin\runctakesGUI.bat file (Windows) or the bin/runctakesGUI.sh file (Linux).
Step 1: Open a command prompt and change to the cTAKES_HOME directory, which is the directory that contains subdirectories like bin, desc, resources, lib.
Step 2 for Windows: bin\runctakesGUI.bat
Step 2 for Linux: bin/runctakesGUI.sh
Step 3: Allow the GUI to scan for
Step 4: Select which elements to include in your pipeline
Step 5: Run the pipeline using the Run icon.
Step 6: Examine your output.
The analysis engines and collection processing engines shipped with cTAKES for some of the annotators are described in the following table.
...
Annotator | Description | Example Piper file | Example Collection Processing Engine (CPE) |
---|---|---|---|
Clinical Pipeline | The pipeline to obtain concepts and their attributes | <cTAKES_HOME>/TBD | <cTAKES_HOME>/desc/ctakes-clinical-pipeline/desc/collection_processing_engine/test1.xml |
Chunker | Obtains phrasal chunk annotations | <cTAKES_HOME>/TBD | NA |
Dependency Parser | Obtains dependency parsing tree | <cTAKES_HOME>/TBD | <cTAKES_HOME>/desc/ctakes-dependency-parser/desc/collection_processing_engine/ClearParserTestCPE.xml |
Drug NER | Finds mentions of medications and medication attributes such as dose, strength, frequency... | <cTAKES_HOME>/TBD | <cTAKES_HOME>/desc/ctakes-drug-ner/desc/collection_processing_engine/DrugNER_PlainText_CPE.xml |
Dictionary Lookup | Finds mentions of concepts from a dictionary (e.g., SNOMED CT or RxNorm) | <cTAKES_HOME>/TBD | NA |
Dictionary Lookup Fast | Finds mentions of concepts from a dictionary (e.g., SNOMED CT or RxNorm) | <cTAKES_HOME>/TBD | NA |
Relation Extractor | Finds certain relations (location of and degree of) between certain Event, Entity, and Modifier annotations | <cTAKES_HOME>/TBD | NA |
Smoking Status | Finds document or patient-level smoking status | <cTAKES_HOME>/TBD | <cTAKES_HOME>/desc/ctakes-smoking-status/desc/collection_processing_engine/Sample_SmokingStatus_output_flatfile.xml |
Side Effect | Finds side effect mentions and sentences from clinical documents | <cTAKES_HOME>/TBD | <cTAKES_HOME>/desc/ctakes-side-effect/desc/collection_processing_engine/SideEffectCPE.xml |
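To see which of the example CPE descriptors in the table are present in a given installation, a short shell loop is enough. This is a sketch for Linux or Mac OS X: list_cpe_descriptors is a hypothetical helper, and the relative paths are copied from the table above rather than verified against a particular cTAKES release.

```shell
# Report which example CPE descriptors from the table exist under cTAKES_HOME.
# list_cpe_descriptors is a hypothetical helper; paths come from the table.
list_cpe_descriptors() {
  ctakes_home=$1
  for rel in \
    desc/ctakes-clinical-pipeline/desc/collection_processing_engine/test1.xml \
    desc/ctakes-dependency-parser/desc/collection_processing_engine/ClearParserTestCPE.xml \
    desc/ctakes-drug-ner/desc/collection_processing_engine/DrugNER_PlainText_CPE.xml \
    desc/ctakes-smoking-status/desc/collection_processing_engine/Sample_SmokingStatus_output_flatfile.xml \
    desc/ctakes-side-effect/desc/collection_processing_engine/SideEffectCPE.xml
  do
    if [ -f "$ctakes_home/$rel" ]; then
      echo "found   $rel"
    else
      echo "missing $rel"
    fi
  done
}

# Usage from cTAKES_HOME:
#   list_cpe_descriptors .
```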
...
Next Steps
The cTAKES 4.0 Component Use Guide will help you to understand each of the cTAKES components that have been installed. In some cases you can learn how to improve the components.
...