...

Step

Example

1. Import the cTAKES projects using Maven.

File -> Import ... -> Maven -> Check out Maven Projects from SCM.
Click Next.


2. For SCM URL, select "svn" in the drop-down and enter

Code Block
https://svn.apache.org/repos/asf/incubator/ctakes/tags/ctakes-3.0.0-incubating

in the text field.
Click Finish.
Eclipse will download and build all of the cTAKES projects including running jcasgen as needed.


3. Download the cTAKES 3.0 dictionary and model resources.

Info


Due to licensing considerations and for ease of installation, a single download containing all the resources you will need has been set up at an external location. Licensing for these resources is found within the download.

Go to http://sourceforge.net/projects/ctakesresources/files/ and download the ZIP file with a matching version from the ctakesresources project.
The download is about 1 GB, so allow time accordingly.
Unzip the files into a temporary location.

Windows:

Code Block
C:\temp

Linux:

Code Block
/tmp
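
If you prefer to script the unzip step, the following is a minimal Java sketch of extracting the downloaded archive into a temporary directory. The class name `ResourceUnzipper` is hypothetical and not part of cTAKES; the archive and target paths are whatever you pass in. It refuses entries that would land outside the target directory and does not overwrite existing files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper for illustration; not part of cTAKES. */
final class ResourceUnzipper {

    private ResourceUnzipper() {}

    /** Extracts every entry of the archive below targetDir and returns the number of files written. */
    static int unzip(Path archive, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        Files.createDirectories(root);
        int files = 0;
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path destination = root.resolve(entry.getName()).normalize();
                // Reject entries such as "../x" that would escape the target directory.
                if (!destination.startsWith(root)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(destination);
                } else {
                    Files.createDirectories(destination.getParent());
                    // Reads up to the end of the current entry; fails if the file already exists.
                    Files.copy(zip, destination);
                    files++;
                }
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ResourceUnzipper <archive.zip> <target-directory>");
            return;
        }
        System.out.println("Extracted " + unzip(Paths.get(args[0]), Paths.get(args[1])) + " files");
    }
}
```

The standard unzip tool of your operating system does the same job; the sketch is only useful when the step needs to be automated from Java.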

4. Copy (merge; no two files should have the same name) the resources directory (and all sub-directories) to <cTAKES_HOME>/resources. The destination will not yet exist.

Windows:

Code Block
xcopy /s C:\temp\ctakes-resources-3.1.0\resources C:\Users\m075861\workspace\ctakes\resources
xcopy /s C:\temp\ctakes-resources-3.1.0\resources C:\apache-ctakes-3.0.0-incubating\resources

Linux:

Code Block
mkdir -p /usr/local/apache-ctakes-3.0.0-incubating/resources
cp -r /tmp/ctakes-resources-3.1.0/resources/org /usr/local/apache-ctakes-3.0.0-incubating/resources/
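
The "merge, no files should be named the same" rule in step 4 can be checked mechanically. The following is a minimal Java sketch, assuming a hypothetical helper class `ResourceMerger` that is not part of cTAKES: it copies a directory tree into a destination, creates missing directories, and stops with an error instead of overwriting a file that already exists.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration; not part of cTAKES. */
final class ResourceMerger {

    private ResourceMerger() {}

    /**
     * Copies every file under source into target and returns the number of files copied.
     * Throws FileAlreadyExistsException at the first file whose relative path already
     * exists in target; files copied before that point are left in place.
     */
    static int merge(Path source, Path target) throws IOException {
        List<Path> entries;
        // Files.walk lists a directory before its children, so parents are created first.
        try (Stream<Path> walk = Files.walk(source)) {
            entries = walk.collect(Collectors.toList());
        }
        int copied = 0;
        for (Path entry : entries) {
            Path destination = target.resolve(source.relativize(entry));
            if (Files.isDirectory(entry)) {
                Files.createDirectories(destination);
            } else {
                // Without REPLACE_EXISTING, Files.copy refuses to overwrite an existing file.
                Files.copy(entry, destination);
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ResourceMerger <source-resources-dir> <cTAKES_HOME/resources>");
            return;
        }
        System.out.println("Copied " + merge(Paths.get(args[0]), Paths.get(args[1])) + " files");
    }
}
```

A name collision surfaces as a `FileAlreadyExistsException`, which signals that the two resource trees overlap and should be inspected by hand.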

5. UMLS user ID and password.
The dictionaries are usually required to process data. If you plan to use the UMLS dictionaries, you must pass your UMLS user ID and password to the pipeline. There are several ways to do this:

Note

If you do not have a UMLS username and password, you may request one at UMLS Terminology Services.


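  1. Environment variable - Set or export the environment variables. Some shells, including bash, reject variable names that contain a dot; if yours does, use option 2 or 3 instead.

    No Format
    
    export ctakes.umlsuser=<username> ctakes.umlspw=<password>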
    
  2. Add the system properties to the Java arguments

    No Format
    
    -Dctakes.umlsuser=<username> -Dctakes.umlspw=<password>
    
  3. Change the UMLSUser and UMLSPW <nameValuePair> strings in these descriptor files to your UMLS username and password.
    • Dictionary Lookup: <cTAKES_HOME>/desc/ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml
    • (optional) Drug NER: <cTAKES_HOME>/desc/ctakes-drug-ner/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml
    The following shows where in the files you would make the changes. (Do not change the <configurationParameters> by the same name.)
    Code Block
          <nameValuePair>
            <name>ctakes.umlsuser</name>
            <value>
              <string>YOUR_UMLS_USERNAME_HERE</string>
            </value>
          </nameValuePair>
          <nameValuePair>
            <name>ctakes.umlspw</name>
            <value>
              <string>YOUR_UMLS_PASSWORD_HERE</string>
            </value>
          </nameValuePair>


    Now include the DictionaryLookupAnnotatorUMLS.xml Analysis Engine within your aggregate Analysis Engine or switch to the ones provided by cTAKES. cTAKES has provided duplicates of shipped Analysis Engine descriptors, put UMLS in the name, and placed DictionaryLookupAnnotatorUMLS.xml within them for these components:
    • Dictionary Lookup
    • Clinical Documents pipeline
    • Drug NER
    • Side Effect
      So you simply need to switch to using those descriptors. For example, if you were using AggregateCdaProcessor.xml in the Clinical Documents pipeline, you would switch to AggregateCdaUMLSProcessor.xml instead, which hooks into the complete dictionaries.

      You can, of course, modify your own aggregate Analysis Engine files and place the DictionaryLookupAnnotatorUMLS.xml Analysis Engine within them.
      Since this is an in-memory database implementation, please be patient during the initial load as it could take approximately 20-30 seconds for the database to initialize.
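
The first two options above both supply the credentials under the names ctakes.umlsuser and ctakes.umlspw. The following is a minimal Java sketch, not the cTAKES implementation, of how a program could resolve such a setting by checking the Java system property first and falling back to the environment variable. The class name `UmlsCredentialLookup` and the lookup order are assumptions made for illustration.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

/** Hypothetical helper for illustration; not part of cTAKES. */
final class UmlsCredentialLookup {

    static final String USER_KEY = "ctakes.umlsuser";
    static final String PASSWORD_KEY = "ctakes.umlspw";

    private UmlsCredentialLookup() {}

    /**
     * Returns the first non-blank value for the key, checking system properties
     * (-Dkey=value) and then environment variables. The order is an assumption.
     */
    static Optional<String> resolve(String key, Properties systemProperties, Map<String, String> environment) {
        String value = systemProperties.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            value = environment.get(key);
        }
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(value.trim());
    }

    public static void main(String[] args) {
        Optional<String> user = resolve(USER_KEY, System.getProperties(), System.getenv());
        Optional<String> password = resolve(PASSWORD_KEY, System.getProperties(), System.getenv());
        // Never print the password itself; only report whether a value was found.
        System.out.println(USER_KEY + " set: " + user.isPresent());
        System.out.println(PASSWORD_KEY + " set: " + password.isPresent());
    }
}
```

Started with `-Dctakes.umlsuser=<username> -Dctakes.umlspw=<password>` on the Java command line, the sketch should report both values as set, which is a quick way to confirm that the arguments reach the JVM before running a full pipeline.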

No example

Process documents using cTAKES

...