Step

Example

1. Do not try this until AFTER cTAKES 4.0 is released (this 4.0 document is still a DRAFT): On the cTAKES downloads page, download the User Installation package.

Info

The download time will be commensurate with ~650 MB of data.

 

2. (Recommended) Verify the downloaded files against a signature to ensure you have the proper and complete file.

From the following directory, download the signature file that corresponds to the file you downloaded in step 1:

https://www.apache.org/dist/ctakes/ctakes-4.0.0/ 

Please do not download any of the files that end with .zip or .gz directly from apache.org/dist - use the downloads page listed in step 1 if you need to download cTAKES itself so that a mirror can be used.

No example

3. Unzip the file you downloaded into the directory that you want to be the cTAKES install location. The compressed files contain a single directory at the top level. We will call this folder <cTAKES_HOME>. It is the directory that contains subdirectories such as bin, desc, resources, and lib.

You will need to refer to this directory later.

Windows:

Code Block
languagenone
C:\apache-ctakes-4.0.0

Linux:

Code Block
languagenone
/usr/local/apache-ctakes-4.0.0

Windows:

Linux:

Code Block
languagenone
tar -xvf apache-ctakes-4.0.0.bin.tar.gz -C /usr/local
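
Later steps refer to <cTAKES_HOME> repeatedly, so it can help to confirm that the directory you picked really is the top-level install folder. The following is a minimal sketch for Linux or Mac OSX, assuming a POSIX shell. The function name is_ctakes_home is hypothetical (it is not part of cTAKES), and the check covers only the four subdirectories named in step 3.

```shell
# Hypothetical helper (not shipped with cTAKES): succeeds only if the given
# directory contains the subdirectories this guide names for <cTAKES_HOME>.
is_ctakes_home() {
    dir=$1
    for sub in bin desc resources lib; do
        if [ ! -d "$dir/$sub" ]; then
            echo "missing: $dir/$sub" >&2
            return 1
        fi
    done
    return 0
}

# Example usage (path taken from the Linux example above):
# is_ctakes_home /usr/local/apache-ctakes-4.0.0 && echo "looks like cTAKES_HOME"
```

If the check fails, the most likely cause is that the archive was unpacked one level too high or too low.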

4. Download the cTAKES resources ZIP file with a matching version from the ctakesresources project (More information on cTAKES models). These resources are required to operate cTAKES.

Info

Due to licensing considerations, resources are hosted at an external location. For ease of installation, a single package was created with all the resources you will need. Licensing for these resources is found within the download.

Info

Download time will be commensurate with 1GB of data.


Unzip the cTAKES resources file into a temporary location.

Windows:


Linux:

Code Block
langnone
cd /tmp
wget http://sourceforge.net/projects/ctakesresources/files/ctakes-resources-4.0-bin.zip
sudo unzip ctakes-resources-4.0-bin.zip

5. Copy (or move) the resources to cTAKES_HOME.
Copy the contents of the temporary resources directory (and all sub-directories) to <cTAKES_HOME>/resources.

Info

There may be conflicts while taking this action. Overwrite the cTAKES_HOME files with those in the resources download.

Windows:

Code Block
langnone
xcopy /s C:\temp\ctakes-resources-4.0-bin\resources C:\apache-ctakes-4.0.0\resources

Linux:

Code Block
langnone
cp -R /tmp/resources/* /usr/local/apache-ctakes-4.0.0/resources

Mac OSX:

Code Block
langnone
ditto /tmp/resources/* /usr/local/apache-ctakes-4.0.0/resources
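
The commands above share one idea: merge the downloaded resources tree into <cTAKES_HOME>/resources and let the downloaded files win any conflict. A minimal sketch of that idea for Linux or Mac OSX follows, assuming a POSIX shell. The function name merge_resources is hypothetical, and the paths in the usage comment are the ones from the Linux example.

```shell
# Hypothetical helper: copy everything under SRC (all sub-directories
# included) into DST, overwriting files that already exist there.
merge_resources() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi
    mkdir -p "$dst" || return 1
    # "SRC/." copies the contents of SRC rather than the directory itself.
    cp -R "$src"/. "$dst"/
}

# Example usage (paths taken from the Linux example above):
# merge_resources /tmp/resources /usr/local/apache-ctakes-4.0.0/resources
```

Files that exist only in the destination are left in place; only files with the same relative path are replaced.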
6. If you created your own dictionaries for use with a previous release of cTAKES and you plan to use them with cTAKES 4.0, you must first convert them to be compatible with cTAKES 4.0, as described in the next section. The dictionaries installed by the above steps do not need to be converted.


Convert Dictionaries You've Previously Created to be Compatible with cTAKES 4.0


Info

cTAKES 4.0.0 uses HSQLDB 2.3.4 rather than HSQLDB 1.8. Dictionaries created with HSQLDB 1.8 need to be converted before they can be used by cTAKES 4.0.

Step

Example
  1. If you created your own HSQLDB dictionaries for use with a previous release of cTAKES and you plan to use those dictionaries with cTAKES 4.0, you must convert your dictionaries to be compatible with cTAKES 4.0. The dictionaries installed in the preceding section do not need to be converted.
No example
2.  If your dictionary's .properties file sets your dictionary's database to be read-only, you need to change that setting before you can convert the database.
  • Suggested: make a copy of your database directory for use with 4.0, so that filename.properties, filename.script, and any other files in that directory are duplicated, where filename depends on what you named your database
  • locate the filename.properties file for your database
  • remove these lines, if present:

      readonly=true
      files_readonly=true

  • save the filename.properties file
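
On Linux or Mac OSX, the bullets above could be scripted roughly as follows. This is a minimal sketch, assuming a POSIX shell with sed; the function name copy_db_writable and its argument order are hypothetical. It works on a copy of the database directory, as suggested, and leaves the original untouched.

```shell
# Hypothetical helper: duplicate a dictionary database directory and make the
# copy writable by deleting the two read-only settings from its .properties
# file. Arguments: SRC_DIR DST_DIR FILENAME (database name, no extension).
copy_db_writable() {
    src=$1
    dst=$2
    name=$3
    if [ ! -f "$src/$name.properties" ]; then
        echo "not found: $src/$name.properties" >&2
        return 1
    fi
    mkdir -p "$dst" || return 1
    cp -R "$src"/. "$dst"/ || return 1
    props=$dst/$name.properties
    # Drop readonly=true and files_readonly=true (tolerating stray spaces);
    # every other line is kept unchanged.
    sed -E '/^[[:space:]]*(files_)?readonly=true[[:space:]]*$/d' "$props" > "$props.tmp" \
        && mv "$props.tmp" "$props"
}

# Example usage (directory names are placeholders):
# copy_db_writable /path/to/old/customdict /path/to/new/customdict custom
```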
 

3.  Open the database with the 1.8 hsqldb jar:

Locate the 1.8 hsqldb jar that you used when you created the database (for example, <CTAKES_HOME>/lib/hsqldb-1.8.0.10.jar if you used the cTAKES 3.2.2 convenience binary).

If you need to, you can download it from Maven Central at:

http://central.maven.org/maven2/org/hsqldb/hsqldb/1.8.0.10/hsqldb-1.8.0.10.jar

 

Open the HSQLDB manager GUI for version 1.8. For example, if your 1.8 jar is in C:\Apps\hsqldb\, you would enter this command:

java -cp C:\Apps\hsqldb\hsqldb-1.8.0.10.jar org.hsqldb.util.DatabaseManager

Connect to your database by entering the appropriate URL and pressing the Ok button.

For example, if you are on Windows and your dictionary's .properties file is

C:\cTAKES_3\resources\org\apache\ctakes\dictionary\lookup\fast\customcustomdict\custom.properties

you could enter the following for the URL

jdbc:hsqldb:\cTAKES_3\resources\org\apache\ctakes\dictionary\lookup\fast\customcustomdict\custom
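
As the example suggests, the URL is the jdbc:hsqldb: prefix followed by the location of the .properties file with the extension removed. A minimal sketch for building that URL from a Linux or Mac OSX path follows. The function name hsqldb_url is hypothetical, and the explicit file: prefix is an assumption: it is the documented HSQLDB form for file databases, whereas the Windows example above omits it.

```shell
# Hypothetical helper: derive an HSQLDB file-database URL from the path of a
# dictionary's .properties file by stripping the extension.
hsqldb_url() {
    props=$1
    case $props in
        *.properties) ;;
        *) echo "expected a .properties file: $props" >&2; return 1 ;;
    esac
    # Assumption: use the explicit "file:" database type in the URL.
    printf 'jdbc:hsqldb:file:%s\n' "${props%.properties}"
}

# Example usage (placeholder path):
# hsqldb_url /path/to/customdict/custom.properties
```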

 

4.  Using HSQLDB 1.8, in the upper right pane, enter SET SCRIPTFORMAT TEXT and press the Execute button.

After the update count appears, enter SHUTDOWN COMPACT and press the Execute button.

After the update count appears, exit the Database Manager GUI.

5. Now do the same with the hsqldb 2.3.4 jar. Open the HSQLDB 2.3.4 database manager GUI:

java  -cp C:\apache-ctakes-4.0.0\lib\hsqldb-2.3.4.jar   org.hsqldb.util.DatabaseManager

Connect to your database, by entering the appropriate URL and pressing the Ok button.

In the upper right pane, enter SHUTDOWN COMPACT and press the Execute button.

Exit the Database Manager GUI.

 

6. Verify the filename.properties file for your database contains version=2.3.4

If it doesn't, make sure

   the  .properties  file does not have  readonly=true

   the  .properties  file does not have  files_readonly=true

   you used hsqldb-2.3.4.jar when instructed to
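
The verification in this step can be sketched as a small shell check for Linux or Mac OSX. The function name check_converted and its messages are hypothetical; the only facts it relies on are the version=2.3.4 line and the two read-only settings named above.

```shell
# Hypothetical helper: report whether a database .properties file shows the
# state this step expects after conversion.
check_converted() {
    props=$1
    if grep -Eq '^version=2\.3\.4[[:space:]]*$' "$props"; then
        echo "converted: $props"
        return 0
    fi
    # Not converted: point at the most likely cause listed in this step.
    if grep -Eq '^(files_)?readonly=true' "$props"; then
        echo "not converted; read-only setting present in $props" >&2
    else
        echo "not converted; was hsqldb-2.3.4.jar used for $props?" >&2
    fi
    return 1
}

# Example usage (placeholder path):
# check_converted /path/to/customdict/custom.properties
```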

 

7. Suggested: Set your dictionary's database to be read-only, by adding readonly=true to the filename.properties file.
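
A minimal sketch of this edit for Linux or Mac OSX follows, assuming a POSIX shell. The function name set_db_readonly is hypothetical. It appends the line only when it is not already present, so running it twice is harmless.

```shell
# Hypothetical helper: add readonly=true to a database .properties file
# unless the setting is already there.
set_db_readonly() {
    props=$1
    [ -f "$props" ] || { echo "not found: $props" >&2; return 1; }
    if grep -q '^readonly=true' "$props"; then
        return 0    # already read-only; nothing to add
    fi
    # Start on a fresh line even if the file lacks a trailing newline.
    if [ -s "$props" ] && [ -n "$(tail -c 1 "$props")" ]; then
        echo >> "$props"
    fi
    echo 'readonly=true' >> "$props"
}

# Example usage (placeholder path):
# set_db_readonly /path/to/customdict/custom.properties
```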

 
8. Repeat the above steps for each of the dictionaries that you created for use with a previous release of cTAKES.

(Recommended) Add UMLS access rights

...

Also, before you go on to process text in production, you will want to consider dictionaries and models. If you have not yet obtained the rights to the UMLS resources and models, you will want to do so. Be aware that the models within cTAKES have been trained on data that may not match your data well enough to be effective. In some cases you might want to modify the dictionaries and train models using your own data.