YTEX is in the current cTAKES trunk (as of 1-May-2014) and will be included in cTAKES 3.2. This document describes additional installation steps required to take advantage of the following YTEX features:

  • Semantic Similarity & Word Sense Disambiguation
  • Storing annotations in a relational database
  • Exporting annotations to machine learning tools


Database Prerequisites

YTEX supports MS SQL Server 2008 and above, MySQL version 5.x, and Oracle versions 10gR2 and above. Create a database user (and schema) for use with ytex. See platform specific notes below.


Oracle

Your database must use the UTF-8 charset.

Make sure you use a tablespace with enough room; e.g. create the ytex user and schema like this:

create tablespace TBS_YTEX datafile 'C:/oracle/oradata/orcl/TBS_YTEX.dbf' size 1000M autoextend on online;
create user ytex identified by ytex default tablespace TBS_YTEX;
alter user ytex quota unlimited on TBS_YTEX;
grant connect, resource to ytex;
grant create materialized view to ytex;
grant create view to ytex;

If you have installed the UMLS locally, you must also grant ytex select permissions on umls tables; e.g. assuming that umls tables are in the umls schema:

grant select on umls.MRCONSO to ytex;
grant select on umls.MRSTY to ytex;
grant select on umls.MRREL to ytex;


MySQL

To create the MySQL user and database, log in to MySQL as root and run the following commands (change as necessary):

CREATE DATABASE ytex CHARACTER SET utf8;
CREATE USER 'ytex'@'localhost' IDENTIFIED BY 'ytex';
GRANT ALL PRIVILEGES ON ytex.* TO 'ytex'@'localhost';
On mac you should use the instead of localhost. Note that if ytex connects to the mysql server from a different machine, you should replace localhost with the host name or ip address of the machine you will connect from, or use the wildcard ('%'):
CREATE USER 'ytex'@'%' IDENTIFIED BY 'ytex';
GRANT ALL PRIVILEGES ON ytex.* TO 'ytex'@'%';

If you have installed UMLS in your database, you must give the ytex user select permission on these tables:

GRANT SELECT on umls.* to 'ytex'@'%';

The document table uses the text and blob datatypes for the doc_text column that holds the document text. If you are processing large documents, you may need to use the longtext datatype instead. Furthermore, you may have to increase the maximum packet size.
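
As a rough sketch of both adjustments, the shell function below writes the corresponding MySQL statements to a file so they can be reviewed before an administrator runs them. The function name write_large_doc_sql, the output file name, and the 64 MB packet size are assumptions made for illustration; the document table and doc_text column are the ones described above. Because the setup script drops and recreates all YTEX database objects, the ALTER TABLE statement would need to be run after setup.

```shell
# Hypothetical helper: write MySQL statements for large documents to a file.
# Usage: write_large_doc_sql ytex_large_docs.sql
write_large_doc_sql() {
  out_file=$1
  cat > "$out_file" <<'SQL'
-- Assumption: the YTEX tables live in a database named ytex.
USE ytex;
-- Store the document text in a LONGTEXT column.
ALTER TABLE document MODIFY doc_text LONGTEXT;
-- Raise the server-side packet limit to 64 MB (illustrative value).
-- Requires administrative privileges; applies to new connections only.
SET GLOBAL max_allowed_packet = 67108864;
SQL
}
```

The generated file could then be run with something like `mysql -u root -p < ytex_large_docs.sql`. A SET GLOBAL change does not survive a server restart, so a permanent change would belong in the MySQL server configuration instead.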

SQL Server

You must have the permission to create database objects in the YTEX database and schema. If you don't have these permissions, ask your DBA to add you to the db_ddladmin & db_datawriter roles for the YTEX database.

If you want to install the UMLS in your SQL Server, you may want to use a different database/schema from the YTEX database. If that is the case, you need permissions on the UMLS database/schema as well.


0) Build cTAKES Trunk

All of ytex has been moved into ctakes and is currently in trunk. You must build a ctakes distribution (that includes ytex).

    • Open a command prompt
    • Ensure that maven, svn, and the JDK 1.7 (64-bit version) are in your PATH variable
    • cd to some directory where you want to check stuff out (I like c:\temp)
    • run the following commands
rmdir /s /q ctakes
svn co ctakes
cd ctakes
mvn clean install -DskipTests

And you will have the ctakes (with ytex) distro in ctakes\ctakes-distribution\target\
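
On Linux or macOS, one way to confirm the PATH requirement before building is a small shell check. The sketch below is illustrative only: the function name missing_tools is hypothetical, and it tests only that each command resolves on the PATH, not that the JDK is version 1.7 or 64-bit.

```shell
# Hypothetical helper: print each required tool that is not on the PATH.
# Returns 0 if every tool was found, 1 otherwise.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return $rc
}

# Example: missing_tools mvn svn javac
```

If the function prints nothing, all of the listed commands were found.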

1) Install ctakes+ytex distro 'as usual'

Go through the standard ctakes installation for the distribution you just created. For the rest of this document, we assume ctakes is installed in CTAKES_HOME.

2) Unzip YTEX Libraries

Download and unzip 'over' your installation. This contains libraries that are not Apache 2.0 license compliant:

  • Hibernate
  • Weka
  • MySQL JDBC Driver
  • MS SQL Server JDBC Driver

If you are using Oracle, download the Oracle JDBC driver ojdbc7_g and place it in your CTAKES_HOME\lib directory.

3) Unzip YTEX Resources (Optional - UTS login required)

Download and unzip 'over' your installation. This contains:

  • Concept Graphs derived from the UMLS2013AA used to compute semantic similarity measures
  • Dictionary Lookup table derived from UMLS2013AA for named entity recognition.

If you do not install these files, Word Sense Disambiguation will be disabled, and the default ytex dictionary lookup will be limited to a small sample subset of the UMLS.

You can always create concept graphs for WSD from your UMLS installation. If you have the UMLS in your DB, YTEX will create a dictionary lookup table from the UMLS during the installation.

4) Edit environment batch/shell script

Fix the path references to match your environment.

  • windows - no changes necessary; see CTAKES_HOME\bin\setenv.cmd
  • linux -
    • move CTAKES_HOME/bin/ctakes.profile to ${HOME}/ctakes.profile
    • edit the CTAKES_HOME environment variable
    • make executable - chmod u+x ${HOME}/ctakes.profile
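
The three linux steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the function name install_profile is hypothetical, and it assumes ctakes.profile sets the variable on a line beginning with CTAKES_HOME= or export CTAKES_HOME=. Check the file by hand if your copy is laid out differently.

```shell
# Hypothetical helper: install_profile <ctakes_home> [target_dir]
# target_dir defaults to $HOME.
install_profile() {
  ctakes_home=$1
  target_dir=${2:-$HOME}
  profile="$target_dir/ctakes.profile"

  # Move the profile out of the distribution.
  mv "$ctakes_home/bin/ctakes.profile" "$profile" || return 1

  # Assumption: the variable is set on a line that begins with
  # "CTAKES_HOME=" or "export CTAKES_HOME=". Paths containing "|" or "&"
  # would need escaping before being passed to sed.
  sed -e "s|^CTAKES_HOME=.*|CTAKES_HOME=$ctakes_home|" \
      -e "s|^export CTAKES_HOME=.*|export CTAKES_HOME=$ctakes_home|" \
      "$profile" > "$profile.tmp" || return 1
  mv "$profile.tmp" "$profile" || return 1

  # Make the profile executable for the current user.
  chmod u+x "$profile"
}
```

For example, `install_profile /opt/ctakes` (an illustrative path) would move the profile to ${HOME}/ctakes.profile and point CTAKES_HOME at /opt/ctakes.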

5) Create CTAKES_HOME\resources\org\apache\ctakes\ytex\

In this file, you specify the database connection parameters. Use CTAKES_HOME\resources\org\apache\ctakes\ytex\<db type>.example as a template. If you have UMLS installed on your database, specify the umls.schema and umls.catalog properties (see the properties file for an explanation of what these are).

6) Install the UMLS in your database (Optional)

We strongly suggest that you install UMLS in your database.

7) Execute the setup script

windows: Open a command prompt, navigate to CTAKES_HOME, and execute the setup script:

cd /d c:\java\apache-ctakes-3.1.2-SNAPSHOT\bin\ctakes-ytex\scripts
..\..\ant.bat -f build-setup.xml all > setup.out 2>&1

linux: From a shell, cd to the CTAKES_HOME directory, set the environment, make sure necessary scripts are executable, and execute the ant script:

chmod u+x ${HOME}/ctakes.profile
. ${HOME}/ctakes.profile
cd ${CTAKES_HOME}/bin
chmod u+x ant
chmod u+x *.sh
cd ctakes-ytex/scripts
nohup ../../ant -f build-setup.xml all > setup.out 2>&1 &
tail -f setup.out

Check setup.out to make sure the setup was successful.

This will call the ant script build-setup.xml, which does the following:

  • Generates configuration files from templates
  • Sets up YTEX Database Objects


The installation executes SQL scripts located in the CTAKES_HOME\bin\scripts\ctakes-ytex\data directory. All YTEX database objects will be dropped and recreated. If this is the initial installation, ignore the errors about objects not existing when they are being dropped. If you have installed the UMLS in your database and configured YTEX to use it, YTEX will create a dictionary lookup table with all concepts from the UMLS. The setup speed is dependent on the latency between the machine you are installing on and the database server. This can take several hours.
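
Because errors from dropping objects that do not yet exist are expected on a first installation, searching setup.out for the word "error" can be misleading. A simpler check is to look at the final status line that Ant prints. The sketch below assumes the log was captured in setup.out as shown above; the function name setup_succeeded is hypothetical.

```shell
# Hypothetical helper: report whether an Ant log ended in success.
# Usage: setup_succeeded setup.out
setup_succeeded() {
  log_file=$1
  # Ant prints "BUILD FAILED" when any target fails.
  if grep -q "BUILD FAILED" "$log_file"; then
    return 1
  fi
  # Ant prints "BUILD SUCCESSFUL" when all targets complete.
  grep -q "BUILD SUCCESSFUL" "$log_file"
}
```

For example, `setup_succeeded setup.out && echo "setup finished"` prints the message only when the log contains a successful build and no failure.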

