Warning
This page is under construction.

Dictionaries

The dictionaries and models used during annotation are the cornerstone of the quality of your results. The install instructions show you how to get the resources that you need to run cTAKES. Those resources include:

...

You may not need any dictionaries other than those provided in these resources. However, the models made available by cTAKES were trained on a specific set of text data (a corpus), which may not match your text well. If you want to build or train your own models, please read the cTAKES 3.0 Component Use Guide, particularly:

Training a sentence detector model
Warning
Locations for training need to be obtained
Training a Part of Speech (POS) tagger model (Building a model - Obtaining training data)
Creating a Part of Speech (POS) tag dictionary (Building a tag dictionary)
Training a chunker model (Building a model - Prepare GENIA training data)
Training a dependency parser (Dependency Parser)

Building Your Own Dictionaries

Warning
cTAKES developers need to see if those forum posts still apply to cTAKES 3.0

The UMLS dictionaries are unlikely to match your underlying data completely; for example, local terms may need to be added. To install customized dictionaries for RxNorm, SNOMED-CT, or other vocabularies that are available through the UMLS, see the following posts on the cTAKES forums:

https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&t=423
https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&t=80&start=20#p1459
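
Before building a custom dictionary, it can help to estimate how many of your local terms the existing dictionary already covers. The following is a minimal sketch of such a check, not part of cTAKES: the function name `find_uncovered_terms`, the normalization rule (lowercasing and collapsing whitespace), and the sample term lists are all assumptions made for illustration, and real dictionary matching in cTAKES may behave differently.

```python
import re


def find_uncovered_terms(local_terms, dictionary_terms):
    """Return the local terms that have no exact match in the dictionary.

    Matching is case-insensitive and ignores extra whitespace. This is an
    illustrative assumption, not a description of cTAKES lookup behavior.
    """
    def normalize(term):
        # Lowercase and collapse runs of whitespace so trivial variants compare equal.
        return re.sub(r"\s+", " ", term.strip().lower())

    known = {normalize(term) for term in dictionary_terms}
    # Preserve the original order and spelling of the uncovered local terms.
    return [term for term in local_terms if normalize(term) not in known]


# Hypothetical sample data, not taken from UMLS or from any cTAKES resource.
dictionary_terms = ["Myocardial Infarction", "aspirin", "hypertension"]
local_terms = ["myocardial  infarction", "ASA 81", "HTN", "Aspirin"]

uncovered = find_uncovered_terms(local_terms, dictionary_terms)
print(uncovered)
```

Terms reported as uncovered, such as local abbreviations, are candidates for a customized dictionary.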
