Instructions for training and running a simple part-of-speech (PoS) tagger. The instructions are written for Unix but can be adapted for Windows.

Unless otherwise specified, save downloads to $HOME/archives.

  1. Download and install Java.
  2. Download and install Maven.
  3. Download OpenNLP.
  4. Download a PoS Treebank training set into $HOME/archives/pos.
  5. Create development area:
    mkdir -p $HOME/dev/java/
  6. Change to development area:
    cd $HOME/dev/java
  7. Extract the OpenNLP source archive:
    tar zxf $HOME/archives/apache-opennlp-*-incubating-src.tar.gz
  8. Rename directory:
    mv apache-opennlp-*-incubating-src opennlp
  9. Build Java Archive (JAR) files:
    cd opennlp/opennlp
    mvn install
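
Steps 5 through 9 can also be run as a single script. The following is a minimal sketch, not part of the original instructions: the function name build_opennlp, the ARCHIVES and DEV variables, and the guard checks are assumptions added for illustration. The paths and commands themselves are the ones listed above. It assumes Java and Maven are already installed and that exactly one OpenNLP source archive was saved to $HOME/archives.

```shell
#!/bin/sh
# Sketch of steps 5-9 as one function. The variable names and guards are
# illustrative assumptions; the commands are the ones from the steps above.
set -eu

ARCHIVES="${ARCHIVES:-$HOME/archives}"   # where downloads were saved
DEV="${DEV:-$HOME/dev/java}"             # development area

# The body runs in a subshell so the caller's working directory is unchanged.
build_opennlp() (
    # Steps 5 and 6: create and enter the development area.
    mkdir -p "$DEV"
    cd "$DEV"

    # The wildcard must match exactly one downloaded source archive.
    set -- "$ARCHIVES"/apache-opennlp-*-incubating-src.tar.gz
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "expected exactly one OpenNLP source archive in $ARCHIVES" >&2
        exit 1
    fi

    # Refuse to overwrite an earlier checkout.
    if [ -e opennlp ]; then
        echo "$DEV/opennlp already exists" >&2
        exit 1
    fi

    # Step 7: extract. Step 8: rename.
    tar zxf "$1"
    mv apache-opennlp-*-incubating-src opennlp

    # Step 9: build the JAR files.
    cd opennlp/opennlp
    mvn install
)
```

To use it, save the script, source it, and call build_opennlp. If the archive is missing or $HOME/dev/java/opennlp already exists, the function stops with an error instead of continuing.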