Complete, concise instructions to build, train, and run a simple Natural Language Parsing program that tags parts of speech (PoS). The instructions are written for Unix, but can easily be adapted for Windows.
Save all downloads to $HOME/archives.
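The download directories can be created up front. This is a minimal sketch that assumes the layout described here, including the pos subdirectory that later holds the treebank:

```shell
# Create the download directory and the pos subdirectory for the treebank.
# -p creates missing parent directories and is a no-op if they already exist.
mkdir -p "$HOME/archives/pos"
```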
- Download and install Java.
- Download and install Maven.
- Download OpenNLP.
- Download a PoS Treebank into $HOME/archives/pos:
  - http://www.delph-in.net/erg/
  - http://opennlp.sourceforge.net/models-1.5/
  - http://opennlp.sourceforge.net/models/english/postag/
- Create the development area:
  mkdir -p $HOME/dev/java
- Change to the development area:
  cd $HOME/dev/java
- Extract the OpenNLP source files:
  tar zxf $HOME/archives/apache-opennlp-*-incubating-src.tar.gz
- Rename the extracted directory:
  mv apache-opennlp-*-incubating-src opennlp
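The last four steps (create, change, extract, rename) can be collected into one shell function. This is a minimal sketch rather than part of the original instructions: the function name unpack_opennlp and its two arguments are hypothetical, it expects absolute paths, and it assumes a single matching source archive has been downloaded.

```shell
# Hypothetical helper: unpack the OpenNLP sources found in archive
# directory $1 into development directory $2, then rename the extracted
# directory to "opennlp". Pass absolute paths, since the function changes
# the current directory (as the manual steps do).
unpack_opennlp() {
    archives=$1
    dev=$2

    # Create the development area and change into it.
    mkdir -p "$dev" || return 1
    cd "$dev" || return 1

    # Refuse to overwrite the result of an earlier run.
    if [ -e opennlp ]; then
        echo "$dev/opennlp already exists" >&2
        return 1
    fi

    # Extract the first archive matching the glob. If nothing matches,
    # the pattern stays literal and the -f test fails.
    for tarball in "$archives"/apache-opennlp-*-incubating-src.tar.gz; do
        if [ ! -f "$tarball" ]; then
            echo "no OpenNLP source archive in $archives" >&2
            return 1
        fi
        tar zxf "$tarball" || return 1
        break
    done

    # Rename the extracted directory to a version-independent name.
    for dir in apache-opennlp-*-incubating-src; do
        if [ -d "$dir" ]; then
            mv "$dir" opennlp
            return
        fi
    done

    echo "archive did not contain the expected directory" >&2
    return 1
}
```

With the layout used here, it would be called as: unpack_opennlp "$HOME/archives" "$HOME/dev/java"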