Introduction and Goal

Now that OpenNLP has a langdetect model available for download, it would be useful to distribute this model as a Maven dependency. Doing so would make the model easier to acquire and use, and would help promote OpenNLP. Work on this task is tracked in ASF JIRA issue OPENNLP-1164.

The langdetect model is built from the OpenNLP data repository in SVN at https://svn.apache.org/repos/bigdata/opennlp/trunk. Ideally, whatever process is chosen should automate, as far as possible, taking the models built from that corpus and releasing them as Maven artifacts. At the time of writing, langdetect is the only model available for download, but the process should also support other model types (sentence, token, namefinder, etc.) and the languages those models cover.
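To illustrate what consuming a model from a Maven dependency might look like, here is a minimal Java sketch that locates a model file on the classpath by model type and language. The class name ClasspathModels and the resource layout opennlp/models/<type>[-<lang>].bin are hypothetical and used only for illustration; the actual artifact layout is still to be decided by this proposal.

```java
import java.io.IOException;
import java.io.InputStream;

/** Sketch only: finds a model file that a Maven artifact has placed on the classpath. */
final class ClasspathModels {

    private ClasspathModels() {}

    /**
     * Builds the classpath location of a model.
     * ASSUMPTION: the layout "opennlp/models/<type>[-<lang>].bin" is hypothetical.
     */
    static String resourcePath(String type, String lang) {
        // A model such as langdetect spans many languages, so the language part is optional.
        String suffix = (lang == null || lang.isEmpty()) ? "" : "-" + lang;
        return "opennlp/models/" + type + suffix + ".bin";
    }

    /** Opens the model as a stream, or fails if no artifact on the classpath provides it. */
    static InputStream open(String type, String lang) throws IOException {
        String path = resourcePath(type, lang);
        InputStream in = ClasspathModels.class.getClassLoader().getResourceAsStream(path);
        if (in == null) {
            throw new IOException("Model not found on classpath: " + path);
        }
        return in;
    }
}
```

With the model artifact declared as a dependency, the stream returned by open("langdetect", null) could then be passed to OpenNLP's LanguageDetectorModel(InputStream) constructor, so users would not need to download the model file separately.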

Proposed Process

(TODO: Look at how models are built from the corpus repo and see how the built models can be included in artifacts to release.)