The langdetect model is built from the OpenNLP data repository in SVN at https://svn.apache.org/repos/bigdata/opennlp/trunk. Ideally, whatever process is chosen should automate, as far as possible, taking the models built from that corpus and releasing them as Maven artifacts. At the time of writing, the langdetect model is the only model available for download, but the chosen process should also support other model types (sentence, token, namefinder, etc.) and the languages those models cover.
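As a starting point for discussion, the requirement to cover several model types and languages could be reflected in a predictable coordinate scheme. The sketch below is only an illustration: the class name `ModelCoordinates`, the placeholder groupId `org.example.models`, and the `opennlp-model-<type>[-<lang>]` artifactId pattern are all assumptions made here, not coordinates that OpenNLP has decided on.

```java
import java.util.Locale;

/**
 * Hypothetical helper that derives Maven coordinates for a model artifact.
 * The groupId and the artifactId pattern are assumptions for illustration only.
 */
public final class ModelCoordinates {

    // Placeholder groupId; the real one would be chosen as part of the release process.
    public static final String GROUP_ID = "org.example.models";

    private ModelCoordinates() {
    }

    /**
     * Builds an artifactId from a model type (e.g. "langdetect", "sentence")
     * and an optional language code. Language-independent models such as
     * langdetect pass null or an empty string for the language.
     */
    public static String artifactId(String modelType, String language) {
        String type = modelType.trim().toLowerCase(Locale.ROOT);
        if (language == null || language.trim().isEmpty()) {
            return "opennlp-model-" + type;
        }
        return "opennlp-model-" + type + "-" + language.trim().toLowerCase(Locale.ROOT);
    }

    /** Joins groupId, artifactId and version into the usual colon-separated form. */
    public static String gav(String modelType, String language, String version) {
        return GROUP_ID + ":" + artifactId(modelType, language) + ":" + version;
    }

    public static void main(String[] args) {
        // The version string here is an arbitrary example value.
        System.out.println(gav("langdetect", null, "1.0.0"));
        System.out.println(gav("sentence", "en", "1.0.0"));
    }
}
```

Keeping the model type and language in the artifactId, rather than in a classifier, is one possible design choice; it would let each model be versioned and released independently, at the cost of a larger number of artifacts.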

Proposed Process

(TODO: Look at how models are built from the corpus repo and see how the built models can be included in artifacts to release.)