...
Component | Model | Perf 1.5.1 | Perf 1.5.2 | Tester | Passed | Comment |
---|---|---|---|---|---|---|
Sentence Detector | en-sent.bin | 42186.7 sent/s | | joern | no | It did not pass because of OPENNLP-202. |
Tokenizer | en-token.bin | 3091.8 sent/s | 2300.4 sent/s | joern | yes | |
Name Finder | en-ner-person.bin | 614.4 sent/s | 650.6 sent/s | joern | yes | Output identical; measurement was done on an idle system. |
POS Tagger | en-pos-maxent.bin | 732.1 sent/s | 816.9 sent/s | joern | yes | |
POS Tagger | en-pos-perceptron.bin | 1110.6 sent/s | | joern | no | Perceptron normalization was changed. |
Chunker | en-chunker.bin | 167.3 sent/s | 166.4 sent/s | joern | yes | |
Parser | en-parser-chunking.bin | 11.6 sent/s | | joern | no | Very few sentences are parsed differently due to OPENNLP-233. |
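To compare the two releases at a glance, the throughput figures can be turned into a relative change. The snippet below is only a sketch: the helper name `relative_change` is hypothetical and not part of OpenNLP, and the numbers are copied from the rows above where both releases were measured.

```python
def relative_change(old: float, new: float) -> float:
    """Return the change from old to new as a fraction of old."""
    if old == 0:
        raise ValueError("old throughput must be non-zero")
    return (new - old) / old


# Throughput in sentences per second: (Perf 1.5.1, Perf 1.5.2).
# Only rows with a measurement for both releases are listed.
measurements = {
    "Tokenizer": (3091.8, 2300.4),
    "Name Finder": (614.4, 650.6),
    "POS Tagger (maxent)": (732.1, 816.9),
    "Chunker": (167.3, 166.4),
}

for component, (perf_151, perf_152) in measurements.items():
    change = relative_change(perf_151, perf_152)
    # A negative value means 1.5.2 processed fewer sentences per second.
    print(f"{component}: {change:+.1%}")
```

Single measurements on one machine are noisy, so small differences in either direction should not be over-interpreted.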
...
Component | Data | Tester | Tagging Perf 1.5.1 | Tagging Perf 1.5.2 | Comment | |
---|---|---|---|---|---|---|
Sentence Detector | | | | | | |
Tokenizer | | | | | | |
Name Finder | CONLL 2002 Dutch Person ned.testa | jkosin | Precision: 0.7906976744186046 | Precision: 0.7570754716981132, 0.7552941176470588 | Performance Change due to OPENNLP-294 and more... | |
Name Finder | CONLL 2002 Dutch Person ned.testb | jkosin | Precision: 0.8527980535279805 | Precision: 0.8479899497487438, 0.8505025125628141 | | |
Name Finder | CONLL 2002 Dutch Organization ned.testa | jkosin | Precision: 0.8386075949367089 | Precision: 0.8561872909698997 | | |
Name Finder | CONLL 2002 Dutch Organization ned.testb | jkosin | Precision: 0.7784200385356455 | Precision: 0.783203125, 0.7830374753451677 | | |
Name Finder | CONLL 2002 Dutch Location ned.testa | jkosin | Precision: 0.8362831858407079 | Precision: 0.8427947598253275, 0.8458333333333333 | | |
Name Finder | CONLL 2002 Dutch Location ned.testb | jkosin | Precision: 0.854251012145749 | Precision: 0.8827160493827161, 0.8816326530612245 | | |
Name Finder | CONLL 2002 Dutch Misc ned.testa | jkosin | Precision: 0.8300492610837439 | Precision: 0.8354114713216958 | | |
Name Finder | CONLL 2002 Dutch Misc ned.testb | jkosin | Precision: 0.8373205741626795 | Precision: 0.8267716535433071, 0.8264984227129337 | | |
Name Finder | CONLL 2002 Dutch Combined ned.testa | jkosin | Precision: 0.7906976744186046 | Precision: 0.7570754716981132, 0.6509695290858726 | 1000 iterations | |
Name Finder | CONLL 2002 Dutch Combined ned.testb | jkosin | Precision: 0.8527980535279805 | Precision: 0.8479899497487438, 0.6869929337869668 | 1000 iterations | |
Name Finder | CONLL 2002 Spanish Person esp.testa | jkosin | Precision: 0.8982630272952854 | Precision: 0.9038718291054739, 0.9010695187165776 | | |
Name Finder | CONLL 2002 Spanish Person esp.testb | jkosin | Precision: 0.9008 | Precision: 0.9063545150501672, 0.9195205479452054 | | |
Name Finder | CONLL 2002 Spanish Organization esp.testa | jkosin | Precision: 0.8216258879242304 | Precision: 0.8292880258899676, 0.8288942695722357 | | |
Name Finder | CONLL 2002 Spanish Organization esp.testb | jkosin | Precision: 0.8009331259720062 | Precision: 0.8031496062992126, 0.8036277602523659 | | |
Name Finder | CONLL 2002 Spanish Location esp.testa | jkosin | Precision: 0.7481789802289281 | Precision: 0.7754189944134078, 0.7743016759776536 | | |
Name Finder | CONLL 2002 Spanish Location esp.testb | jkosin | Precision: 0.8226221079691517 | Precision: 0.8360433604336044, 0.8301886792452831 | | |
Name Finder | CONLL 2002 Spanish Misc esp.testa | jkosin | Precision: 0.6446886446886447 | Precision: 0.6308411214953271, 0.6492890995260664 | | |
Name Finder | CONLL 2002 Spanish Misc esp.testb | jkosin | Precision: 0.6595744680851063 | Precision: 0.6763005780346821, 0.686046511627907 | | |
Name Finder | CONLL 2002 Spanish Combined esp.testa | jkosin | Precision: 0.8982630272952854 | Precision: 0.9038718291054739, 0.7005423249233671 | 1000 iterations | |
Name Finder | CONLL 2002 Spanish Combined esp.testb | jkosin | Precision: 0.9008 | Precision: 0.9063545150501672, 0.756635931824532 | 1000 iterations | |
Name Finder | CONLL 2003 English Person eng.testa | jkosin | Precision: 0.9352201257861635 | Precision: 0.9523195876288659 | | |
Name Finder | CONLL 2003 English Person eng.testb | jkosin | Precision: 0.8873546511627907 | Precision: 0.9391727493917275 | | |
Name Finder | CONLL 2003 English Organization eng.testa | jkosin | Precision: 0.8528584817244611 | Precision: 0.8768046198267565 | | |
Name Finder | CONLL 2003 English Organization eng.testb | jkosin | Precision: 0.8263422818791947 | Precision: 0.8435980551053485 | | |
Name Finder | CONLL 2003 English Location eng.testa | jkosin | Precision: 0.9283837056504599 | Precision: 0.9361421988150099 | | |
Name Finder | CONLL 2003 English Location eng.testb | jkosin | Precision: 0.9156180606957809 | Precision: 0.9206349206349206 | | |
Name Finder | CONLL 2003 English Misc eng.testa | jkosin | Precision: 0.8539007092198582 | Precision: 0.9027982326951399 | | |
Name Finder | CONLL 2003 English Misc eng.testb | jkosin | Precision: 0.8599137931034483 | Precision: 0.8592436974789915 | | |
Name Finder | CONLL 2003 English Combined eng.testa | jkosin | Precision: 0.8601818493738206 | Precision: 0.861812521618817 | 1000 iterations | |
Name Finder | CONLL 2003 English Combined eng.testb | jkosin | Precision: 0.8036415565869333 | Precision: 0.8041311831853597 | 1000 iterations | |
Name Finder | CONLL 2003 German Person deu.testa | joern | Precision: 0.8602620087336245 | Precision: 0.9132653061224489 | | |
Name Finder | CONLL 2003 German Person deu.testb | joern | Precision: 0.878 | Precision: 0.8732106339468303 | | |
Name Finder | CONLL 2003 German Organization deu.testa | joern | Precision: 0.8365695792880259 | Precision: 0.8407224958949097 | | |
Name Finder | CONLL 2003 German Organization deu.testb | joern | Precision: 0.7942583732057417 | Precision: 0.8014705882352942 | | |
Name Finder | CONLL 2003 German Location deu.testa | joern | Precision: 0.7362637362637363 | Precision: 0.7816326530612245 | | |
Name Finder | CONLL 2003 German Location deu.testb | joern | Precision: 0.75 | Precision: 0.8033826638477801 | | |
Name Finder | CONLL 2003 German Misc deu.testa | joern | Precision: 0.7213930348258707 | Precision: 0.7055555555555556 | | |
Name Finder | CONLL 2003 German Misc deu.testb | joern | Precision: 0.6198830409356725 | Precision: 0.6601307189542484 | | |
Name Finder | CONLL 2003 German Combined deu.testa | joern | Precision: 0.7675205413243112 | Precision: 0.7718859429714857 | | |
Name Finder | CONLL 2003 German Combined deu.testb | joern | Precision: 0.7553418803418803 | Precision: 0.7467566165023353 | | |
POS Tagger | CONLL 2006 Danish | joern | Accuracy: 0.9511278195488722 | Accuracy: 0.9511278195488722 | | |
POS Tagger | CONLL 2006 Dutch | joern | Accuracy: 0.9324977618621307 | Accuracy: 0.9324977618621307 | | |
POS Tagger | CONLL 2006 Portuguese | joern | Accuracy: 0.9659110277825124 | Accuracy: 0.9659110277825124 | | |
POS Tagger | CONLL 2006 Swedish | joern | Accuracy: 0.9275106082036775 | Accuracy: 0.9275106082036775 | | |
Chunker | CONLL 2000 | colen | Precision: 0.9255923572240226 | Precision: 0.9257575757575758 | Perf change due to OPENNLP-242 | |
Chunker | Arvores Deitadas | colen | Precision: 0.9413606010016694 | Precision: 0.9403445830378374 | Perf change due to OPENNLP-242 and OPENNLP-186 |
The tagging performance results might differ from the 1.5.0 release, since a precision bug in the score calculation has been fixed:
https://issues.apache.org/jira/browse/OPENNLP-59
The tagging performance results may also differ from the 1.5.1 release, since a bug in the event filtering was corrected.
(TODO: put jira issue here) A problem that caused the CoNLL 02 data to be converted to the wrong encoding was also corrected.
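For readers unfamiliar with the metric, the precision values in the table presumably follow the usual definition: the fraction of predicted spans that also appear in the reference annotation. The following is a minimal sketch under that assumption. It is not OpenNLP's evaluator code, and the function name and example spans are hypothetical.

```python
from typing import Set, Tuple

# A span is (start token, end token, entity type); this layout is an assumption.
Span = Tuple[int, int, str]


def precision_recall_f1(reference: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from exact span matches."""
    true_positives = len(reference & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: three reference spans, two predicted spans, one exact match.
reference = {(0, 2, "person"), (5, 6, "location"), (8, 9, "organization")}
predicted = {(0, 2, "person"), (5, 6, "organization")}

p, r, f = precision_recall_f1(reference, predicted)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

A span with the right boundaries but the wrong type counts as an error here, which is why the second prediction does not contribute to the score.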
Test UIMA Integration
The test ensures that the Analysis Engines can run and do not
crash through simple runtime code errors. We need to add
more sophisticated testing in the next releases.
Analysis Engine | Tester | Passed | Comment |
---|---|---|---|
Sentence Detector | joern | yes | |
Sentence Detector Trainer | joern | yes | Trained with a UIMA pipeline |
Tokenizer ME | joern | yes | |
Tokenizer Trainer | joern | yes | Trained with a UIMA pipeline |
Name Finder | joern | yes | |
Name Finder Trainer | joern | yes | Trained with a UIMA pipeline |
Chunker | joern | yes | as part of sample pear |
Chunker Trainer | | | |
POS Tagger | joern | yes | as part of sample pear |
POS Tagger Trainer | | | Trained and tested with cmd line tool |
Parser | | | |
createPear.sh | joern | yes | |
Sample PEAR | joern | yes | installed and run over sample text |
...
Package | File or Test | Tester | Passed | Comment |
---|---|---|---|---|
Binary | LICENSE | joern | yes | AL 2.0 and BSD for JWNL |
Binary | NOTICE | joern | yes | standard notice, dates are correct. JWNL is mentioned |
Binary | README | colen, jason, james, joern | yes | File was reviewed on the dev list. |
Binary | RELEASE_NOTES.html | joern, james | yes | issue list is generated correctly |
Binary | Test signatures: .md5, .sha1, .asc | joern | yes | rc4 |
Binary | JIRA issue list created | joern | yes | |
Binary | Contains maxent, tools, uima and jwnl jars | joern | yes | |
Source | LICENSE | joern | yes | standard AL 2.0 file |
Source | NOTICE | joern | yes | standard notice, dates are correct |
Source | Test signatures: .md5, .sha1, .asc | joern | yes | rc4 |
Source | Can build from source? | joern | yes | Test should be done without jwnl and opennlp in local m2 repo. |
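
The `.md5` and `.sha1` checks listed in the table can be scripted. The sketch below is one possible way to do it, assuming each checksum file starts with the hexadecimal digest (optionally followed by a file name). The function names are hypothetical, and verifying the `.asc` signature is not covered because it requires a separate OpenPGP tool.

```python
import hashlib
from pathlib import Path


def file_digest(path: str, algorithm: str) -> str:
    """Return the hex digest of a file, reading it in blocks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_checksum_file(artifact: str, checksum_file: str, algorithm: str) -> bool:
    """Compare an artifact against the digest stored in a checksum file.

    Assumption: the first whitespace-separated token in the checksum
    file is the expected hex digest.
    """
    expected = Path(checksum_file).read_text().split()[0].lower()
    return file_digest(artifact, algorithm) == expected
```

For a release artifact and its checksum files, one would call `matches_checksum_file(artifact, artifact + ".md5", "md5")` and the same with `".sha1"` and `"sha1"`, then record the result in the table.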