Rescore Mass-Check
(see RescoreMassCheck310 or RescoreMasscheck320 for historical releases)
This is the procedure we use to generate new scores. It takes quite a while and is labour-intensive, so we do it infrequently.
...
Here's the process for generating the scores as of SpamAssassin 3.3.0:
1. heads-up
Inform everyone in advance on the users and dev lists that we will be starting mass-checks shortly, and they should get their corpora nice and clean (see CorpusCleaning) and sign up for RsyncAccounts.
...
    ssh spamassassin.zones.apache.org
    cd /home/corpus-rsync
    OLDVERSION="3.2"
    sudo mv corpus/submit scoregen-$OLDVERSION
    sudo mkdir corpus/submit
    sudo chown rsync corpus/submit
    sudo gtar cvfz ARCHIVE/scoregen-$OLDVERSION.tgz scoregen-$OLDVERSION
...
    svn export http://svn.apache.org/repos/asf/spamassassin/trunk mcsnapshot
    tar cvfz mcsnapshot.tgz mcsnapshot
    svn cp \
      https://svn.apache.org/repos/asf/spamassassin/trunk \
      https://svn.apache.org/repos/asf/spamassassin/tags/3_3_0_mcsnapshot_1
(we can't use the standard build process here anymore since the dist tarball no longer includes "masses". Use a descriptive, unique tag name.)
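Since the tag name has to be unique, it can help to build it from the version and a snapshot counter and check that it is free before copying. The following is only a sketch: `mcsnapshot_tag` and `tag_is_free` are hypothetical helper names, not part of the SpamAssassin tree, and the naming pattern is inferred from the tag used on this page.

```shell
# Hypothetical helper: turn a dotted version and a snapshot counter
# into a tag name of the form <version with underscores>_mcsnapshot_<n>.
mcsnapshot_tag() (
  version=$1
  serial=$2
  printf '%s_mcsnapshot_%s\n' "$(printf '%s' "$version" | tr '.' '_')" "$serial"
)

REPO=https://svn.apache.org/repos/asf/spamassassin

# Hypothetical helper: succeeds when "svn ls" cannot list the tag URL.
# Caveat: a network failure also makes "svn ls" fail, so treat a
# positive answer as a hint rather than a guarantee.
tag_is_free() {
  ! svn ls "$REPO/tags/$1" >/dev/null 2>&1
}

# Example (needs svn and network access, so it is left commented out):
#   tag=$(mcsnapshot_tag "$VERSION" 1)
#   tag_is_free "$tag" && \
#     svn cp -m "mass-check snapshot" "$REPO/trunk" "$REPO/tags/$tag"
```

If the tag is already taken, bump the counter rather than reusing the name, so that every mass-check contributor checks out exactly the same snapshot.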
...
RescoreDetails is the full announcement text (and instructions) for this phase. It's sufficient to send out a mail like the one we used in previous releases:
    To: users
    Cc: dev
    Subject: NOTICE: 3.3.0 rescoring mass-checks

    OK, if you're planning to send us mass-check logs for the 3.3.0
    rescoring, now's the time!

    http://wiki.apache.org/spamassassin/RescoreDetails has all the details.

    cheers!

    --j.
...
We may have to tweak the number of months specified for each type if the grep yields too much or too little mail, but 8 months / 6 months worked well for 3.3.0.
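To judge whether a given month window leaves a sensible amount of mail, a quick filter over a log can help before committing to the numbers. This is a rough sketch and not the procedure's actual grep: `keep_recent` is a hypothetical helper, it assumes each result line carries a `time=<epoch seconds>` field, and it approximates a month as 30 days.

```shell
# Hypothetical helper: read mass-check log lines on stdin and keep only
# those whose time= field is within the last N months.
#   $1 = number of months to keep
#   $2 = "now" in epoch seconds (optional, defaults to the current time)
# Comment lines (starting with #) are passed through unchanged; other
# lines without a time= field are dropped.
keep_recent() (
  months=$1
  now=${2:-$(date +%s)}
  cutoff=$(( now - months * 30 * 24 * 3600 ))
  awk -v cutoff="$cutoff" '
    /^#/ { print; next }
    match($0, /time=[0-9]+/) {
      t = substr($0, RSTART + 5, RLENGTH - 5) + 0
      if (t >= cutoff) print
    }
  '
)

# Example: count how many entries survive a 6-month window.
#   keep_recent 6 < spam.log | grep -vc '^#'
```

Running it with a few different month values against the ham and spam logs gives a feel for how the corpus size changes before the real window is chosen.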
4.2 tweak rules for evolver
...
    cd /path/to/checkout/of/trunk
    svn co \
      https://svn.apache.org/repos/asf/spamassassin/tags/3_3_0_mcsnapshot_1/rules \
      rules-mcsnapshot
    cp rules-mcsnapshot/active.list rules/active.list
    make
...
See RunningGa. (In the past we used RunningPerceptron, but it acted up during 3.3.0 score generation, so we used the GA again.)
...
    sudo mkdir /home/corpus-rsync/ARCHIVE/3.3.0
    sudo mv rescore-logs.tgz /home/corpus-rsync/ARCHIVE/3.3.0/rescore-logs-bug6155.tgz
6.5. mark evolved-score rules as 'always published'
...