Rescore Mass-Check
(see RescoreMassCheck310 or RescoreMassCheck320 for historical releases)
This is the procedure we use to generate new scores. It takes quite a while and is labour-intensive, so we do it infrequently.
...
No Format |
---|
masses/enable-all-evolved-rules < rules/50_scores.cf \
> rules/51_newscores.cf
mv rules/51_newscores.cf rules/50_scores.cf
svn diff [and ensure it looks sane]
svn commit [create a new bug attachment for review if in R-T-C mode]
|
Copy the nightly-log-submission rsync accounts to the rescore-log-submission accounts (see RsyncConfig) (not clear why we don't just use one set of accounts here, but hey):
No Format |
---|
ssh spamassassin.zones.apache.org sudo cp /home/corpus-rsync/secrets /home/corpus-rsync/secrets-submit |
Move the old rescore logs from the previous release (if they're still around) to the archives:
...
We then take the log files rsync'd up to the server and use those logs for all 4 score sets. The initial logs are for score set 3 (the fourth); sets 0, 1, and 2 can be generated from set 3 by stripping out the network tests and/or the Bayes tests.
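The relationship between the four score sets can be written down as a small lookup. This is only a sketch: the function name `strip_for_set` is hypothetical, the numbering follows the score-set description in the Mail::SpamAssassin::Conf documentation, and it prints which classes of tests would be stripped from the set-3 logs rather than running the actual masses scripts.

```shell
# Hypothetical helper (not part of the masses tools): which test classes are
# removed from the set-3 (Bayes + network) logs to obtain each score set.
strip_for_set() {
  case "$1" in
    0) echo "net bayes" ;;  # set 0: no network tests, no Bayes
    1) echo "bayes" ;;      # set 1: network tests, no Bayes
    2) echo "net" ;;        # set 2: Bayes, no network tests
    3) echo "none" ;;       # set 3: the logs as submitted
    *) echo "unknown score set: $1" >&2; return 1 ;;
  esac
}

for s in 0 1 2 3; do
  echo "set $s: strip $(strip_for_set "$s")"
done
```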
4.05. publish logs to ruleqa site
This makes the mass-check results visible on http://ruleqa.spamassassin.org/ (under the appropriate DateRev), using usernames starting with "rescore-". TODO: this doesn't include filtering out too-old logs (see below), so the results won't necessarily match the freqs produced later.
No Format |
---|
ssh spamassassin2.zones.apache.org
cd /export/home/corpus-rsync/corpus
echo '# mass-check results from someone@rescore, on Tue Sep 30 09:00:00 UTC 2009
# M:SA version 3.3.0-alpha3-r808953
# SVN revision: 808953
# Date: 20090930T090000Z
#' > /tmp/hdr
for f in submit/*.log ; do
  i=`echo $f | sed -e 's,^submit/,,' -e 's/^\(.*am\)-bayes-net-\([^\.]*\.log\)$/\1-rescore-\2/'`
  echo "$f => $i"
  sudo touch tmpf ; sudo chmod 666 tmpf
  cat < /tmp/hdr > tmpf
  sed -e '/^#/d' < $f >> tmpf
  sudo chmod 644 tmpf
  sudo mv tmpf $i ; sudo chown rsync $i
done
|
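To check the renaming before touching anything on the server, the two sed expressions from the loop above can be run on their own. This is a sketch: the wrapper name `rescore_name` and the file names are made up for illustration; only the sed expressions come from the procedure.

```shell
# Same two sed expressions as in the loop above, wrapped in a hypothetical
# helper so the mapping can be inspected without sudo or real log files.
rescore_name() {
  echo "$1" | sed -e 's,^submit/,,' \
      -e 's/^\(.*am\)-bayes-net-\([^\.]*\.log\)$/\1-rescore-\2/'
}

# Made-up example names; real submissions will use other usernames.
for f in submit/ham-bayes-net-someone.log submit/spam-bayes-net-someone.log; do
  echo "$f => $(rescore_name "$f")"
done
```

Names that do not contain "-bayes-net-" only lose the "submit/" prefix, so it is worth scanning the "=>" output for anything that was not renamed.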
4.1. filter out too-old logs
No Format |
---|
ssh spamassassin.zones.apache.org
cd /home/jm/ftp/spamassassin/masses [or wherever]
./log-grep-recent -m 3872 /home/corpus-rsync/corpus/submit/ham-*.log > ham-full.log
./log-grep-recent -m 62 /home/corpus-rsync/corpus/submit/spam-*.log > spam-full.log
|
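For readers who want to see what filtering by age amounts to, here is a rough stand-in. It is not log-grep-recent and makes no claim about how that script interprets -m. It assumes each result line carries a `time=<epoch seconds>` field, and the function name `filter_recent` and the explicit cutoff argument are inventions for this sketch.

```shell
# Hypothetical age filter; NOT the real log-grep-recent.
# Assumption: every result line contains "time=<epoch seconds>".
# Usage: filter_recent CUTOFF_EPOCH < in.log > out.log
filter_recent() {
  awk -v cutoff="$1" '
    /^#/ { print; next }                       # keep header comment lines
    match($0, /time=[0-9]+/) {
      t = substr($0, RSTART + 5, RLENGTH - 5)  # digits after "time="
      if (t + 0 >= cutoff + 0) print           # keep lines at or after cutoff
    }
  '
}
```

Lines without a `time=` field are dropped by this sketch, which may or may not match what the real script does.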
...