Nightly Mass-Check Runs
Nightly MassCheck runs are currently the primary vehicle for evaluating the quality of rules checked into SpamAssassin. Every night, contributors check out a specific revision of SpamAssassin from SVN and run MassCheck on their corpora. They then upload their MassCheck logs to an rsync server, where the RuleQaApp analyses the logs.
(There's also an older, clunkier version of the analysis scripts running on DanielQuinlan's server; see .)
The corpus-nightly script in the masses/ directory of the SpamAssassin tree can be used to set this up. It's probably not very well documented (WeLoveVolunteers), but it should work. You'll also need to ask for RsyncAccounts, and make sure you get a "nightly" account rather than a release-time account.
How? (in more detail)
Get ahold of $VERS-versions.txt, where $VERS is either "nightly" or "weekly". The "nightly" file is updated a little before 0900 UTC, Sunday through Friday. The "weekly" file is updated at the same time on Saturdays and is meant for a net-enabled run. In short, wait until at least 0900 UTC before starting a corpus run. These files are also available via the standard rsync system.
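The schedule above can be sketched as a small shell helper. This is only an illustration, not part of SpamAssassin: the function names `vers_for_weekday` and `update_published` are hypothetical, and the example assumes a `date` command that supports `-u`, `+%u` (ISO weekday, 1 = Monday through 7 = Sunday), and `+%H`.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of SpamAssassin.

# Map an ISO weekday number (1 = Monday ... 7 = Sunday) to the run type.
# Per the schedule above, Saturday (6) is the weekly run; other days are nightly.
vers_for_weekday() {
    if [ "$1" -eq 6 ]; then
        echo weekly
    else
        echo nightly
    fi
}

# Succeed if the given UTC hour (0-23) is at or after 0900.
update_published() {
    [ "$1" -ge 9 ]
}

# Example use with the current UTC time.
VERS=$(vers_for_weekday "$(date -u +%u)")
HOUR=$(date -u +%H)
HOUR=${HOUR#0}    # strip one leading zero so "09" is compared as 9
if update_published "$HOUR"; then
    echo "OK to start a $VERS run"
else
    echo "Too early; wait until 0900 UTC"
fi
```

Passing the weekday and hour in as arguments keeps the two helpers easy to check by hand, independent of the current clock.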
Each of these version files consists of lines of the form "date <tab> revision <LF>", where the date is in YYYY-MM-DD format and the revision is the value that comes out of SVN. New lines are added to the bottom of the file.
So, grab the file, find the right line (you can either grep for the date or just take the last line of the file), and use the second column to update your checkout to that revision. For example:
    REV=`tail -1 nightly.txt | awk '{print $2}'`
    cd /path/to/spamassassin-checkout
    svn update -r $REV
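As an alternative to taking the last line, the revision for a specific date can be looked up directly. The following is a minimal sketch, assuming the tab-separated "date, revision" format described above. The function name `rev_for_date` is hypothetical, and `nightly.txt` is simply the local file name used in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print the revision recorded for a given date.
# Usage: rev_for_date FILE YYYY-MM-DD
# Assumes each line is "date<TAB>revision". If a date appears more than
# once, the last matching line wins. Returns non-zero if the date is absent.
rev_for_date() {
    awk -F '\t' -v d="$2" '
        $1 == d { rev = $2; found = 1 }
        END { if (found) print rev; else exit 1 }
    ' "$1"
}

# Example use (commented out so the sketch has no side effects):
# if REV=$(rev_for_date nightly.txt "$(date -u +%Y-%m-%d)"); then
#     cd /path/to/spamassassin-checkout && svn update -r "$REV"
# else
#     echo "no entry for today yet" >&2
# fi
```

Matching on the exact date avoids picking up a stale revision if the file has not yet been updated for the day; in that case the lookup fails instead of silently returning yesterday's entry.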
Alternatively, if you don't have Subversion set up, and would prefer to pick it up via rsync:
    rsync -vrz --delete \
        rsync:// .
(Replace "nightly" with "weekly" for the weekly builds.)
Then use that build of SpamAssassin to perform a MassCheck, and when that completes, upload the results as per the instructions in .
(-- pasted by JustinMason from a mail from TheoVanDinter)