Comment: Moved Corpus-Nightly section to separate page

...

How? (Less Easy, The Corpus-Nightly Script)

The corpus-nightly script in the masses/rule-qa/ directory of the SpamAssassin tree can be used to set up a mass-checker on your mail. Here's a step-by-step account of the process.

First, you'll need to request an rsync account (see RsyncAccounts); make sure you get a "nightly" account rather than a release-time account. You also need to install Subversion, which provides the "svn" command.

Then run:

No Format

mkdir $HOME/nightlymc $HOME/nightlymc/tmp
cd $HOME/nightlymc
svn co http://svn.apache.org/repos/asf/spamassassin/trunk
cp trunk/masses/rule-qa/corpus.example ~/.corpus

Edit '~/.corpus' to have values something like this, replacing /home/jm with whatever your own $HOME is.

No Format

vi ~/.corpus
# temporary working directory for summary results
tmp=/home/jm/nightlymc/tmp

# subversion directory location
# [this is the directory you have already checked out!]
tree=/home/jm/nightlymc/trunk

# rsync username and password (see RsyncAccounts)
username=jm
password=xyzzy

# weekly and nightly mass-check options
opts_weekly="--restart=500 --tail=15000 --net -j 8 -f /home/jm/nightlymc/targets"
opts_nightly="--restart=500 --tail=15000 -f /home/jm/nightlymc/targets"

# weekly and nightly mass-check user_prefs files
prefs_weekly=/home/jm/nightlymc/user_prefs.weekly
prefs_nightly=/home/jm/nightlymc/user_prefs.nightly

Now create those two user_prefs files. Here are some suggested (basic) settings:

user_prefs.nightly:

No Format

use_bayes 0
use_auto_whitelist 0
internal_networks 127/8
trusted_networks 127/8

I suggest simply copying that file to user_prefs.weekly with "cp", but if you want different settings to control network rules, go ahead and edit the copy. It may also make sense to extend both files with your full trusted_networks data.
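The two prefs files can be created in one go. This is a minimal sketch that assumes the $HOME/nightlymc layout used in this walkthrough; the NIGHTLYMC_DIR variable is only an illustrative override and is not something the SpamAssassin scripts read.

```shell
# Directory from the walkthrough; NIGHTLYMC_DIR is an illustrative override only.
dir="${NIGHTLYMC_DIR:-$HOME/nightlymc}"
mkdir -p "$dir"

# Write the basic nightly prefs shown above.
cat > "$dir/user_prefs.nightly" <<'EOF'
use_bayes 0
use_auto_whitelist 0
internal_networks 127/8
trusted_networks 127/8
EOF

# Start the weekly prefs as an identical copy; edit it later if needed.
cp "$dir/user_prefs.nightly" "$dir/user_prefs.weekly"
```

The paths match the prefs_nightly and prefs_weekly values in the ~/.corpus example, provided $HOME is /home/jm there.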

Create ~/nightlymc/targets with one line per corpus, pointing at your own ham and spam directories:

No Format

ham:detect:/local/cor/recent/ham/*
spam:detect:/local/cor/recent/spam/*
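
Before the first run it can help to confirm that each targets line actually matches some mail. The helper below is a hypothetical sketch, not part of the SpamAssassin tree; it assumes every line has the class:format:location shape shown above and that the paths contain no spaces.

```shell
# Count the files each targets line would pick up (hypothetical helper).
# Assumes lines of the form class:format:location, as in the example above.
count_targets() {
    while IFS=: read -r class format location; do
        [ -n "$class" ] || continue      # skip blank lines
        n=0
        for f in $location; do           # unquoted on purpose: expand the glob
            [ -e "$f" ] && n=$((n + 1))
        done
        printf '%s %s %s\n' "$class" "$format" "$n"
    done < "$1"
}

# Usage: count_targets ~/nightlymc/targets
```

A count of 0 for either class usually means the glob in that line does not match anything.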

That's it. Now run

bash /home/jm/nightlymc/trunk/masses/rule-qa/corpus-nightly

and watch as it starts mass-checking. Once you're happy with it, set that command to run from cron.
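
For the cron step, a crontab entry might look like the line printed below. This is a sketch under assumptions: the helper name is made up for illustration, the minute (10) is an arbitrary choice, and the hour assumes your cron daemon interprets times as UTC. Adjust the hour if your system clock uses a local timezone.

```shell
# Print a crontab entry for the nightly run (hypothetical helper).
# $1 is your home directory. Assumes cron times are in UTC.
nightly_cron_line() {
    printf '10 9 * * * bash %s/nightlymc/trunk/masses/rule-qa/corpus-nightly\n' "$1"
}

nightly_cron_line "$HOME"
```

Paste the printed line into the editor opened by "crontab -e" to install it.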

Note: the best time to run a mass-check is as soon as possible after 0900 UTC. Daylight saving time in some local timezones can be troublesome, so the script adjusts for it: if it detects that it was started during the 0800 UTC hour, it sleeps for an hour before continuing. You no longer have to worry about that.
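
The start-time adjustment described in this note can be pictured as follows. This is only a sketch of the described behaviour, not the code in corpus-nightly itself, and the function name is hypothetical.

```shell
# Sketch of the described behaviour, not the script's actual code.
# Prints how many seconds to wait, given a two-digit UTC hour.
dst_delay_seconds() {
    if [ "$1" = "08" ]; then
        echo 3600    # started in the 0800 UTC hour: wait one hour
    else
        echo 0       # any other hour: start immediately
    fi
}

# Usage: sleep "$(dst_delay_seconds "$(date -u +%H)")"
```

Taking the hour as an argument, rather than calling date inside the function, keeps the logic easy to check for any start time.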

How? (For Hackers, The DIY Version)

...