Comment: [Original edit by JohnHardin] Improvements/clarifications

...

No Format
bayes_path /var/spamassassin/bayes_db/bayes
bayes_file_mode 0777

Note that the argument to bayes_path is a combination of a directory (/var/spamassassin/bayes_db/) and a filename prefix (bayes).

This tells the system that the Bayesian filter database files will be /var/spamassassin/bayes_db/bayes_msgcount, bayes_seen and bayes_toks. Feel free to move the database wherever you want. Please note that this directory needs to be RWX for all users that SpamAssassin will be executed as, or R-X if autolearning and automatic expiry are disabled; many use world RWX to simplify this, but this is insecure and not recommended. The directory also shouldn't contain any files other than your bayes database. If it contains any other files that start with "bayes" (or whatever other filename prefix you specified), it can break the database locking mechanisms SpamAssassin uses.

...

No Format
sa-learn --spam --showdots --dir /path/to/directory/full/of/spam/msgs
sa-learn --ham --showdots --dir /path/to/directory/full/of/ham/msgs

Do not simply use your inbox to train Bayes! The mailboxes of ham and spam messages used for training should be hand-verified, and should be kept after the initial training in case retraining is ever needed to correct problems with Bayes. It is safe to run sa-learn against the same mailbox multiple times, as a given message will only be learned once (unless its classification as ham or spam has changed).

See SiteWideBayesFeedback for more tips on getting an entire site to feed back spam and ham messages into the Bayesian filter.

...

Your method of restarting spamd may differ, but the above is typical. If you're using any MTA integration that invokes SpamAssassin through its Perl API (e.g. Amavis, MailScanner or mimedefang), that process will need to be restarted or told to reload its configuration, as it is effectively its own spamd.

Restarting spamd/Amavis/MailScanner/mimedefang is not needed after maintenance training or a background expiry, just when you enable or disable bayes.

You may experience difficulties with file permissions. Make sure you chmod any existing bayes files so they are readable and writable by your user groups (or by world, if that is the approach you chose).
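As a rough illustration of the chmod step above, the following sketch makes every existing database file that shares the bayes_path prefix readable and writable by owner and group. The function name fix_bayes_perms is hypothetical, and mode 660 assumes you are using group rights rather than world access; adjust the mode if your setup differs.

```shell
# Sketch only: fix_bayes_perms is a hypothetical helper, not part of SpamAssassin.
fix_bayes_perms() {
    prefix="$1"   # same value as bayes_path, e.g. /var/spamassassin/bayes_db/bayes
    for f in "$prefix"_*; do
        [ -f "$f" ] || continue   # nothing to do if the glob matched no files
        chmod 660 "$f"            # rw for owner and group, no access for others
    done
}
# Example: fix_bayes_perms /var/spamassassin/bayes_db/bayes
```

Run it as the owner of the files (or as root), passing the same prefix you gave to bayes_path.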

If you are going to use group rights instead of world RWX, there are some additional issues you will need to consider. If you use spamd and mail gets scanned on behalf of "root", spamd will use "nobody" as its effective user for bayes database access. You should consider this user when planning your group memberships. Also, be aware that the files are deleted and recreated by whatever user happens to be running spamassassin when an expiration is due. If you are not using world RWX, this means the files will lose any group ownership you may have set unless you make the directory setgid.
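One possible way to set up a setgid database directory is sketched below. The helper name setup_bayes_dir and the group name in the usage comment are assumptions made for illustration; the group must already exist and should include every user SpamAssassin runs as (including "nobody", if spamd scans mail on behalf of root).

```shell
# Sketch only: setup_bayes_dir is a hypothetical helper; the group name is site-specific.
setup_bayes_dir() {
    dir="$1"     # directory part of bayes_path, e.g. /var/spamassassin/bayes_db
    group="$2"   # group shared by all users that run SpamAssassin
    mkdir -p "$dir" || return 1
    chgrp "$group" "$dir" || return 1
    # 2770 = setgid bit plus rwx for owner and group, nothing for others.
    # With setgid set, files recreated during expiry inherit the directory's group.
    chmod 2770 "$dir"
}
# Example (as root, with a hypothetical group): setup_bayes_dir /var/spamassassin/bayes_db spamusers
```

The setgid bit only controls the group of newly created files; whether they are group-writable still depends on bayes_file_mode and the umask of the creating process.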

...