How do I use SpamAssassin with procmail to forward mail and to do mistake-based Bayes training?

This procmail script is designed for people who have their mail forwarded through a server (e.g., example.com) but then read their mail on a non-publicized account on a different server (e.g., privateaddress@example.net). This is quite common for folks who have a vanity domain name but then read their mail through an office Exchange server, home DSL email account, etc. The idea is for procmail on the first server to run each message through SpamAssassin, and then forward the message on to the private address.

The trick for Bayes training is to add some extra procmail rules to specify special processing for training messages. The following is based on having a catchall address for all mail sent to example.com, so I can trigger the Bayes training by sending mail to spam@example.com and ham@example.com. It is left as an exercise for the reader to create an alternative script that triggers based on a passphrase added to the subject, and uses formail to remove that passphrase before passing the message to sa-learn.

Note that this setup still only works passably with Outlook and Exchange, because even resending the message causes a new Message-ID header to be created and the old Received headers to be lost. Other headers are still carried over. To trigger Bayes learning from Outlook on false negatives, choose Action: Resend this Message (you have to remove any From and CC headings and change the To field to spam@example.com). Note that nearly every other mailer (except for AOL) supports real redirects; see http://www.stearns.org/doc/spamassassin-setup.current.html#redirect.

Note that after logging into the server, you can find the path for spamassassin by typing which spamassassin.
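As a minimal sketch of that lookup, the snippet below uses the POSIX "command -v" builtin, which reports the same path that "which" does. The helper name find_tool is hypothetical, and checking sa-learn as well is an assumption based on the recipes calling it without a full path.

```shell
# Print where a tool lives, or say that it is missing.
# find_tool is a hypothetical helper name, not part of SpamAssassin.
find_tool() {
    if tool_path=$(command -v "$1"); then
        printf '%s: %s\n' "$1" "$tool_path"
    else
        printf '%s: not found in PATH\n' "$1"
        return 1
    fi
}

# Check both programs that the procmail recipes call.
for tool in spamassassin sa-learn; do
    find_tool "$tool" || true
done
```

If the reported path differs from /usr/bin/spamassassin, edit the pipe line in the script to match.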

false negatives (i.e., spam that SpamAssassin didn't catch). This script uses mistake-based training for false negatives. That is, it assumes that SpamAssassin can correctly autolearn on enough ham and spam to seed the Bayes database. Then, when SpamAssassin incorrectly marks a spam message as not spam, the user can train the database by redirecting the message to be learned as spam. Although a similar redirection scheme could be used to train on false positives (i.e., legitimate mail incorrectly seen as spam), it's likely more effective to just ManualWhitelist mail from that legitimate sender.

The following is based on having at least two addresses (publicaddress@example.com and spam@example.com) trigger the same procmail script. In most vanity domain setups, all addresses are processed by the same procmail script. The script needs to be edited to include your real addresses and domain.

#Uncomment the following lines and use tail -f procmail.log to debug
#LOGFILE=$HOME/procmail.log
#VERBOSE=yes
#LOGABSTRACT=all

# Feed redirected spam to sa-learn, and also store a copy in a folder called spam.
# This folder of false negatives could be useful if we needed to rebuild our Bayes
# database in the future.

:0
* ^To:.*spam@example.com
* < 256000
{
   :0c: spamassassin.spamlock
   | sa-learn --spam

   :0: spamassassin.filelock
   mail/spam
}

# Send all other mail through SpamAssassin

:0fw: spamassassin.lock
* < 256000
| /usr/bin/spamassassin


# Mail that is very likely spam (>15) can be saved on the server
# (not forwarded), or by moving the # down one line, even dropped
# on the floor.  Note that dropping mail on the floor is a *bad*
# idea unless you really, really believe no false positives will
# have a score greater than 15.  If you want all mail forwarded,
# just add #'s in front of each of these lines:

:0: spamassassin.filelock2
* ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
#/dev/null
mail/almost-certainly-spam


# Forward all mail with a score less than 15 to my non-publicized address
:0
! privateaddress@example.net
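
The comments in the first recipe suggest that the saved folder of false negatives could be used to rebuild the Bayes database. A rough sketch of that step follows. The helper name retrain_from_folder is hypothetical, and the default path assumes that procmail's MAILDIR is your home directory, so that the recipe's mail/spam folder ends up at $HOME/mail/spam; adjust it if your setup differs.

```shell
# Re-feed the saved false negatives to sa-learn, for example after the
# Bayes database has been cleared or lost.
# retrain_from_folder is a hypothetical helper name.
# Assumption: the recipe's "mail/spam" folder is at $HOME/mail/spam.
retrain_from_folder() {
    folder=${1:-$HOME/mail/spam}
    if [ ! -f "$folder" ]; then
        echo "no such mbox file: $folder" >&2
        return 1
    fi
    # --mbox tells sa-learn that the file holds many messages in mbox
    # format, which is how procmail writes a plain-file folder.
    sa-learn --spam --mbox "$folder"
}

# Usage (run by hand on the server):
#   retrain_from_folder
#   retrain_from_folder "$HOME/mail/spam"
```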


The procmail script above is available as procmailrc.forward.txt. If you don't currently have a procmail file, you can import this one by entering:

wget http://wiki.apache.org/spamassassin-data/attachments/ProcmailToForwardMail/attachments/procmailrc.forward.txt
mv procmailrc.forward.txt .procmailrc
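
Because the mv command replaces any existing .procmailrc without asking, a slightly more defensive variant may be worth considering. This is only a sketch: install_procmailrc is a hypothetical helper, and the chmod is a precaution on the assumption that your procmail rejects rcfiles with loose permissions.

```shell
# Install a downloaded procmailrc only if no procmail file exists yet.
# install_procmailrc is a hypothetical helper, not part of procmail.
install_procmailrc() {
    src=$1
    dest=${2:-$HOME/.procmailrc}
    if [ -e "$dest" ]; then
        echo "$dest already exists; merge the recipes by hand" >&2
        return 1
    fi
    # Precaution (assumption): keep the rcfile private to its owner.
    mv "$src" "$dest" && chmod 600 "$dest"
}

# Usage, after the wget step:
#   install_procmailrc procmailrc.forward.txt
```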

On your mail client, you'd then likely want to filter mail with a score of 5 or higher (i.e., where the header contains "X-Spam-Level: *****") into a Likely Spam folder. False positives rarely score higher than 15. The advantage of leaving mail with a score of 15 or higher on the server is that it makes it easier to find false positives in the Likely Spam folder without being overwhelmed by hundreds of obvious spam messages. You can then ManualWhitelist those false positives.
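
The score bands described here can be illustrated with a small shell function that counts the asterisks in a message's X-Spam-Level header. It is a sketch for checking a saved message by hand, not a replacement for your mail client's filter; the function name and the bucket labels are hypothetical, while the thresholds 5 and 15 come from the text.

```shell
# Classify one message (read from stdin) by the number of asterisks in
# its X-Spam-Level header. spam_bucket and its labels are hypothetical.
spam_bucket() {
    stars=""
    while IFS= read -r line; do
        case $line in
            "") break ;;                      # blank line: end of headers
            "X-Spam-Level: "*)
                stars=${line#X-Spam-Level: }  # keep what follows the name
                stars=${stars%%[!*]*}         # drop anything after the stars
                break ;;
        esac
    done
    score=${#stars}                           # one asterisk per point
    if [ "$score" -ge 15 ]; then
        echo almost-certainly-spam            # stays on the server
    elif [ "$score" -ge 5 ]; then
        echo likely-spam                      # forwarded, filed by the client
    else
        echo inbox
    fi
}

# Example: a message with six asterisks is classified as likely-spam.
printf 'Subject: test\nX-Spam-Level: ******\n\nbody\n' | spam_bucket
```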

For the mistake-based training, it's critical to redirect (or bounce) the message, rather than forwarding. Forwarding loses all of the critical header information, which is much of what Bayes trains off of. Here are directions for redirecting from different clients:

  • AOL's integrated email client: Redirecting mail is not available. (Dave Goldsmith)
  • Eudora: Select the message, go to the "Message" menu, choose redirect, fill in the address, and choose send. (Brian Corcoran and Erik Wheeler)
  • Evolution: Select the message. In the "Actions" menu, choose the "Forward" submenu (not "Forward message", the "Forward" submenu). Pick "Redirect", fill in the "To" field, and press "Send". (Johannes Ullrich)
  • OS/X Mail.app: With the email message open or selected, go to Mail's 'Message' menu and select 'Bounce to sender' or 'Redirect'. If you use this frequently, go to the "View" menu, choose "Customize toolbar", and add a button for "Redirect". (Marion Bates)

  • Microsoft Outlook 97: Double-click on the message so it opens in a new window. Click on Tools->Resend This Message. A warning will appear about you not being the original sender of the message. Click Yes. A message window appears. Update the To: field and click on 'Send'. (Dave Goldsmith)
  • Microsoft Outlook 2000: Double-click on the message so it opens in a new window. Click on Actions->Resend This Message. A warning will appear about you not being the original sender of the message. Click Yes. A message window appears. Update the To: field and click on 'Send'. (Dave Goldsmith)
  • Microsoft Outlook Express: It does not appear to have a redirect option. (Dave Goldsmith and Alex Bates)
  • Netscape Communicator 4.x and 7.x: They don't appear to have a redirect option.
  • Pine: For a single message, highlight the message and press "b" to bounce it. Enter the target address and press enter. For multiple messages, select all the messages you'd like to bounce with either ":" to select them one at a time, or ";" to select multiple messages by message number, subject, body text, etc. Once selected, press "a", then "b" to Apply the Bounce command to all of them. Enter the target email address. Once done, press ";", then "a" to Unselect All selected messages. More info can be found at: http://www.itc.virginia.edu/desktop/email/pine/bounce.html
  • Sylpheed: Click on the message, go to the "Message" menu, choose "Redirect", fill in the "To:" address, and press send. Alternatively, right-click in the message and choose "Redirect" from the popup menu, fill in the "To:" address, and press send. (Dave Goldsmith)

...

from. See ResendingMailWithHeaders for details of how to do this.

Step-by-step instructions

Way more detail on how to do this is at SingleUserUnixInstall.

Other training options

An even easier form of mistake-based training is to use IMAP and create a LearnAsSpam folder, as described in the IMAP section of SingleUserUnixInstall.

Contributors

...