What does X-Spam-Status mean?

SpamAssassin adds an extra email header, for example:

X-Spam-Status: Yes, score=21.6 required=4.0 tests=BATMAIL,BAYES_99,
DATE_IN_FUTURE_06_12,FS_REPLICA,FS_REPLICAWATCH,RAZOR2_CF_RANGE_51_100,
RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,RCVD_IN_PBL,RCVD_IN_SORBS_DUL,
RDNS_DYNAMIC,URIBL_AB_SURBL,URIBL_BLACK,URIBL_JP_SURBL,URIBL_WS_SURBL
autolearn=spam version=3.2.1

This header is visible when the mail client is configured to "show full headers", or a similar option.

The fields are the following:

Whether the message is spam (yes/no)
The total score for the message (can be negative if whitelisted)
The score that would be required to be classed as spam
The comma-separated list of tests that returned a non-zero value
Whether autolearn learned the message as spam or ham
The version of SpamAssassin that was used
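
The fields above can be pulled out of the header programmatically. The following is a minimal sketch in Python, not part of SpamAssassin itself; the function name parse_spam_status and the dictionary keys are hypothetical, and it assumes the header has the layout shown in the example above.

```python
import re

EXAMPLE = """X-Spam-Status: Yes, score=21.6 required=4.0 tests=BATMAIL,BAYES_99,
DATE_IN_FUTURE_06_12,FS_REPLICA,FS_REPLICAWATCH,RAZOR2_CF_RANGE_51_100,
RAZOR2_CF_RANGE_E8_51_100,RAZOR2_CHECK,RCVD_IN_PBL,RCVD_IN_SORBS_DUL,
RDNS_DYNAMIC,URIBL_AB_SURBL,URIBL_BLACK,URIBL_JP_SURBL,URIBL_WS_SURBL
autolearn=spam version=3.2.1"""


def parse_spam_status(header: str) -> dict:
    """Split an X-Spam-Status header into its fields (hypothetical helper)."""
    text = header.strip()
    # Drop the header name if the caller passed the whole line.
    if text.lower().startswith("x-spam-status:"):
        text = text.split(":", 1)[1]
    # The first comma separates the yes/no verdict from the key=value fields.
    verdict, _, rest = text.partition(",")
    # The test list may be folded across lines; rejoin it at the commas.
    rest = re.sub(r",\s+", ",", rest)
    fields = dict(re.findall(r"(\w+)=(\S*)", rest))
    # Assumption: an empty test list is written as "none".
    tests = [t for t in fields.get("tests", "").split(",") if t and t != "none"]
    return {
        "is_spam": verdict.strip().lower() == "yes",
        "score": float(fields["score"]),
        "required": float(fields["required"]),
        "tests": tests,
        "autolearn": fields.get("autolearn"),
        "version": fields.get("version"),
    }


if __name__ == "__main__":
    info = parse_spam_status(EXAMPLE)
    print(info["is_spam"], info["score"], info["required"], len(info["tests"]))
```

The sketch returns the tests as a list so that each rule name can be looked up individually when working out why a message got its score.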

To find out why a message was classified as spam (got the score it did), check the tests listed.

You can find all of the currently active rules with descriptions and scores in the Subversion repository under /trunk/rules or by downloading the latest published set using the sa-update tool.

If a message is classed as spam, by default it is attached to a text message that lists the tests failed, the points assigned to each, and a short description of each.

How are the scores assigned?

The scores are assigned using a neural network trained with error back propagation (a perceptron). Both this system and the genetic algorithm (GA) used in version 2.x attempt to optimize the efficiency of the rules that are run by minimizing the number of false positives and false negatives.
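
To give a feel for how scores can be learned from labeled mail, here is a toy sketch in Python. It is not the SpamAssassin rescoring code: it uses the classic single-layer perceptron update rule rather than full back propagation, and the rule names, corpus, threshold, and learning rate are all hypothetical.

```python
# Toy illustration only: rule names, messages, and settings are hypothetical.
RULES = ["RULE_A", "RULE_B", "RULE_C"]
THRESHOLD = 5.0       # fixed spam threshold, like "required" in the header
LEARNING_RATE = 1.0

# Each entry: (set of rules the message hit, True if hand-labeled as spam)
CORPUS = [
    ({"RULE_A", "RULE_B"}, True),
    ({"RULE_A"}, True),
    ({"RULE_B"}, True),
    ({"RULE_C"}, False),
    ({"RULE_A", "RULE_C"}, False),
    (set(), False),
]


def message_score(hits, scores):
    """Total score of a message: the sum of the scores of the rules it hit."""
    return sum(scores[rule] for rule in hits)


def train(corpus, max_epochs=100):
    """Adjust per-rule scores until every message is classified correctly."""
    scores = {rule: 0.0 for rule in RULES}
    for _ in range(max_epochs):
        mistakes = 0
        for hits, is_spam in corpus:
            predicted = message_score(hits, scores) >= THRESHOLD
            # +1 for a false negative, -1 for a false positive, 0 if correct.
            error = int(is_spam) - int(predicted)
            if error:
                mistakes += 1
                for rule in hits:
                    scores[rule] += LEARNING_RATE * error
        if mistakes == 0:
            break
    return scores


if __name__ == "__main__":
    print(train(CORPUS))
```

In this toy corpus, RULE_C only appears on non-spam messages, so training pushes its score below zero, while the rules that appear on spam end up with positive scores.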

A list of the rules and their assigned scores is at tests.

You can help this system by providing statistics on your mail spool via NightlyMassCheck and RescoreMassCheck.

Confusing scores

Scores for "learn" rules (for example, the various BAYES_?? rules) are assigned using the same method. This can produce scores which seem incorrect (for example, BAYES_80 with a higher score than BAYES_99). This happens because the rules are not related to one another: they are separate rules with separate scores.

Messages with high probability from a "learn" rule will most likely match other rules. This lets the score generation system lower the "learn" rule score, which prevents false positives. The message is still recognized as spam due to the sum of all rule scores.
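
The arithmetic behind this can be sketched in a few lines of Python. The scores and the threshold below are made up for illustration and are not the distributed values; the sketch also assumes that a message counts as spam when its total is at least the required score.

```python
# Hypothetical scores for illustration; not the values SpamAssassin ships.
HYPOTHETICAL_SCORES = {
    "BAYES_99": 3.5,
    "RAZOR2_CHECK": 1.5,
    "URIBL_BLACK": 2.0,
}
REQUIRED = 5.0  # hypothetical threshold


def total_score(tests, scores):
    """Sum the scores of all rules that matched the message."""
    # Assumption for this sketch: unknown rule names contribute nothing.
    return sum(scores.get(name, 0.0) for name in tests)


def is_spam(tests, scores, required=REQUIRED):
    """Assumed comparison: spam when the total reaches the required score."""
    return total_score(tests, scores) >= required


if __name__ == "__main__":
    print(is_spam(["BAYES_99"], HYPOTHETICAL_SCORES))
    print(is_spam(["BAYES_99", "RAZOR2_CHECK", "URIBL_BLACK"], HYPOTHETICAL_SCORES))
```

With these made-up numbers, a message that only matches BAYES_99 stays under the threshold, while one that also matches the other two rules totals 7.0 and is classed as spam.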

Some DNS blacklist rules are distributed with scores of 0. These generally request or require payment and are therefore disabled by default. Feel free to enable the lookups if you've paid for them.

A score of 0 will stop a rule from being run.

In version 2.x, the scores were assigned using a genetic algorithm (GA).
