How do I get SpamAssassin to run faster?
General Advice
Use [http://www.spamassassin.org/full/3.0.x/dist/spamd/README spamd].
Ensure you are not using a UTF-8 locale; UTF-8 character sets have higher overhead for text-processing applications. See Utf8Performance.
If you are using network tests, install a local caching DNS server (BIND named, for example) on the same host, and point /etc/resolv.conf at it instead of at a server on another machine.
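A quick way to confirm the resolver setup is to check that the first nameserver entry is a loopback address. The following is a minimal sketch, not part of SpamAssassin; the function names are hypothetical, and it assumes the usual resolv.conf layout of one `nameserver ADDRESS` entry per line, with comment lines starting with `#` or `;`.

```python
import ipaddress


def first_nameserver(resolv_conf_text):
    """Return the address of the first nameserver entry, or None if there is none."""
    for raw_line in resolv_conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            return fields[1]
    return None


def uses_local_resolver(resolv_conf_text):
    """Return True if the first nameserver is a loopback address (127.0.0.0/8 or ::1)."""
    address = first_nameserver(resolv_conf_text)
    if address is None:
        return False
    try:
        return ipaddress.ip_address(address).is_loopback
    except ValueError:
        # Not a parseable IP address; treat it as non-local.
        return False


if __name__ == "__main__":
    with open("/etc/resolv.conf") as handle:
        print(uses_local_resolver(handle.read()))
```

Only the first entry is checked because resolvers normally try nameservers in the order listed, so a local cache listed second would rarely be consulted.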
Examine the custom rule set files you use.
- Avoid large rule sets, meaning those over roughly 100k to 150k in size. The more rules you have, the slower SA will run.
  In particular, the sa-blacklist and sa-blacklist-uri rulesets are extremely heavyweight and greatly affect performance. If you are using these and running into performance issues, remove them immediately. (Use network tests instead, since the same data is available there as URIBL_WS_SURBL. See [http://wiki.apache.org/spamassassin/OutOfMemoryProblems#head-198fc106917f358aea90b95047299e4de6c0443d OutOfMemoryProblems].)
- Pick rule set files that are more productive. In the SARE families published by Bob Menschel, use files 0 and 1 for productivity / efficiency, and avoid files 2 and 3.
- Remove and re-add rule set files one at a time, and check performance after each change. If one rule set file causes a huge change in performance, take appropriate action.
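To find candidates for removal, it can help to list the rule files that exceed the size guideline above. This is a rough sketch under stated assumptions: the helper name is hypothetical, the 100k default is taken from the guideline, and it assumes rule files carry a `.cf` extension and live in a single directory that you pass in.

```python
import os


def large_rule_files(directory, threshold_bytes=100 * 1024):
    """Return (filename, size_in_bytes) pairs for .cf files above the threshold, largest first."""
    found = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if name.endswith(".cf") and os.path.isfile(path):
            size = os.path.getsize(path)
            if size > threshold_bytes:
                found.append((name, size))
    # Largest files first, since they are the most likely performance suspects.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

File size is only a proxy for cost. A small file with a few badly written rules can be slower than a large one, so timing each file as described above remains the real test.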
Examine the custom rules you create, or have downloaded from third parties. Poorly-written regular expressions can use resources exponentially. Avoid body, rawbody, or full rules that use + or * quantifiers.
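The check for open-ended quantifiers can be partly automated. The sketch below is a rough lint, not a SpamAssassin tool: the function names are hypothetical, and it assumes rules are written one per line in the form `body NAME /pattern/`. It ignores escaped characters and anything inside a `[...]` character class, but it does not fully parse Perl regular expressions, so treat its output as a list of rules to review by hand.

```python
RULE_TYPES = ("body", "rawbody", "full")


def has_open_quantifier(pattern):
    """Return True if the pattern contains an unescaped + or * outside a character class."""
    in_class = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch in "+*":
            return True
        i += 1
    return False


def flag_rules(config_text):
    """Return the names of body, rawbody, and full rules that use + or * quantifiers."""
    flagged = []
    for line in config_text.splitlines():
        parts = line.strip().split(None, 2)
        if len(parts) == 3 and parts[0] in RULE_TYPES:
            if has_open_quantifier(parts[2]):
                flagged.append(parts[1])
    return flagged
```

A flagged rule can often be rewritten with a bounded quantifier, for example `.{0,40}` in place of `.*`, which caps how far the regular expression engine can backtrack.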
If you are memory-bound
If the spamd processes are eating up all the RAM on your machine, then you are memory-bound.
Are you experiencing high system load or possibly swapping? Look at the number of children you have spawned, and compare that to the available memory (by default each child can use 20-30 MB of RAM). Depending on load, you might find success in lowering the number of children that are spawned (see -m in the spamd documentation).
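The comparison between children and memory is simple arithmetic. As a minimal sketch, assuming the 20-30 MB per-child figure above holds for your installation (measure it, since custom rule sets can raise it considerably), a hypothetical helper for picking a value to pass to -m could look like this:

```python
def max_children(available_mb, per_child_mb=30):
    """Estimate how many spamd children fit in the memory set aside for them.

    available_mb: RAM in megabytes you are willing to give to spamd children.
    per_child_mb: assumed memory use of one child; 30 is the upper end of the
                  20-30 MB range quoted above, which gives a conservative count.
    """
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    # Always allow at least one child, otherwise spamd cannot scan anything.
    return max(1, int(available_mb // per_child_mb))
```

For example, with 512 MB set aside and the 30 MB default, the estimate is 17 children. Leave headroom for the rest of the system when choosing `available_mb`.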
Consider turning off network tests, and running with "-L", if you can afford a drop in accuracy. This is not a very good option for most people though, and while it will reduce system memory load by reducing the number of simultaneous processes, it will increase system CPU load, so be warned! See NetworkTestsLatency for more info.
See OutOfMemoryProblems.
If you are I/O-bound
For heavily loaded servers, you may be experiencing high iowait times depending on how hard you are hitting your disk. You can try offloading the logging and bayes disk writes to a separate disk, or even disabling Bayes rules entirely with use_bayes 0.
If you are CPU-bound
If your server is being limited by CPU load:
Remove custom rule sets, as detailed above. Seriously.
See OutOfMemoryProblems. Much of the advice there applies to CPU-bound machines, too.