
Okay, SpamAssassin developers are looking for people to volunteer and make code contributions.

Patches, code, Perl, regression tests, rules: you get the picture.  You'll have to send in a [http://www.apache.org/licenses/#clas Contributor License Agreement] before your contribution can be accepted, but that's easy.

So, what are we looking for right now?

The Top 10 items

Speed

  • Submit code to speed something up without breaking anything.

Size

  • auto-whitelist and bayes_seen databases need to have expiry.
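
As a rough illustration of what expiry could look like, here is a minimal sketch in Python. It assumes each database entry carries a last-seen timestamp, which the real auto-whitelist and bayes_seen files may not store. The function name `expire_entries` and its parameters are hypothetical and not part of SpamAssassin.

```python
from typing import Dict


def expire_entries(last_seen: Dict[str, float],
                   now: float,
                   max_age_seconds: float) -> Dict[str, float]:
    """Return a copy of the table without entries older than max_age_seconds.

    ASSUMPTION: `last_seen` maps an entry key (for example an address or a
    message ID) to the Unix time at which it was last touched. This is a
    hypothetical layout, not the actual on-disk format.
    """
    cutoff = now - max_age_seconds
    # Keep only entries touched at or after the cutoff.
    return {key: ts for key, ts in last_seen.items() if ts >= cutoff}
```

For example, `expire_entries({"a": 100.0, "b": 900.0}, now=1000.0, max_age_seconds=500.0)` keeps only `"b"`. A real implementation would also have to rewrite or compact the database file so that the space is actually reclaimed.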

Bayes accuracy and speed

  • Code and corpus tests for ramping up the probability assigned to previously unseen tokens. This could be done either heuristically or by keeping real counts of unseen tokens in the Bayes token database. The idea is that words that have never been learned before get high probabilities.
  • Looking for specific header tokens when they change location between the original message and the reply.  See [http://bugzilla.spamassassin.org/show_bug.cgi?id=2129 bug 2129] for more.
  • Dynamically determining the autolearning thresholds based on incoming email rather than using hard-coded numbers.  See [http://bugzilla.spamassassin.org/show_bug.cgi?id=1829 bug 1829] for more.
  • Bi-grams, that is, tokens built from pairs of adjacent words.
  • Custom database file and code for faster performance and space savings (probably to be compared against qdbm and tdb since they look most promising right now as non-custom databases).
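
Two of the ideas above, bi-gram tokens and a raised probability for unseen tokens, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SpamAssassin's Bayes code. The function names are hypothetical, the raw probability ignores corpus-size normalization, and the smoothing formula is the commonly used Robinson-style blend of a prior with observed counts, swapped in here as one possible heuristic.

```python
from typing import List


def bigram_tokens(words: List[str]) -> List[str]:
    """Return the single-word tokens followed by adjacent-pair tokens."""
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return list(words) + pairs


def smoothed_probability(spam_count: int,
                         ham_count: int,
                         unseen_prior: float = 0.5,
                         strength: float = 1.0) -> float:
    """Blend a prior with the observed spam ratio for one token.

    ASSUMPTION: Robinson-style smoothing, (s*x + n*p) / (s + n), where
    x is `unseen_prior`, s is `strength`, n is the number of sightings
    and p is the raw spam ratio. A token with no sightings gets exactly
    `unseen_prior`, so raising that value "ramps up" unseen tokens.
    """
    n = spam_count + ham_count
    raw = spam_count / n if n else 0.0
    return (strength * unseen_prior + n * raw) / (strength + n)
```

With these definitions, `bigram_tokens(["buy", "cheap", "pills"])` yields the three words plus `"buy cheap"` and `"cheap pills"`, and `smoothed_probability(0, 0, unseen_prior=0.7)` returns the prior unchanged. Whether a higher prior actually helps accuracy is exactly what the corpus tests would need to show.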

Other ideas