Comment: [Original edit by JustinMason] think we implemented the bayes one already

...

  • Code and corpus tests for ramping up the probability assigned to previously unseen tokens. This could be done either heuristically or by keeping real counts of unseen tokens in the Bayes token database. The idea is that words that have never been learned before get high probabilities.
  • Custom database file and code for faster performance and space savings (probably to be compared against qdbm and tdb since they look most promising right now as non-custom databases).
  • Bi-grams: that is, multi-word windowing as used in CRM-114, using two-word tokens (or possibly even longer windows). Not sure this will provide much higher accuracy now that spammers are using whole-phrase Bayes poisoning, though. (JustinMason)
  • Implementing Dobly noise reduction: see bug 3078 (http://bugzilla.spamassassin.org/show_bug.cgi?id=3078).
  • Dynamically determining the autolearning thresholds based on incoming email rather than using hard-coded numbers. See bug 1829 (http://bugzilla.spamassassin.org/show_bug.cgi?id=1829) for more.
  • Looking for specific header tokens when they change location between the original message and the reply. See bug 2129 (http://bugzilla.spamassassin.org/show_bug.cgi?id=2129) for more.
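
To make two of the ideas above more concrete (bi-gram tokens and a raised probability for never-seen tokens), here is a minimal Python sketch. It is not SpamAssassin code: the tokenizer, the function names, and the parameters unseen_prior and strength are all hypothetical, and the smoothing follows the general shape of Gary Robinson's formula with raw counts standing in for properly normalized frequencies.

```python
import re


def word_tokens(text):
    """Split text into lowercase word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def bigram_tokens(text):
    """Return the single-word tokens followed by two-word window tokens."""
    words = word_tokens(text)
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


def token_probability(spam_count, ham_count, unseen_prior=0.5, strength=1.0):
    """Robinson-style smoothed spam probability for a single token.

    unseen_prior is the probability given to a token with no history;
    raising it above 0.5 is one heuristic way to "ramp up" unseen tokens.
    strength controls how quickly real counts override that prior.
    Both parameters are illustrative assumptions, not SpamAssassin settings.
    """
    n = spam_count + ham_count
    if n == 0:
        return unseen_prior
    raw = spam_count / n  # simplification: ignores corpus size differences
    return (strength * unseen_prior + n * raw) / (strength + n)


if __name__ == "__main__":
    print(bigram_tokens("Buy cheap pills now"))
    print(token_probability(0, 0, unseen_prior=0.7))
    print(token_probability(3, 1, unseen_prior=0.7))
```

With unseen_prior set above 0.5, a token that has never been learned leans spammy, and the estimate moves toward the observed ratio as counts accumulate. Whether either idea improves accuracy would still need the corpus tests mentioned above.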

Other ideas

  • Translation: translating the rule descriptions, the manual, and the website into other languages.
  • Feedback button: a client-side button that gives users one-touch feedback to recategorize a message (correcting a false positive or false negative). This is a joint project between SpamAssassin and Camram (http://www.camram.org). It would also be usable by other server-resident anti-spam systems. Contact EricJohansson for more details.
  • Bugs marked as 'nice to have' (http://bugzilla.spamassassin.org/showdependencytree.cgi?id=4560): general feature requests that aren't urgent, but would be cool to have.

...

CategoryFaq CategoryDevelopment