Release goals for 3.2.

High-Level Goals

  • (tick) lower resource usage: higher throughput and lower memory usage
  • higher accuracy: lower FPs and lower FNs (rules, rules, rules... this also includes some notion of speeding up the mass-check process)
  • (tick) convert optional/non-performance-sensitive code to plugins (I think this is lower priority, but we've often talked about it and it also helps achieve the first goal of lower resource usage)

High-Level Anti-Goals

  • features: extra options, non-critical changes not related to the above goals, etc. (except perhaps in plugins)
  • option bloat (except perhaps in plugins)

Basic reorganization of rules

  • SVN external for rules (using "if" for version-specific rules)
  • separate 50_scores.cf into input and output files
  • traversing configuration directories

Memory Usage

We should probably develop a shared understanding of what we want to convert to plugins. Here's the list so far:

  • (tick) Razor
  • (tick) DCC
  • (tick) Pyzor
  • (tick) SpamCop reporting
  • (partial) finish transition away from AWL and finish replacing it with the "History" plugin, including backwards compatibility
  • plugin-ize Bayes
  • move user preference configuration code to plugins
  • plugin-ize whitelisting
  • (tick) TextCat

Performance/Speed

  • Predictive autolearn? Run a check before bayes_check; if we are likely to autolearn, go r/w instead of r/o. Can be implemented on the first bayes_check call.
  • Don't bother caching full/decoded/etc. at start in PMS. How much caching do we do now? Is it done multiple times in PMS? May not be an issue due to references.
  • short circuiting ideas:
    • set certain rules as SC if hit
      • USER_IN_WHITELIST, USER_IN_BLACKLIST (not DEF)
      • BSP
      • HABEAS
    • allow SC on ham score (i.e. < #)
    • allow SC on spam score (i.e. > #)
    • should autolearn skip SC msgs? should we always do autolearn in the appropriate direction?
    • AWL should be skipped during SC
    • SC rules should have a negative priority so they run first
    • do *not* do score check per rule, do it either per priority or rule type (header, body, etc.)
    • SC will require is_spam to be set by SC itself, as score and required_hits will be at odds
    • add SC header macro (get_tag)
    • SC for S/O 1.000 rules? how about S/O near 1? BAYES_99, etc.
    • Some form of order/priority rearrangement:
      • Blacklist: short
      • Whitelist: user/admin wants it
      • BSP/Habeas: reputable, non-forgeable
      • Other SC Rules: as early as possible
      • Other Local Rules: lightweight

Speed Release Cycle

  • Single-cycle mass-check
    • (tick) Add sample-based "autolearning" to mass-check
    • (tick) One run with network and bayes turned on
    • Related, but non-required change to autolearning: the balancing of in and out (accuracy)
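
As a rough illustration of the short-circuiting (SC) ideas listed under Performance/Speed, here is a minimal Python sketch. It is not SpamAssassin code: the names Rule, Result, scan, sc_ham_below and sc_spam_above, and the default thresholds, are all hypothetical. It shows three of the points above: SC rules get a negative priority so they run first, the score is checked once per priority group rather than per rule, and an SC decision sets is_spam directly instead of comparing the score against the required threshold.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    name: str
    score: float
    priority: int                         # lower runs first; SC rules use negative values
    matches: Callable[[str], bool]
    short_circuit: Optional[str] = None   # "spam", "ham", or None (hypothetical field)


@dataclass
class Result:
    score: float
    is_spam: bool
    hits: List[str]
    short_circuited: bool


def scan(message, rules, required=5.0, sc_ham_below=-20.0, sc_spam_above=20.0):
    """Run rules in priority order, stopping early when a short circuit applies."""
    by_priority = defaultdict(list)
    for rule in rules:
        by_priority[rule.priority].append(rule)

    score = 0.0
    hits = []
    for priority in sorted(by_priority):
        forced = None
        for rule in by_priority[priority]:
            if rule.matches(message):
                score += rule.score
                hits.append(rule.name)
                if rule.short_circuit and forced is None:
                    forced = rule.short_circuit
        # Score thresholds are checked once per priority group, not per rule.
        if forced is None:
            if score > sc_spam_above:
                forced = "spam"
            elif score < sc_ham_below:
                forced = "ham"
        if forced is not None:
            # The verdict comes from the short circuit, not from score vs. required.
            return Result(score, forced == "spam", hits, True)

    return Result(score, score >= required, hits, False)
```

With a USER_IN_WHITELIST rule at priority -10 marked as a "ham" short circuit, a whitelisted message returns before any priority-0 rule is evaluated, which is where the throughput gain would come from.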

Accuracy Ideas

  • network test, do DNS lookups on the HELO (A, NS, and SURBL)
  • network test, do DNS lookups on the EnvelopeFrom (SURBL)
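
To make the HELO idea concrete, here is a small Python sketch that only builds the list of DNS queries such a test might issue; it performs no network access. The function name helo_dns_queries is hypothetical, the choice of which name each record type is queried against is an assumption, and taking the last two labels as the registered domain is a deliberate simplification (it is wrong for names such as example.co.uk). multi.surbl.org is SURBL's combined public zone.

```python
SURBL_ZONE = "multi.surbl.org"


def helo_dns_queries(helo):
    """Return (name, record_type) pairs to look up for a HELO hostname."""
    host = helo.strip().rstrip(".").lower()
    labels = host.split(".")
    # Naive registered-domain guess: the last two labels (assumption, see above).
    domain = ".".join(labels[-2:])
    return [
        (host, "A"),                         # does the HELO name resolve?
        (domain, "NS"),                      # which name servers host the domain?
        (f"{domain}.{SURBL_ZONE}", "A"),     # is the domain listed in SURBL?
    ]
```

The same helper could be pointed at the EnvelopeFrom domain for the SURBL lookup in the second bullet.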

Uncategorized

  • use personal branches for major breakage changes (read: crazy stuff)