...

First, the sandbox idea greatly increases the number of people who can check rules into SVN. Second, the barrier to entry for getting a sandbox account is much lower.

Some bullet points from the discussion, which need expanding:

sandbox:

  • each user gets their own sandbox as discussed on RulesProjMoreInput
  • checked-in rules in the sandboxes are mass-checked in the nightly mass-checks
  • migrating a rule from "sandbox" (dev) to the "core" (production) ruleset uses C-T-R (commit-then-review); i.e. votes are not required in advance
  • migrating a rule from "sandbox" to the "extra" ruleset also uses C-T-R

Rules that get promoted from a "sandbox" to "core" should pass the following criteria:

  • S/O ratio of 0.95 or greater (or 0.05 or less for nice rules)
  • > 0.25% of target type hit (e.g. spam for non-nice rules)
  • < 1.00% of non-target type hit (e.g. ham for non-nice rules)
  • not too slow (wink)
  • TODO: criteria for overlap with existing rules? BobMenschel: The method I used for weeding out SARE rules that overlapped 3.0.0 rules was to run a full mass-check with overlap analysis, and throw away anything where the overlap is less than 50% (i.e. keep only those rules which have "meaningful" overlap). Manually reviewing the remaining (significantly) overlapping rules was fairly easy. The command I use is:

      perl ./overlap ../rules/tested/$testfile.ham.log ../rules/tested/$testfile.spam.log \
        | grep -v mid= \
        | awk ' NR == 1 { print } ; $2 + 0 == 1.000 && $3 + 0 >= 0.500 { print } ' \
        > ../rules/tested/$testfile.overlap.out
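
The numeric promotion criteria listed above can be sketched as a small check. This is only an illustration, not part of any SpamAssassin tool: the `RuleStats` container and the function names are hypothetical, and S/O is assumed to be computed as spam% / (spam% + ham%) from the mass-check hit percentages. The "not too slow" and overlap criteria are not covered, since the page does not define them numerically.

```python
from dataclasses import dataclass


@dataclass
class RuleStats:
    """Mass-check hit percentages for one rule (hypothetical container)."""
    name: str
    spam_pct: float     # percentage of spam messages the rule hit
    ham_pct: float      # percentage of ham messages the rule hit
    nice: bool = False  # True for "nice" rules, whose target type is ham


def so_ratio(stats: RuleStats) -> float:
    """S/O ratio, assumed here to be spam% / (spam% + ham%).

    Returns 0.5 (no information either way) when the rule never hit.
    """
    total = stats.spam_pct + stats.ham_pct
    return stats.spam_pct / total if total > 0 else 0.5


def meets_core_criteria(stats: RuleStats) -> bool:
    """Apply the three numeric sandbox-to-core criteria from the list above."""
    so = so_ratio(stats)
    if stats.nice:
        # Nice rules target ham: S/O must be 0.05 or less.
        target, non_target = stats.ham_pct, stats.spam_pct
        so_ok = so <= 0.05
    else:
        # Ordinary rules target spam: S/O must be 0.95 or greater.
        target, non_target = stats.spam_pct, stats.ham_pct
        so_ok = so >= 0.95
    # More than 0.25% of the target type, less than 1.00% of the other type.
    return so_ok and target > 0.25 and non_target < 1.00
```

For example, a rule hitting 2.0% of spam and 0.05% of ham would pass, while a rule hitting only 0.2% of spam would fail the target-hit criterion even with a perfect S/O. Rules that fail this check are the kind of candidates the "extra" set discussed below is meant for.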

A ruleset in the "extra" set would have different criteria.

  • DanielQuinlan suggested: The second, a collection that do not qualify for rules/core. For example, SpamAssassin intentionally doesn't filter virus bounces (yet, at least), but there is a good virus bounce ruleset out there.
  • BobMenschel: Similarly, an "extra" ruleset might include rules that positively identify spam from spamware, but hit <0.25% of spam. Or an "aggressive" ruleset might include rules that hit with an S/O of only 0.89, but push a lot of spam over the 5.0 threshold without significantly affecting ham.

We can also vote for extraordinary stuff that doesn't fit into those criteria...

private list for mass-checks:

...

Getting rules from the sandbox into the distribution is covered on RulesProjSandboxes, from the 'Rule Promotion' section onward.