Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: [Original edit by JustinMason]

...

Here's some ideas.

Sandboxes

Initially, the rules project sandboxes SVN will consist of the existing (empty) rules directory in Subversion (the CVS replacement used by the ASF). Each committer will have their own sandbox to begin development in an unconstrained manner:

rules/sandbox/<username>/

Every person who has listed their rule set on the Apache SpamAssassin Wiki will be invited once the PMC approves the project; there are some rule sets only listed at SARE or Exit0, but those people are invited to join too, of course. There is absolutely no quality or experience requirement for the sandbox although we may later provide some tools to make it easier to avoid name collisions and such.

It is expected that someone (don't know and don't care who) will eventually write scripts to test, filter, and pull rules automatically into the production rules. I am intentionally deferring decisions around that area, though.

What does providing a sandbox for everyone do?

  • easy to join (you just have to sign a CLA and get an @apache.org account)
  • no expectation of... well, much anything; no quality or experience requirement for the sandbox
  • easy for us to import rules (manually or automatically) into main rule body
  • easy to move forward with further development around automatic updates and all of the other (hard) ideas we've talked about, but I really want to keep this dirt simple.
  • ability to help direct future development of the rules project (as it extends beyond sandboxes, sandboxes will remain just sandboxes, of course).
  • can produce multiple "output rule sets" in the long run: conservative, aggressive, sub-areas: bounces, drug rules, etc.
  • uses SVN and therefore has version control

In other words, this solves the main part of our "rules problem" – the hurdle of getting rules "over the wall". No longer will we need individual bugs for rule submissions, or need to go to 3 different sites to look for rule ideas, etc. Many of our best rules have come from SARE and the Wiki.

Also, it's expected that many of the rules will never go into the main rules body – someone may write rules for a specific type of annoying mail (not even necessarily spam), or maybe someone will be focused on super-aggressive rules for the brave folks out there. We can even produce multiple "output rule sets" in the long run: conservative, aggressive, sub-areas: bounces, drug rules, etc.

Some notes picked out of followup discussion:

It is possible to keep rules 'private', and in your own checkout only, by not checking them into SVN.

If you do want to have the rules visible for collaboration, but not used for automatic mass-checks or promotion, that could be done by just keeping them in a file that doesn't end in ".cf". (SpamAssassin's standard is that they have to end in ".cf" to be considered valid rules files.)See RulesProjectSandboxes.

Mass-checking

LorenWilton noted 'A big part (perhaps the biggest part) of rules development is the mass check. Most anyone can develop a rule on their home system and see how they *think* it works. Some few (but not many) people can do a mass-check on their home system and see how it *really* works - *for them*. As proposed, this rules project doesn't address the most important part of a rules project -
some way to check the rules against a fairly large corpus.'

...