Comment: [Original edit by JustinMason] update mass-check wiki docs

...

First, you need HandClassifiedCorpora. Let's say that's made up of two mbox folders, "/path/to/ham" and "/path/to/spam".

...

    cd masses
    ./mass-check --progress \
              ham:mbox:/path/to/ham \
              spam:mbox:/path/to/spam

This will create two files, "ham.log" and "spam.log", listing the rules that hit when the rules read from the rules dir "../rules" are applied to that corpus. Each line of the two log files describes one email message, and there is a line for every message.
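As a quick sanity check after a run, the line count of each log can be compared with the number of messages in the corresponding corpus. The following is only a sketch: the helper name "count_messages" is made up for illustration, and skipping lines that start with "#" rests on the assumption that such lines are comments rather than message results.

```shell
# Count message lines in a mass-check log file.
# Assumption: lines beginning with "#" are comments, not messages.
count_messages() {
    grep -vc '^#' "$1"
}

# Report on whichever of the two logs exist in the current directory.
for log in ham.log spam.log; do
    if [ -f "$log" ]; then
        echo "$log: $(count_messages "$log") messages"
    fi
done
```

If a count is lower than the size of the corpus, some messages were probably not scanned.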

mass-check also takes other options that control whether network tests are run, whether multiple processes are run in parallel, how the output is presented, and so on; read the comments at the top of the mass-check script for details. Here are some key points:

Configuration File

mass-check reads a "user_prefs" file in "spamassassin/user_prefs". You need to create this file yourself; it will not be created for you.

Using network tests

For mass-checks for scoresets 1 or 3, which use network tests, you need to provide the "--net" switch. Ensure Net::DNS, Mail::SPF::Query, Razor (InstallingRazor), Pyzor (InstallingPyzor) and DCC (InstallingDCC) are installed.
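Before starting a long network run, it may help to confirm that the Perl modules named above can be loaded. This is a minimal sketch, assuming the "perl" on your PATH is the one mass-check will use; the helper name "have_perl_module" is hypothetical, and the check does not cover Razor, Pyzor or DCC.

```shell
# Succeed if the given Perl module can be loaded by the perl on PATH.
have_perl_module() {
    perl -M"$1" -e 1 2>/dev/null
}

# Check the two Perl modules needed for network tests.
for module in Net::DNS Mail::SPF::Query; do
    if have_perl_module "$module"; then
        echo "ok:      $module"
    else
        echo "missing: $module"
    fi
done
```

Once nothing is reported missing, add "--net" to the mass-check command line.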

...

    cd masses
    mkdir -p spamassassin
    rm -f spamassassin/bayes*
    echo "use_bayes 1" > spamassassin/user_prefs

or to turn it off:

    cd masses
    mkdir -p spamassassin
    echo "use_bayes 0" > spamassassin/user_prefs

Once mass-check completes

The next step is to run hit-frequencies: see HitFrequencies for details.

Usage

usage: mass-check [options] target ...

...

--bayes

report score from Bayesian classifier

Usage: Targets

Non-option arguments are used as target names (mail files and folders). The target format is: <class>:<format>:<location>

class

is "spam" or "ham"

format

is "dir", "file", "mbx", "mbox", or "detect"

location

is a file or directory name. Globbing of ~ and * is supported.
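A target can be checked for the right shape before it is handed to mass-check. The function below is an illustrative sketch, not part of mass-check: the name "valid_target" is hypothetical, and it only checks the three-part <class>:<format>:<location> syntax described above, not whether the location exists.

```shell
# Check that a target looks like <class>:<format>:<location>.
# Returns 0 when the spec is well formed, 1 otherwise.
valid_target() {
    spec=$1
    class=${spec%%:*}      # text before the first colon
    rest=${spec#*:}        # text after the first colon
    case $rest in
        *:*) ;;            # a second colon must be present
        *) return 1 ;;
    esac
    format=${rest%%:*}     # text between the first and second colon
    location=${rest#*:}    # everything after the second colon

    case $class in
        ham|spam) ;;
        *) return 1 ;;
    esac
    case $format in
        dir|file|mbx|mbox|detect) ;;
        *) return 1 ;;
    esac
    [ -n "$location" ]     # the location must not be empty
}
```

For example, "valid_target ham:mbox:/path/to/ham" succeeds, while "valid_target ham:/path/to/ham" fails because the format is missing.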

"detect" can be used as a format. This assumes "mbox" for any file whose path contains the pattern "/\.mbox/i", "file" anything that is not a directory, or "directory" otherwise.

...

CategorySoftware