Here's some advanced information on what happens behind the scenes with different rule types.

Body rules: body RULENAME /foo/

...

  • Pattern /clause/ would result in 5 rule hits.
  • Pattern /^./ would result in 3 rule hits.

When a rule contains extended characters / diacritics, always write it so that it matches both the ISO-8859-1 and the UTF-8 encoding of the text.
The body content seen by a rule can differ depending on the normalize_charset setting. To match "fügen", for example:

  • body FOO /fügen/    (BAD)
    • Does not work when normalize_charset 1 and mail is converted from ISO-8859-1 to UTF-8
  • body FOO /fÃŒgen/   (BAD)
    • Does not work when normalize_charset 0 and mail is ISO-8859-1
  • body FOO /f(?:\xfc|\xc3\xbc)gen/
    • Works for both encodings, and the rule file is now portable because it no longer depends on its own encoding
    • To find the hex bytes, use any UTF-8 / ASCII table tool, or try perl for the hex conversion:
      • perl -MEncode -e 'print unpack("H*",encode("UTF-8","ü"))."\n"'
      • perl -e 'print unpack("H*","ü")."\n"'
  • You can also try some of the replace_tags found in the default ruleset, which match different variations of a character:
    • body FOO /f<U>gen/
    • replace_tags FOO
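
The encoding problem above can be reproduced outside SpamAssassin. The following is a minimal sketch in Python, assuming only that the pattern is applied to the raw bytes of the body; the variable and function names are hypothetical and this is not SpamAssassin code.

```python
import re

# Illustration only: mimics byte-level matching, not SpamAssassin itself.
WORD = "f\u00fcgen"  # "fügen"

latin1_body = WORD.encode("iso-8859-1")  # ü is the single byte fc
utf8_body = WORD.encode("utf-8")         # ü is the two bytes c3 bc

# A pattern typed literally ends up in whatever encoding the rule file has.
latin1_only = re.compile(WORD.encode("iso-8859-1"))
utf8_only = re.compile(WORD.encode("utf-8"))

# The portable pattern lists both byte sequences explicitly.
portable = re.compile(rb"f(?:\xfc|\xc3\xbc)gen")


def hits(pattern, body):
    """Return True if the byte pattern matches anywhere in the body."""
    return pattern.search(body) is not None


if __name__ == "__main__":
    for name, pat in [("latin1_only", latin1_only),
                      ("utf8_only", utf8_only),
                      ("portable", portable)]:
        print(name, hits(pat, latin1_body), hits(pat, utf8_body))
```

Only the pattern that spells out both byte sequences matches regardless of how the mail ends up encoded.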

Because body is processed as raw bytes, Unicode regex features like \p{} cannot currently be used.

Rawbody rules: rawbody RULENAME /foo/

...

  • When using anchoring (/^foo/), the pattern will only match at the start of a chunk.
    • I.e. it's not possible to match the beginning of a part 100% accurately if the part is larger than a single chunk (1-4kB).
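
To see why anchoring is unreliable on chunked content, here is a rough sketch in Python. It assumes a naive fixed-size split, which is an assumption made purely for illustration and not how SpamAssassin actually divides a part; the function names and the tiny chunk sizes are hypothetical.

```python
import re


def split_chunks(data: bytes, size: int):
    """Naive fixed-size split (assumption for illustration only)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def anchored_hits(pattern, chunks):
    """Indexes of the chunks in which the pattern matches."""
    return [i for i, chunk in enumerate(chunks) if pattern.search(chunk)]


# A 91-byte "part": "foo" opens the part and also opens a later line.
part = b"foo starts the part\n" + b"x" * 50 + b"\nfoo on a later line\n"

# Without the m modifier, ^ only matches at the very start of each chunk.
start_foo = re.compile(rb"^foo")

if __name__ == "__main__":
    # The later "foo" sits mid-chunk here, so only chunk 0 matches.
    print(anchored_hits(start_foo, split_chunks(part, 32)))
    # Here a chunk boundary lands right before the later "foo",
    # so the anchor also matches in the middle of the part.
    print(anchored_hits(start_foo, split_chunks(part, 71)))
```

Whether the second "foo" matches depends entirely on where the chunk boundary happens to fall, which is why /^foo/ cannot be relied on to mean "start of the part".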

Header rules: header RULENAME Header =~ /foo/

If there are multiple headers named "Header", the matched string contains each of the headers, newline separated, starting from the first (topmost).

  • If Header:raw is used, all whitespace and newlines are preserved. Again, multiple headers are concatenated into the same matching string.
  • When using anchors (/^foo/), add the m modifier (/^foo/m) if any of the duplicate headers should be able to match. Without it, only the first header (line) will match.
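
The effect of the m modifier on duplicate headers can be sketched in Python, assuming the matched string is simply the header values joined with newlines as described above. The header values are made up for the example, and re.M is Python's counterpart of the m modifier.

```python
import re

# Hypothetical values of two headers with the same name, topmost first.
header_values = ["by mx1.example.org", "from relay.example.net"]

# The matched string holds every header, newline separated.
matched = "\n".join(header_values)

# Without the m modifier, ^ matches only at the start of the whole string,
# i.e. only the first header can match.
without_m = re.search(r"^from", matched)

# With the m modifier, ^ also matches after each newline,
# so any of the duplicate headers can match.
with_m = re.search(r"^from", matched, re.M)

if __name__ == "__main__":
    print(without_m is not None, with_m is not None)
```

In this example the anchored pattern finds the second header only when the multiline flag is set.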