The syntax for mail filtering and mail disposition is mainly (perhaps entirely) relegated to the "SIEVE" side of MFL. Sieve is relatively easy to understand and to write, and much of what you might want to do with your mail can be done entirely with Sieve constructs. It's easy enough that you can learn a lot about it just by browsing some examples. The C side of MFL is provided, in part, for those who want more elaborate control over mail delivery and over those SIEVE constructs.
This is not a language manual. This is more like a set of notes about MFL along with some simple examples. If you know "Sieve" (or know how to read the SIEVE RFC or look at some examples), and if you know "C" (or don't care about using the "C-like" side of MFL), these notes and examples should get you going.
Background
Although MFL may now be used in different utilities, it was developed for the mail delivery agent (now called mvmda). It was very tempting to invent something completely new for its language; for instance, logic-based or assertion-based languages seemed like they might fit the bill. But after a few of those flights of fancy, we decided that we wanted to find something that was easy to understand even by non-programmers, and yet might be of use to programmers as well. These utilities are really just for helping people deal with their mail, so we wanted something that at a basic level was fairly easy to use and configure and with which to achieve reasonable goals, but which could be used in more complicated ways for those who wanted to do so. We also wanted to get something accomplished and not go off on a language quest. So we decided to use a fairly simple syntax for the basic mail controls, but also allow the use of a procedural programming language, as well as other special extensions, to support more complex configuration. The procedural level we chose was like "C", so we call that part the "C-like" language.
We had run across the "SIEVE" language quite some years ago, when there was an internet draft put out by cyrusoft. In its form at that time, SIEVE looked reasonable: it provided some control structures that described some simple ways to look at mail, without being in itself a full-blown programming language. It seemed the ideal thing to wrap a procedural language around: making a nice union of a language providing control flow and complex evaluation and one providing basic mail handling syntax. There's been a bit of a cloud there, though: SIEVE was eventually codified into an RFC, and recently a lot of work has been going on to propose extensions to it in various ways. Some of the extensions make the integration into an enclosing language more difficult. However, one does not have to accept all extensions, and indeed some of the extensions make a lot of sense.
At any rate, we combined a C-like procedural language and a SIEVE-like mail filtering language, and called it "MFL." To get anything out of MFL, you need to use the SIEVE parts of the language-- and in fact the SIEVE parts can be used without using any C-like syntax. So we'll start there.
MFL is like SIEVE
The basic SIEVE definition is set out in RFC 5228, superseding RFC 3028 (see also the related reading area). It is a control language that allows you to perform tests on parts of a mail message, and take actions that dispose of the mail message in various ways.
Because MFL combines a C-like syntax and a SIEVE syntax, all SIEVE language elements must be enclosed in a "sieve" block, which is the keyword sieve followed by a code block enclosed in curly braces. (Each utility using MFL may offer exceptions to this; for example the mvmda mail delivery agent can be instructed to assume that the script starts out in sieve mode.) A sieve block can appear anywhere in MFL that a C-like statement or an expression term can appear. (SIEVE constructs always return a value, even if that value is simply a completion status.)
Sieve statements fall into three broad categories: control, test, and action. A control statement affects the flow of control (e.g. by evaluating a test statement and conditionally executing other statements as a result). A test statement tests a condition, and an action statement performs some function such as saving mail into a mailbox. Any of these sorts of statements can be used as a SIEVE element, or SIEVE statements can be combined into a SIEVE program section.
For example, the following is a section of SIEVE code (enclosed, as normally required, in a "sieve" block):
    sieve {
        if header :is "From" "big@boss.com" {
            discard;
        } else {
            keep;
        }
    }

Whereas the following illustrates how a SIEVE element can be used as part of a C-like expression:

    int score;
    /* Assign a big score for this */
    score += 64 * sieve { header :is "from" "big@boss.com" };
    if (score > 500)
        sieve { discard; }
    else
        sieve { keep; }
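The idea that a Sieve test yields a value usable in arithmetic can be modelled outside MFL. The following is a minimal Python sketch, not MFL and not how mvmf is implemented: `header_is` and `score_message` are hypothetical helper names, and `header_is` only approximates `header :is` under the default i;ascii-casemap comparator.

```python
from email.message import EmailMessage

def header_is(msg, name, value):
    """Approximate Sieve's `header :is`: true if any header called `name`
    equals `value`, ignoring case (an approximation of i;ascii-casemap)."""
    return any(str(v).strip().lower() == value.lower()
               for v in msg.get_all(name, []))

def score_message(msg, score=0):
    # A boolean counts as 0 or 1 in arithmetic, mirroring `64 * sieve { ... }`.
    score += 64 * header_is(msg, "from", "big@boss.com")
    return score

msg = EmailMessage()
msg["From"] = "big@boss.com"
print(score_message(msg))              # 64
print(score_message(msg, 450) > 500)   # True: the script above would discard
```

The test result is multiplied like any other integer, which is the same pattern the MFL example uses before choosing between discard and keep.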
Statement | Type | Status | Comments |
---|---|---|---|
RFC: rfc5228 (the fundamental SIEVE spec) | |||
address | Test | Complete | |
allof | Test | Complete | |
anyof | Test | Complete | |
discard | Action | Complete | |
else | Control | Complete | |
elsif | Control | Complete | |
envelope | Test | Complete | Requires capability "envelope" |
exists | Test | Complete | |
false | Test | Complete | |
fileinto | Action | Complete | Requires capability "fileinto"; Also see "MV Extensions" note below. |
header | Test | Complete | |
if | Control | Complete | |
keep | Action | Complete | |
not | Test | Complete | |
redirect | Action | Complete | |
reject | Action | Complete | Requires capability "reject" |
require | Action | Complete | |
size | Test | Complete | |
stop | Action | Complete | |
text | lexical | Complete | multi-line text literal using the keyword "text"; |
true | Test | Complete | |
# | lexical | Complete | See "Misc Notes" section. |
RFC: rfc3431 (SIEVE extension: relational tests) | |||
:count | Tagged option | Complete | |
:value | Tagged option | Complete | |
[capability] | These elements require capability "relational" | ||
RFC: rfc3598 (SIEVE extension: subaddress) | |||
:detail | Tagged option | See notes | |
:user | Tagged option | See notes | |
[capability] | Requires capability "subaddress" | ||
[notes] | Was formerly Internet Draft draft-murchison-sieve-subaddress. This RFC identifies a useful function: to be able to isolate the base recipient name from the extension part for mail systems such as qmail which allow extension addresses. Will probably be implemented in mvmf, no timeframe. | ||
RFC: rfc3685 (SIEVE extension: spamtest and virustest) | |||
spamtest | Test | Unsupported | |
virustest | Test | Unsupported | |
[capability] | Requires capability "spamtest". Also requires that capability "relational" be enabled. | ||
[notes] | Was formerly Internet Draft draft-daboo-sieve-spamtest. This RFC specifies a couple of tests against any spam and virus analysis that may have been applied and normalized into simple status information by the underlying SIEVE implementation. We have other more fine-grained assessment in mind for mvmf, and so are not going to implement this any time soon, if at all. | ||
RFC: rfc3894 (SIEVE extension: copying without side effects) | |||
:copy | Tagged option | Complete | |
[capability] | Requires capability "copy" | ||
[notes] | Was formerly Internet Draft draft-degener-sieve-copy | ||
Draft: draft-degener-sieve-editheader | |||
addheader | Action | Complete | |
deleteheader | Action | Complete | |
replaceheader | Action | Complete | |
:index | Tagged option | Complete | |
:last | Tagged option | Complete | |
:newname | Tagged option | Complete | |
:newvalue | Tagged option | Complete | |
[capability] | These elements require capability "editheader" | ||
[variables] | See the notes in the section about the Sieve "variables" extension. | ||
[draft] | The specification is still a draft and as such is subject to change or removal. | ||
[new] | Note: new version draft-degener-sieve-editheader-01 has a couple of minor changes that we have not yet incorporated (nor necessarily agree with) | ||
Draft: draft-ietf-sieve-imapflags-00.txt | |||
[thoughts] | We're interested in watching this, as it would be useful to manipulate IMAP flags, however this draft still needs shaping up. No details to report here. | ||
Draft: draft-murchison-sieve-regex | |||
:regex | Tagged option | Complete | |
[capability] | Requires capability "regex" | ||
[notes] | Matched subparts are available via the C-like language elements. | ||
[draft] | The specification is still a draft and as such is subject to change or removal. | ||
Draft: draft-ietf-sieve-refuse-reject | |||
refuse | Action | Unsupported | |
[capability] | Requires capability "refuse" | ||
[notes] | Specifies an action "refuse" that will refuse an email message at SMTP-time, rather than trying to reject or discard it later. Since mvmda doesn't yet operate at SMTP-time, we don't have any support for this. However it's definitely worth implementing should mvmda be hooked into an SMTP engine in any way. | ||
[draft] | The specification is still a draft and as such is subject to change or removal. | ||
Draft: draft-ietf-sieve-variables | |||
set | Action | See notes | |
setdate | Action | See notes | |
string | Test | See notes | |
:length | Tagged option | See notes | |
:lower | Tagged option | See notes | |
:lowerfirst | Tagged option | See notes | |
:upper | Tagged option | See notes | |
:upperfirst | Tagged option | See notes | |
[capability] | Requires capability "variables" | ||
[thoughts] | We have mixed feelings about this. It provides a reasonable facility for the SIEVE language, but MFL already provides much more powerful access to variables. On the other hand, we could implement it fairly easily, so we may give it a go. Not to mention that I'd like to have the "string" test command (even if I don't like its name). | ||
[draft] | The specification is still a draft and as such is subject to change or removal. | ||
Draft: draft-ietf-sieve-vacation | |||
vacation | Action | Complete | |
:addresses | Tagged option | Complete | |
:days | Tagged option | Complete | |
:from | Tagged option | Complete | |
:handle | Tagged option | Complete | |
:mime | Tagged option | Complete | |
:subject | Tagged option | Complete | |
[capability] | Capability "vacation" required | ||
[notes] | This implementation requires the :from option, as it does not want to guess the email address of the script owner. The draft specifies that if :handle is omitted, one be synthesized from the amalgamation of other options. This implementation does not do that; it assumes a handle of "default" if one is not given. | ||
[draft] | The specification is still a draft and as such is subject to change or removal. (Though that's not likely.) | ||
Draft: draft-degener-sieve-body | |||
[capability] | Capability "body" required | ||
[thoughts] | We may implement this, but have other ideas on the matter that fit into the MFL framework a little tighter. However, this may be a useful first step in its proposed form. | ||
[draft] | The specification is still a draft and as such is subject to change or removal. | ||
MV Extension: C | |||
C | lexical | Complete | Introduces a block of C-like code, which must be enclosed in curly braces. This is not a Sieve extension, as it is part of MFL, and does not need to be enabled via a Sieve "require" statement. |
MV Extension: dnsbl | |||
dnsbl | Test | Complete | Requires capability "vnd.mvmf.dnsbl" . See notes below. |
:ip | Tagged option | Complete | Specify an IP address to be tested, overriding the default. |
MV Extensions in general | |||
See below for notes about MV extensions. | |||
MV Extension: sieve | |||
sieve | lexical | Complete | Introduces a block of sieve code, which must be enclosed in curly braces. This allows a script writer to write code that is guaranteed to be in "sieve" mode without having to know the encompassing context. Useful, for example, for script code that is meant to be included (via @include) by some other script. This is not a Sieve extension, as it is part of MFL, and does not need to be enabled via a Sieve "require" statement. |
Misc Notes | |||
:comparator | "i;ascii-casemap" and "i;octet" are complete; i;ascii-numeric is also implemented but is not appropriate in all cases. This comparator requires capability "comparator-i;ascii-numeric". i;ascii-casemap is the default. | ||
# comments | As of the 20050825 release, MFL supports the "#"-style end of line comment. This can conflict somewhat with MFL's preprocessor statements if you enable the '#' character as a preprocessor introducer character. However, also as of the 20050825 release the default preprocessor introducer character has been changed to '@' which does not conflict with this style of comment. Note that MFL also supports the C-like "//" syntax to begin a comment to the end of the line, as well as "/*..*/" bracketed comments. |
    sieve {
        if header "to" "user@example.com" {
            C { to_me = 1; }
            keep;
        }
    }

This is not a Sieve extension, it is simply part of the MFL implementation of Sieve as an embedded language, and thus is not enabled via a Sieve "require" statement.
dnsbl [:ip <ipaddr: string>] <blnames: string-list> <result-codes: string-list>
With the :ip option, the given IP address ipaddr is tested against the specified DNSBLs. No mail message need be open to use this form.
With no :ip option, the dnsbl statement tests each responsible IP address (each IP address that is believed to be responsible for transporting the message to the local server) against the specified DNSBLs.
The statement returns true if the IP address was found in one of the DNSBLs, and false if it was not.
Note: a list of responsible IP addresses is maintained by any application that includes and supports this language construct. Your MFL code may call a built-in-function "$msg_rip_add()" to add an IP address to this list. This would normally happen when the application calls a specifically-named MFL function, i.e. a "hook," at a relevant point in its processing. For example, the mvmda (Mail Delivery Agent) calls a hook when it has opened and scanned the incoming message. See each application's documentation for descriptions of any hooks supported.
DNSBL blacklist names and result code types are registered in a system-wide file dnsbl.conf normally located in /usr/local/share/mvmf . The blacklist name identifies a domain name suffix to be used for DNSBL lookups, and a result code is a mnemonic name for a result returned by that DNSBL. Generally all blnames have a result code "std" defined as their standard result. Some DNSBLs have various results indicating various things. Result code "*" will match any result returned by the DNSBL lookup.
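As background, a DNSBL lookup for an IPv4 address is conventionally made by reversing the address octets and appending the blacklist's domain suffix; the answer, if any, is an address that a mnemonic result code can stand for. The Python sketch below models that under stated assumptions: the `REGISTRY` dictionary, the `examplebl` name, its suffix, and its "std" value are hypothetical stand-ins, since the actual dnsbl.conf format is not shown in these notes.

```python
import ipaddress

# Hypothetical registry standing in for dnsbl.conf (real format not shown here).
REGISTRY = {
    "examplebl": {"suffix": "bl.example.org", "codes": {"std": "127.0.0.2"}},
}

def query_name(ip, suffix):
    """Build the DNS name to look up for an IPv4 address: the octets in
    reverse order, followed by the blacklist's domain suffix."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + suffix

def result_matches(blname, code, answer):
    """`answer` is the address returned by the lookup, or None when the IP
    is not listed. Result code "*" matches any answer."""
    if answer is None:
        return False
    if code == "*":
        return True
    return REGISTRY[blname]["codes"].get(code) == answer

print(query_name("192.0.2.99", REGISTRY["examplebl"]["suffix"]))
```

No DNS query is made here; a real implementation would resolve the generated name and pass the answer to something like `result_matches`.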
The code section:
    sieve {
        if dnsbl ["spamcop", "njabl"] "std" {
            discard;
            stop;
        }
    }

tests all responsible IP addresses against the standard result codes of both the "spamcop" and "njabl" DNSBLs, discarding the message if one of the IP addresses is listed.
This code:
    int flag;
    flag = sieve { dnsbl :ip "127.0.0.2" "spamhaus" "sbl" };

sets the variable "flag" depending on whether the specific IP address 127.0.0.2 is found in the "spamhaus" sbl DNSBL, as does this code:

    string prefix = "127.";
    int flag = sieve { dnsbl :ip [ prefix + "0.0.2" ] "spamhaus" "sbl" };
The "dnsbl" capability must be enabled via SIEVE's "require" statement in order to use this statement, using a capability name of "vnd.mvmf.dnsbl" . Earlier MFL implementations used a capability name of "dnsbl" instead of the more proper vendor-specific name. When you configure and build the mvmf package you can still choose to support the old capability name as an alias.
    // Assumes that this has been executed somewhere in admin mode:
    $admin_int_set( "pipe_allow", 1 );
    . . .
    sieve {
        fileinto "|process-report";
    }
MFL also has an interface to system-defined plugins using the $cusp_ family of built-in functions. The CUSP interface is intended for helper applications that have a more complex interface than simply piping a message into an external program. See, for example, the clamdif interface to a clamav anti-virus daemon.
    string my_other_domain;
    my_other_domain = "example.com";
    sieve {
        if not address :domain "To" [ my_other_domain ] {
            keep;
        } else {
            redirect [ (string)"myself@" + my_other_domain ];
        }
    }
What's the syntax conflict that requires that expressions only be used inside square brackets? The problem comes down to the fact that in SIEVE statements, terms outside of square brackets are separated by spaces, while terms inside of square brackets are separated by commas. Imagine allowing expressions anywhere, such as in this potential case:
    string header_name = "subject";
    sieve {
        if header :contains header_name ["ADV"] {
            discard;
        }
    }

If the parser is allowed to look for an expression outside of a string list (i.e., outside of square brackets), it can easily think that

    header_name [ "ADV" ]

follows the syntax of an array reference. While this simple case might seem easy to resolve, more complex cases are not. Fortunately, terms inside of string lists are separated by commas, removing that kind of ambiguity there. (Use of commas in that part of the SIEVE language seems a mite inconsistent, but I'm not complaining.)
MFL knows about the MIME structure of messages, and has the concept of a "current message part." All header tests are done in the context of this current message part. In the default state, the top-level message part (i.e., the message headers) are selected. MFL scripts may select other message parts (e.g. the children of a multipart message part). Let's say you have a message whose top level content type is "multipart/alternative" with two children, one with content type "text/plain" and the next with "text/html". Consider these three statements:
    /* A */ sieve { header :matches "content-type" "multipart/*" }
    /* B */ sieve { header :matches "content-type" "text/plain*" }
    /* C */ sieve { header :matches "content-type" "text/html*" }

With the top (default) message part selected, only statement A evaluates to TRUE. With the first child selected, only B is true, and with the second child selected, only C is true.
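The effect of the "current message part" on header tests can be reproduced with Python's standard `email` package. This is only a model of the behaviour described above, not MFL: `header_matches` and `true_statements` are hypothetical helpers, and `fnmatchcase` is used as a rough stand-in for Sieve's `:matches` wildcards.

```python
from email.message import EmailMessage
from fnmatch import fnmatchcase

def header_matches(part, name, pattern):
    """Rough stand-in for `header :matches`: case-insensitive glob match
    against the named header of the selected message part only."""
    value = part.get(name, "")
    return fnmatchcase(str(value).lower(), pattern.lower())

# Build a multipart/alternative message with text/plain and text/html children.
msg = EmailMessage()
msg.set_content("plain body")
msg.add_alternative("<p>html body</p>", subtype="html")

top = msg
first, second = list(msg.iter_parts())
PATTERNS = {"A": "multipart/*", "B": "text/plain*", "C": "text/html*"}

def true_statements(part):
    """Return the labels of the statements that hold for the selected part."""
    return [label for label, pattern in PATTERNS.items()
            if header_matches(part, "content-type", pattern)]

print(true_statements(top), true_statements(first), true_statements(second))
```

Each call looks only at the headers of the part passed in, which is the role the current message part plays in MFL.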
MFL is like C
MFL's enveloping language for procedural and logic flow is C-like in nature. (We won't explain "C" here, but if you are reading this far you probably either know it or can find out about it.) We say "C-like" because it gets its data typing and control flow from C, but it doesn't implement a full C language.
What's in MFL's C-like component: fundamental and compound data types, expressions, control flow statements, variables, initializers, functions, and a cpp-like preprocessor.
What's not: switch statement (and case labels), function prototypes (except for function definitions), and local variables inside any compound block (including functions).
Oddities: MFL C-like variables may contain "$". Thus "$a" and "a$" are legal variable names. Functions supplied as part of mvmf will always name variables and functions starting with '$' -- other script writers should avoid doing that.
Thing | Status | Comments |
---|---|---|
Fundamental data types and modifiers | ||
unsigned | Supported | May be used as a modifier to an integer type, or by itself as an abbreviation for unsigned int |
short | Supported | May be used as a modifier to int, or by itself as an abbreviation for short int |
long | Supported | May be used as a modifier to int, or by itself as an abbreviation for long int |
char | Supported | A 1-byte value |
int | Supported | A natural integer (currently 2-byte value) |
$int4$ | Supported | MFL extension to guarantee 4-byte int |
(short int is 2 bytes; long int is 4 bytes.) | ||
float | Supported | Floating point number |
double | Supported | Double precision floating point number |
string | Supported | MFL extension for character strings. |
Aggregates and metatypes | ||
typedef | Supported | Defines a new type in terms of another type definition |
struct | Supported | A data structure |
union | Supported | A data overlay |
enum | Supported | Enumerated types. See note "E". |
[] | Supported | Arrays. |
* | Supported | Pointers. See note "P". |
Control statements | ||
break | Supported | Exit loop. |
continue | Supported | Next loop iteration. |
do | Supported | Loop control. |
if | Supported | Conditional execution |
else | Supported | (implemented as part of "if") |
for | Supported | Loop control |
return | Supported | Return from a function (with optional return value) |
switch..case | Not supported | Value detection -- no plan to support this |
while | Supported | Loop control |
pv$ | Supported | MFL extension to print to stdout. See note "PV". |
[built-in functions] | Supported | Described here |
[MFL functions] | Supported | User-written functions (see below) |
Expressions and evaluation | ||
C | Supported | MFL extension to introduce a C-like code block, which must be enclosed in curly braces. Useful to guarantee that a script is in C-like mode, e.g. for a code snippet that is intended to be included by another script. |
sizeof | Supported | Returns number of bytes of a variable, storage element, type, or expression. See note "SO". |
sieve | Supported | MFL extension to introduce a SIEVE code block, which must be enclosed in curly braces. |
( ) | Supported | Parenthetical grouping for explicit precedence |
? : | Supported | Conditional expression: test ? truth : falsth |
! | Supported | "!" Operator (boolean not) |
~ | Supported | "~" Operator (bitwise complement) |
, | Supported | "," Operator (return second of two expressions) |
= | Supported | "=" Operator (assignment) |
== | Supported | "==" Operator (compare equal) |
==^ | Supported | "==^" Operator (string compare equal, ignore case) See note "S". |
!= | Supported | "!=" Operator (compare not equal) |
!=^ | Supported | "!=^" Operator (string compare not equal, ignore case) See note "S". |
=. | Supported | "=." Operator (regex matching, pattern on RHS) See note "S". |
!=. | Supported | "!=." Operator (regex non-matching, pattern on RHS) See note "S". |
=? | Supported | "=?" Operator (glob-style matching, pattern on RHS) See note "S". |
=?^ | Supported | "=?^" Operator (glob-style matching, ignore case, pattern on RHS) See note "S". |
!=? | Supported | "!=?" Operator (glob-style non-matching, pattern on RHS) See note "S". |
!=?^ | Supported | "!=?^" Operator (glob-style non-matching, ignore case, pattern on RHS) See note "S". |
< | Supported | "<" Operator (compare less than) |
<^ | Supported | "<^" Operator (string compare less than, ignore case) See note "S". |
<= | Supported | "<=" Operator (compare less than or equal) |
<=^ | Supported | "<=^" Operator (string compare less than or equal, ignore case) See note "S". |
<< | Supported | "<<" Operator (shift left) |
<<= | Supported | "<<=" Assignment operator (shift left) |
> | Supported | ">" Operator (compare greater than) |
>^ | Supported | ">^" Operator (string compare greater than, ignore case) See note "S". |
>= | Supported | ">=" Operator (compare greater than or equal) |
>=^ | Supported | ">=^" Operator (string compare greater than or equal, ignore case) See note "S". |
>> | Supported | ">>" Operator (shift right) |
>>= | Supported | ">>=" Assignment operator (shift right) |
+ | Supported | "+" Operator (add) |
+= | Supported | "+=" Assignment operator (add) |
++ | Supported | "++" Operator (increment) |
- | Supported | "-" Operator (subtract) |
-= | Supported | "-=" Assignment operator (subtract) |
-- | Supported | "--" Operator (decrement) |
* | Supported | "*" Prefix operator (pointer dereference) |
* | Supported | "*" Infix operator (multiply) |
*= | Supported | "*=" Assignment operator (multiply) |
/ | Supported | "/" Operator (divide) |
/= | Supported | "/=" Assignment operator (divide) |
% | Supported | "%" Operator (modulo) |
%= | Supported | "%=" Assignment operator (modulo) |
[ | Supported | "[" Operator(kinda) (array reference) |
& | Supported | "&" infix operator (bitwise AND) |
& | Supported | "&" Prefix operator (address-of) |
&& | Supported | "&&" Operator (boolean AND) |
&= | Supported | "&=" Assignment operator (bitwise AND) |
| | Supported | "|" Operator (bitwise OR) |
|| | Supported | "||" Operator (boolean OR) |
|= | Supported | "|=" Assignment operator (bitwise OR) |
. | Supported | "." Operator (member reference) |
-> | Supported | "->" Operator (member reference) |
Preprocessor | ||
MFL sports a basic cpp-like preprocessor; this section lists the preprocessor elements you might expect. The preprocessor is conceptually responsible for removing comments and interpreting preprocessor directives. Directives are indicated in a script by using '@' as the first character on the line (i.e., in the first column). (The use of '#', as with C, conflicts with the Sieve-mandated comment characters. Nevertheless when you configure mvmf you can enable the use of '#' instead of or in addition to the '@' character.) There are more elaborate notes about the use of the preprocessor later in this document. | ||
@define | Supported | Defines a preprocessor constant or macro. |
@else | Supported | Starts the "else" part of a preprocessor conditional |
@endif | Supported | Ends a preprocessor conditional block |
@help | Supported | Prints the supported preprocessor statements (useful only in interactive mode) |
@ifdef | Supported | Begins a conditional block that is executed if a preprocessor symbol is defined. |
@ifndef | Supported | Begins a conditional block that is executed if a preprocessor symbol is not defined. |
@include | Supported | Includes the contents of another MFL file at this point in the compilation/interpretation |
/*..*/ | Supported | Block comment |
// | Supported | Comment to end of line |
Preprocessor extensions | ||
MFL has some other preprocessor directives. | ||
@ifdef_func | Supported | Begins a conditional block that is executed if an mfl function is defined. |
@ifdef_var | Supported | Begins a conditional block that is executed if an mfl variable is defined. |
@ifndef_func | Supported | Begins a conditional block that is executed if an mfl function is not defined. |
@ifndef_var | Supported | Begins a conditional block that is executed if an mfl variable is not defined. |
@include_noerr | Supported | Like @include, but doesn't complain if the file is not available. Useful for loading control files that don't have to exist. |
Note E: A specific assignment to an enum member definition is not supported, e.g.:

    enum { aa, bb=3, cc }

does not work in MFL.
Note P: Pointers are supported inasmuch as you can point to some other data storage defined in an MFL program. Pointers are constrained at run-time only to reference a particular data object.
Note PV: pv$ is basically a hack to allow debugging printouts. You can print a single value, e.g.:

    int x;
    pv$ x;

or you can print a printf-like format string and a single argument, e.g.:

    int x;
    x = 23;
    pv$ "x is %d\n", x;
Note S: These string comparison operators are MFL extensions.
Note SO: The MFL interpreter does late type binding and late evaluation; there is currently no way for the interpreter to figure out the type of an expression without evaluating it. sizeof can give you the size of an expression, but note well that the expression will be evaluated in the process. E.g. in:

    int x = 0;
    int sx;
    sx = sizeof( x = 3 );

sx will be the size of the expression (an int), and x will be set to 3!
    int x = 3;
    int y = {3};

instantiate x and y and set both values to 3. (It's an inconsistency of C syntax (so we follow it) that scalar initializers can optionally be enclosed in braces, yet initializers for scalars within aggregates can not.)
    struct {
        int key;
        string val;
    } kvt[3] = {
        { 10, "key 1" },
        { 20, "key 2" },
        { 30, "key 3" }
    };
    int i;
    i = { 3 + 7 };     // This is wrong
    i = { 3 + 7; };    // This is correct

The mvmf application may also be configured (when it is compiled) to allow the use of some native C-like statements as expression terms. These statements include do, for, if, pv$, and while. sizeof is always available as a term, while break, return, and continue never are. As with statements inside of compound blocks, the statement as an expression term must still be fully-formed, which can result in some odd-looking code, as in this contrived example:

    int i;
    i = if ( foo() ) 3; else 4; ;         // looks odd
    i = if ( foo() ) {3;} else {4;} ;     // perhaps better.
Strings are implemented using something called a refstr, which is a view of a referenced string. Multiple views to common strings may be obtained via string pointers (i.e. (string *)) which can be dereferenced to access their reference target. When a string is modified, any views into the underlying string object are modified to reflect the change. For example, consider this MFL code sequence:
    string s = "I am a test";
    string *sp = $str_sub(s, 7, 4);     // points to "test"
    string *s1P = $str_sub(s, 2, 2);    // points to "am"
    *s1P = "used to be";

The string s is now

    "I used to be a test"

and the string pointer sp still points to

    "test"
Every string, including the targets of string pointers, has the following attributes:
    string s = "hello there";
    string *sP;
    sP = $str_sub(s, 4, 4);    // "o th"
    $str_bx_set(*sP,3);        // Sets to 3, the 'h' position
    $str_bx(s);                // will initially be 0
    $str_bx(*sP);              // is still 3.
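The view behaviour described earlier, where assigning through one string pointer changes the underlying string while other views keep pointing at the same text, can be illustrated with a toy model. The Python sketch below is not the real refstr implementation: the `RefStr` and `View` classes are hypothetical, and the rule for moving other views (only views that start at or after the replaced range are shifted) is an assumption.

```python
class RefStr:
    """Toy model of a string that hands out live views onto itself."""
    def __init__(self, text):
        self.text = text
        self.views = []

    def sub(self, offset, length):
        view = View(self, offset, length)
        self.views.append(view)
        return view

class View:
    def __init__(self, owner, offset, length):
        self.owner, self.offset, self.length = owner, offset, length

    def get(self):
        return self.owner.text[self.offset:self.offset + self.length]

    def set(self, new):
        owner, end = self.owner, self.offset + self.length
        owner.text = owner.text[:self.offset] + new + owner.text[end:]
        delta = len(new) - self.length
        for view in owner.views:
            # Assumption: only views starting at or after the replaced range
            # move; overlapping views are not handled in this sketch.
            if view is not self and view.offset >= end:
                view.offset += delta
        self.length = len(new)

s = RefStr("I am a test")
sp = s.sub(7, 4)     # views "test"
s1p = s.sub(2, 2)    # views "am"
s1p.set("used to be")
print(s.text)        # I used to be a test
print(sp.get())      # test
```

Storing offsets rather than copies is what lets `sp` keep viewing "test" after the text in front of it grows.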
Some operations on strings:
    string s = "hello";
    if ( s =. "h.*o" )
        some code here;

the test would succeed. '!=.' is the negated version of the test.
Notes about string pointers:
    string s = "abcdefgh";
    string *sp;
    sp = $str_sub( s, 3, 2 );    // Now points to "de"
    ++sp;                        // Now points to "ef"
    sp - 4;                      // points to "ab"
    sp - 5;                      // points to "a"
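One way to read the comments in the pointer arithmetic example is that a string pointer is a fixed-length window whose offset moves, with the visible text clipped to the bounds of the underlying string. The Python sketch below encodes that reading; it is an inference from the example's comments, not a statement of how MFL is implemented, and `window` is a hypothetical helper.

```python
def window(s, offset, length):
    """Return the part of s covered by a window of `length` characters
    starting at `offset`, clipped to the bounds of s."""
    start = max(0, offset)
    end = max(start, min(len(s), offset + length))
    return s[start:end]

s = "abcdefgh"
print(window(s, 3, 2))      # de  (sp = $str_sub(s, 3, 2))
print(window(s, 4, 2))      # ef  (++sp)
print(window(s, 4 - 4, 2))  # ab  (sp - 4)
print(window(s, 4 - 5, 2))  # a   (sp - 5, clipped at the start of s)
```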
Note that a string literal is not a string until it is coerced into one. Since those kinds of coercions happen automatically in many places you might not notice the need for it. But be aware that:

    "abcdefg" + 3

is essentially an array reference to the character at offset 3 of the character array (not a string!) "abcdefg", while

    (string)"abcdefg" + 3

evaluates to the string "abcdefg3" since the first term is coerced via the typecast.
An MFL function has a C-like syntax, with a declaration of a return type, a formal argument list, and a function body. One quirk of MFL functions is that a function is treated syntactically like a variable declaration, one side-effect of which is that it has to be terminated with a semicolon (or a comma and another declaration using the same type). An MFL function therefore looks something like this:
    /* Recursive function to return digits of an integer separated by spaces */
    string dp( int n ) {
        if ( n < 10 )
            return (string)n;
        return dp( n/10 ) + " " + (string)(n%10);
    };

Using the above, dp(25821) returns the string "2 5 8 2 1".
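For readers who want to check the recursion outside MFL, here is the same function transcribed into Python. It is a direct translation for illustration only: `//` stands in for the integer division that `n/10` performs on int operands, and `str()` stands in for the `(string)` cast.

```python
def dp(n):
    """Return the decimal digits of a non-negative integer, separated by spaces."""
    if n < 10:
        return str(n)
    return dp(n // 10) + " " + str(n % 10)

print(dp(25821))  # 2 5 8 2 1
```

Each call peels off the last digit and recurses on the rest, so the digits come out in their original order.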
Comment removal. Conceptually, the preprocessor removes comments from the script before it is interpreted. There are two styles of comments: block comments and rest-of-line comments.
A block comment begins with the pair of characters /* and ends with the pair */. Comments do not nest: once a comment block is opened with /* the next */ closes it, even if another /* is encountered first.
A rest-of-line comment begins with the pair of characters // or with the single character # and ends at the end of the line. This is useful for annotating a single statement.
The following illustrates both kinds of comments:
    /* Basic SIEVE setup */
    sieve { require ["fileinto", "envelope"]; }
    int score = 0;    // Declare and initialize score
    float f;          # A temporary
    /* Now look for a special "X-Spam-Score" header and adjust our integer
       score according to the floating point value found there. */
    if ( sieve { header :matches "X-Spam-Score" "*" } ) {
        f = $str_match(0);    // pick up the score value
        if ( ( f >= 9.0 ) && ( f <= 9.9 ) )
            ++score;          // Significant value bumps score.
    }

As you can see: block comments can span lines, rest-of-line comments continue only to the end of the line, and (sometimes) too many comments can obscure the meaning rather than amplify it. (Unless we are illustrating how comments are implemented, of course.)
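The comment rules above (block comments that do not nest, plus `//` and `#` rest-of-line comments) can be sketched as a small stripper. This Python version is a simplified model and not the mvmf preprocessor: `strip_comments` is a hypothetical name, quoted strings are preserved so that comment characters inside them survive, and it ignores the case where `#` is configured as a preprocessor introducer.

```python
import re

# One alternation: a quoted string, a block comment, or a rest-of-line comment.
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|/\*.*?\*/|//[^\n]*|#[^\n]*', re.DOTALL)

def strip_comments(src):
    def repl(match):
        tok = match.group(0)
        if tok.startswith('"'):
            return tok                  # keep string literals untouched
        # A block comment becomes one space; a line comment simply vanishes.
        return " " if tok.startswith("/*") else ""
    return _TOKEN.sub(repl, src)

# The first */ closes the comment, so the trailing "b */ c" is left behind.
print(strip_comments("a /* x /* y */ b */ c").split())
```

The non-greedy `.*?` is what makes the first `*/` end the comment, matching the no-nesting rule.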
Macro substitution. The preprocessor has its own symbol table. Symbols in this table are variously thought of as preprocessor symbols, macro names, or manifest constants. The combination of a symbol and its value may be thought of as a macro. Whenever the preprocessor encounters one of these symbols in the input stream (e.g., in the script), the value that has been assigned to that symbol is used instead of the actual symbol. This is known as macro substitution or macro expansion. A macro is created via the "@define" preprocessor directive, described below.
There are two kinds of macros: those with arguments and those without. Actual arguments to a macro are supplied in a parenthesized list, with arguments separated by commas. (Depending on the way MFL is built, whitespace may or may not be allowed between the macro name and the opening parenthesis. To be safe, don't use whitespace here: follow the macro name immediately by an opening parenthesis.)
A simple example of macro definition and substitution:
    @define ALTADDR "fred@example.com"
    sieve { redirect ALTADDR; }
Macro substitution for ALTADDR occurs before the "redirect" statement is parsed: that statement is parsed exactly as if it were written:
    redirect "fred@example.com";
A macro with arguments:
    @define aab(a,b) (a+a+b)
    int i = aab(3,4);
The preprocessor turns that into:
    int i = (3 + 3 + 4);
initializing variable i with a value of 10.
Macro references cannot be recursive. While a macro is being expanded, it is prevented from further expansion until its value is completely substituted. (It's said that the macro is "painted blue" while it is ineligible for expansion.) This prevents a macro value from referring to the macro name itself, or to the name of another macro being expanded. Note also that macro substitution occurs on a token-by-token basis. Since a quoted string is an individual token, any macro names inside a quoted string are not substituted.
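The "painted blue" rule and the token-by-token behaviour can be illustrated with a small model. This is a hedged Python sketch rather than the mvmf implementation: tokenize, define, expand and expand_text are hypothetical names, the tokenizer is deliberately crude, and expanding arguments before they are substituted into the body is an assumption borrowed from how C preprocessors usually work.

```python
import re

# A quoted string is one token, so macro names inside it are never seen.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\w+|\s+|.', re.S)

def tokenize(text):
    return TOKEN.findall(text)

def define(macros, name, body, params=None):
    """Record a macro; params is None for a macro without arguments."""
    macros[name] = (params, tokenize(body))

def strip_ws(tokens):
    while tokens and tokens[0].isspace():
        tokens = tokens[1:]
    while tokens and tokens[-1].isspace():
        tokens = tokens[:-1]
    return tokens

def read_args(tokens, i):
    """Collect comma-separated argument token lists up to the matching ')'."""
    args, current, depth = [], [], 0
    while i < len(tokens):
        tok = tokens[i]
        i += 1
        if tok == ")" and depth == 0:
            args.append(current)
            return args, i
        if tok == "," and depth == 0:
            args.append(current)
            current = []
            continue
        depth += (tok == "(") - (tok == ")")
        current.append(tok)
    raise ValueError("unterminated macro argument list")

def expand(tokens, macros, blue=frozenset()):
    """Expand macros; names in `blue` are being expanded and are skipped."""
    out, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        i += 1
        if tok not in macros or tok in blue:
            out.append(tok)
            continue
        params, body = macros[tok]
        if params is not None:
            # The '(' must follow the macro name immediately.
            if i >= len(tokens) or tokens[i] != "(":
                out.append(tok)
                continue
            args, i = read_args(tokens, i + 1)
            args = [expand(strip_ws(a), macros, blue) for a in args]
            binding = dict(zip(params, args))
            body = [t for b in body for t in binding.get(b, [b])]
        # Paint this macro blue while its value is rescanned.
        out.extend(expand(body, macros, blue | {tok}))
    return out

def expand_text(text, macros):
    return "".join(expand(tokenize(text), macros))
```

With a macro LOOP defined as LOOP + 1, expand_text("LOOP", macros) yields LOOP + 1: the inner LOOP is blue during the rescan, so the expansion stops instead of recursing.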
Preprocessor directives. The preprocessor is otherwise commanded via preprocessor directives. A preprocessor directive is indicated by an @ character at the first character position on a line, followed by a recognized preprocessor command. For compatibility with older mvmf releases, when you build and install mvmf, you can choose to enable '#' as an alternative preprocessor introducer character, in place of or in addition to '@'. Note that the indentation in various examples is for clarity only: the @ (or '#') must occur at column 1. Whitespace may occur between the @ and the command, and is in fact encouraged as a way to indicate a nesting level. However, when one talks about a preprocessor directive, it is usually by the concatenation of the @ and the command name.
Preprocessor directives:
    @define NAME value
defines a macro, so that whenever NAME is encountered in the script after this, the value is used instead. Formal arguments may also be given by including them in parentheses directly following the macro name. Each occurrence of a formal argument in the macro body will be replaced by the actual argument when the macro is invoked. For example:
    @define RS(rcpt) sieve { redirect rcpt; }
    RS( "bozo@example.com" )
will cause this code to be used:
    sieve { redirect "bozo@example.com"; }
Macros can be used to stand in for commonly used strings or sequences, particularly for code that might be changed from time to time (thus you'd only have to change the macro definition rather than changing code in multiple places in your script). As with C guidelines, making your macro names uppercase gives a visual clue that macro names are being used.
    /* @define DROPMAIL */
    sieve {
    @ifdef DROPMAIL
        discard;
    @else
        keep;
    @endif
    }
i.e., if DROPMAIL is defined to the preprocessor, the discard statement is executed. Otherwise, the keep statement is executed. (As written, the @define is inside a comment, so DROPMAIL is not defined and keep is used.)
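One way to picture conditional directives is as a filter that decides, line by line, which parts of the script reach the parser. The sketch below is an illustrative Python model, not mvmf's code. The function name select_lines is hypothetical, it handles only the @ifdef, @else, @endif and @define directives mentioned in these notes, and it assumes the @ is in column 1.

```python
def select_lines(script, defined=()):
    """Return the lines that survive @ifdef/@else/@endif processing."""
    defined = set(defined)
    stack = []  # one entry per open @ifdef: True if that branch is selected
    kept = []
    for line in script.splitlines():
        # Whitespace is allowed between the @ and the command name.
        words = line[1:].split() if line.startswith("@") else []
        cmd = words[0] if words else None
        active = all(stack)  # every enclosing branch must be selected
        if cmd == "ifdef":
            stack.append(len(words) > 1 and words[1] in defined)
        elif cmd == "else":
            if not stack:
                raise ValueError("@else without @ifdef")
            stack[-1] = not stack[-1]
        elif cmd == "endif":
            if not stack:
                raise ValueError("@endif without @ifdef")
            stack.pop()
        elif cmd == "define":
            # Only the macro name matters here; its value is ignored.
            if active and len(words) > 1:
                defined.add(words[1].split("(")[0])
        elif active:
            kept.append(line)
    if stack:
        raise ValueError("missing @endif")
    return kept
```

Run on the DROPMAIL example, this keeps the keep; line when DROPMAIL is undefined and the discard; line when it is defined, either through the defined argument or by an earlier @define line.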
    @include "data.mfl"
inserts the contents of the file data.mfl located in the current directory (e.g., your home directory) or in the user-level include path, and
    @include <common.mfl>
inserts the contents of the file common.mfl that is found along the system-level include path.
Preprocessor statements are always acted on at parse time. You might have an elaborate MFL function that you store in its own file; using something like
    @include "bigfunction.mfl"
will always load and parse that file whether or not you ever need the function (assuming that this does not occur in a false preprocessor condition). If you only want to load the function when you know you are going to need it, you can include the file using a runtime parse and execute function, e.g.:
    sieve {
        if envelope :is "from" "monthly-report@example.com" {
            C {
                $mfl_exec_string( "@include \"bigfunction.mfl\"" );
                bigfunction();
            }
        }
    }
Misc notes
Depending on how it was compiled, each mvmf application may have the
capability of executing system-wide or user-level MFL scripts when it
starts. These can be used to define commonly-used functions, hook
functions that are called automatically by the application at certain
stages, variables, and so forth.
Each application may also call specially-named MFL functions at particular points in the utility's execution. You may supply these hook functions to affect some aspect of the application's operation at the point the hook is called.
Application-specific details such as these are described along with each utility that incorporates MFL.
Future plans and ideas
This section has been incorporated into the
To Do page.
Examples
Examples are primarily relevant to each utility; please see the
documentation for each mvmf application that uses MFL.