Get rid of dryrun.log: in -n mode, don't log at all. I don't think
anyone ever actually used the dryrun log, and if we get rid of it,
then minc -n becomes a better mdfrm.
(find_mh_folder): Don't know why, but the MAGIC_TO crap wasn't doing
the same kind of match as the other filters, and you couldn't use
parts of the match in the folder name. Fix.
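A rough illustration of "using parts of the match in the folder name", written in Python rather than minc's Perl. The function name, the example pattern, and the `\1` template syntax are assumptions made for the sketch; the entry above does not show minc's actual filter syntax.

```python
import re

def folder_for(header_value, pattern, folder_template):
    """Return the folder name if pattern matches header_value, else None."""
    # Case-insensitive search, as the other filters are described as doing.
    match = re.search(pattern, header_value, re.IGNORECASE)
    if match is None:
        return None
    # expand() substitutes \1, \2, ... with the captured groups.
    return match.expand(folder_template)

# Hypothetical address and filter: the captured word becomes part of the folder.
print(folder_for("epg+perl@example.org", r"epg\+(\w+)@", r"lists/\1"))
```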
($SCAN_P_FOLDER, $SCAN_P_MESSAGE, $SCAN_P_FROM): New configurable
globals, the proportion each field should take in the scan lines.
(filter_mail): Use these new globals instead of hard-coded
defaults, and move the loading of .minc before the use of these
globals so the user can change them.
(filter_mail): Need to print a \r before printing the status line,
as nothing is printed for spam messages, so these would just stack
up on the same line with two or more spams in a row.
(filter_mail): Stop popping items from header lists and just read the
last header value. Also get rid of control characters in the subject
(stupid spam!).
Pull in COLUMNS environment variable and default it to 80.
(filter_mail): Instead of printing which folders have been added to,
print one line per saved non-spam message: its folder, message number,
and from and subject headers. Use COLUMNS and hard-coded proportions
to determine how much of the line to allocate to each field printed.
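The width calculation described above might look roughly like this. It is a Python sketch, not minc's Perl; the proportion values, the function names, and the choice to leave the last terminal column unused are all assumptions.

```python
import os

# Hypothetical proportions for folder, message number, From and Subject.
PROPORTIONS = (0.2, 0.1, 0.3, 0.4)

def terminal_columns(default=80):
    """Read COLUMNS from the environment, falling back to 80."""
    try:
        return int(os.environ["COLUMNS"])
    except (KeyError, ValueError):
        return default

def field_widths(usable, proportions=PROPORTIONS):
    """Split `usable` characters among the fields by proportion."""
    widths = [int(usable * p) for p in proportions]
    # Characters lost to rounding go to the last field.
    widths[-1] += usable - sum(widths)
    return widths

def scan_line(fields, columns, proportions=PROPORTIONS):
    """Truncate or pad each field to its width; join with single spaces."""
    # Leave one column free and reserve one per separator.
    usable = columns - 1 - (len(proportions) - 1)
    widths = field_widths(usable, proportions)
    return " ".join(str(f)[:w].ljust(w) for f, w in zip(fields, widths))

print(scan_line(["inbox", 42, "Doug Porter", "Re: minc filters"],
                terminal_columns()))
```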
(find_mh_folder): Take %headers as arguments, don't call get_headers
or log_headers here.
(filter_mail): .mincfilters, mincspam, and .minchooks are now just one
file, .minc. Now we don't magically know the name of some
spam_start_hook, spam_stop_hook, spam_check, and post_store_hook, but
instead run all the hooks in the @start_hooks, @stop_hooks,
@filter_hooks, and @post_store_hooks lists. Call get_headers and
log_headers here so we can pass them to the hooks.
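The hook-list idea can be sketched as follows. This is Python, not minc's Perl, and the hook signatures, the "inbox" folder, and the rule that any true filter verdict means spam are assumptions for illustration only.

```python
# Hook lists that a .minc-style config file would append to.
hooks = {"start": [], "filter": [], "post_store": [], "stop": []}

def run_hooks(kind, *args):
    """Call every registered hook of one kind, in registration order."""
    return [hook(*args) for hook in hooks[kind]]

# What a user's config might register (all hypothetical).
seen = []
hooks["start"].append(lambda: seen.append("start"))
hooks["filter"].append(
    lambda headers: "viagra" in headers.get("subject", "").lower())
hooks["post_store"].append(
    lambda headers, folder: seen.append(("stored", folder)))
hooks["stop"].append(lambda: seen.append("stop"))

def filter_mail(messages):
    """Run the hooks around a toy delivery loop; return what was stored."""
    run_hooks("start")
    stored = []
    for headers in messages:
        # The headers are parsed once and handed to every hook.
        if any(run_hooks("filter", headers)):
            continue  # treated as spam: nothing is stored
        folder = "inbox"
        stored.append((folder, headers))
        run_hooks("post_store", headers, folder)
    run_hooks("stop")
    return stored
```

The caller no longer needs to know any hook by name; adding behaviour means appending one more callable to a list.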
epg [Thu, 25 Nov 2004 05:09:31 +0000 (05:09 +0000)]
(filter_mail): Support a post_store_hook, in case someone wants to do
post-processing for a message. This is getting out of hand; all these
hooks should be folded into a single .mincrc or something...
(getfiles): Just return readdir results, don't strip . and .. and
don't return absolute paths. We walk the file list enough times
already, there's no need to walk it for this.
(filter_mail): Skip . and .. in the @msglist.
(MAIN): Instead of passing getfiles $MAILDIR/new, chdir there and pass '.'.
New option -p. Using this option causes minc to print the filename
for each message before checking it for spam, which is useful for
debugging the spam filter.
My trick for defining is_spam() was broken and i never noticed because
i wasn't doing use strict in .mincspam; fix it with eval and also add
spam_start_hook and spam_stop_hook.
(kill_spam): Now call spam_start_hook(), save the list it returns, and
pass it to each spam_check() call (formerly known as is_spam()). Also
pass it to spam_stop_hook() after processing all spam. This allows
the spam processor to maintain state, and do other things like talk to
a co-process.
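A sketch of the state-passing protocol described above, in Python rather than Perl. The contents of the state list (a blocklist and a counter) and the return value of the stop hook are made up for the example.

```python
def spam_start_hook():
    # Hypothetical state: a blocklist plus a running counter.
    return [{"spammer@example.com"}, {"checked": 0}]

def spam_check(headers, blocklist, stats):
    """Return True if the message is spam; may update the shared state."""
    stats["checked"] += 1
    return headers.get("from", "") in blocklist

def spam_stop_hook(blocklist, stats):
    """Clean up; here it only reports how many messages were checked."""
    return stats["checked"]

def kill_spam(messages):
    state = spam_start_hook()            # save the list the hook returns
    spam = [m for m in messages if spam_check(m, *state)]
    checked = spam_stop_hook(*state)     # hand the same state back at the end
    return spam, checked
```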
epg [Tue, 11 Mar 2003 01:13:18 +0000 (01:13 +0000)]
(find_mh_folder): DOH! The // match operation must be done
case-insensitive. I think it used to be and i accidentally changed it
when i changed the substitution operation to matching.
epg [Sun, 19 Jan 2003 03:11:14 +0000 (03:11 +0000)]
mdeliver/current/mdeliver.c:
(deliver): In rev 1582 i changed this from using rename(2) to
the recommended link(2) + unlink(2). But in minc i was using
open(2) + rename(2) instead, which is just as safe as link +
unlink but with one advantage. So switch to that model.
minc/current/minc:
(store_message): Document the open + rename procedure and
explain why it is used instead of link + unlink.
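For orientation, an open + rename delivery might look roughly like the sketch below. It is Python, not the Perl or C in question; the file-naming scheme and the fsync call are assumptions, and the sketch does not try to show the advantage mentioned above, which the entry does not spell out.

```python
import os
import socket
import time

def deliver(maildir, message_bytes):
    """Write the message under tmp/, then rename it into new/."""
    # Hypothetical unique name; uniqueness is assumed, since rename
    # would silently replace an existing file of the same name.
    name = "%d.%d.%s" % (time.time(), os.getpid(), socket.gethostname())
    tmp_path = os.path.join(maildir, "tmp", name)
    new_path = os.path.join(maildir, "new", name)

    # O_EXCL makes the open fail rather than reuse an existing tmp file.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as out:
        out.write(message_bytes)
        out.flush()
        os.fsync(out.fileno())

    # rename is atomic within one filesystem, so readers of new/
    # never see a partially written message.
    os.rename(tmp_path, new_path)
    return new_path
```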
epg [Mon, 30 Dec 2002 02:25:03 +0000 (02:25 +0000)]
(logincoming): Move part of log_headers here.
(log_headers): Pass the array of contents for each header to
logincoming rather than indexing here. logincoming now checks for an
empty array (i.e. message missing this header) before trying to index
into it (it used to bomb).
epg [Sat, 28 Dec 2002 22:28:31 +0000 (22:28 +0000)]
Implement Doug's two killer features: 1) store all occurrences of a
header, not just the last, 2) use pure ordered structure for the
filters, allowing inter-mixing of filters (i.e. you can have some
List-Id filters followed by some X-Mailing-List filters followed by
more List-Id filters and they will all be matched in the correct
order).
Earlier i was using Tie::IxHash to make the filters hash ordered, but
i wasn't using it for the internal hashes so it was incomplete. I
didn't need hashes in the first place; switching to nested arrays for
the filters structure solves this problem as well as Doug's killer
feature and as a bonus removes the Tie::IxHash dependency.
README:
Remove Tie::IxHash from list of non-standard modules.
minc:
Drop Tie::IxHash.
FILTERS and %headers become more complex structures (code and
documentation changes).
For now, use .mincfilter2 instead of .mincfilter (will change
back as soon as the new code has had a good shake-down).
(log_headers): Adapt for new %headers structure and add comment
cross-referencing the get_headers comment explaining the
structure.
(get_headers): Document the %headers structure. Build and return
the new structure, preserving multiple occurrences of the same
header.
(find_mh_folder): Use the new @FILTERS structure.
(MAIN): Use @FILTERS instead of %FILTERS.
In the closing pod section, fix the EXAMPLES section and add a
section documenting @FILTERS.
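The two structures from this change can be sketched like this. It is Python, not minc's Perl; the exact shape of the nested lists, the example filters, and the "inbox" default are assumptions, not the documented @FILTERS format.

```python
import re

def get_headers(raw_header_block):
    """Map lower-cased header names to a list of every value, in order."""
    headers = {}
    name = None
    for line in raw_header_block.splitlines():
        if line[:1] in (" ", "\t") and name is not None:
            # Folded continuation line: append to the previous value.
            headers[name][-1] += " " + line.strip()
        elif ":" in line:
            field, value = line.split(":", 1)
            name = field.strip().lower()
            headers.setdefault(name, []).append(value.strip())
    return headers

# Ordered filters as nested lists: the same header may appear more than
# once, and entries are tried strictly top to bottom.
FILTERS = [
    ("list-id", [(r"minc-users", "lists/minc")]),
    ("x-mailing-list", [(r"perl5-porters", "lists/p5p")]),
    ("list-id", [(r".", "lists/misc")]),
]

def find_folder(headers, filters=FILTERS, default="inbox"):
    """Return the folder of the first filter entry that matches."""
    for header_name, rules in filters:
        # A missing header yields an empty list, so nothing is indexed.
        for value in headers.get(header_name, []):
            for pattern, folder in rules:
                if re.search(pattern, value, re.IGNORECASE):
                    return folder
    return default
```

With a plain hash keyed by header name, the third entry could not exist separately from the first, which is the inter-mixing the nested arrays allow.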
epg [Sun, 13 Oct 2002 18:58:03 +0000 (18:58 +0000)]
Fix long-standing bug where filters were applied out of order because
hashes are not guaranteed to stay in the order they were defined. Use
Tie::IxHash to make %FILTERS an ordered hash.
README:
Note that Tie::IxHash is now required.
minc:
use Tie::IxHash;
(%FILTERS): Make this an ordered hash with Tie::IxHash.
epg [Sun, 13 Oct 2002 16:32:26 +0000 (16:32 +0000)]
(store_message): Whoops, document that the for loop is a modified
version of the maildir delivery algorithm; now some of these other
comments make more sense.
(get_headers): Don't allow input from the message to break the
regex used to split headers.
Today i encountered a message with the following lines:
X-scanner: scanned by Inflex 1.0.12.3 -
(http: //www.gsm.com.my)
What we have is a broken pile of shit from gsm.com (unsurprisingly,
as i visit the URL they provide, i see that one of their frame
components spits out an HTML page with Content-Type: text/plain)
that doesn't know how to fold headers. At some point, some "helpful"
mail software (perhaps even my very own postfix installation; who
knows?) mangled what appeared to it to be an '(http' header; it
added a space after the colon.
Whether the mangling had happened or not, i'm sure minc would have
been confused. Doug Porter provided the fix (using perl's \Q and
\E in the split() regex to disable pattern meta-chars in $fieldname).
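The same class of bug and fix can be reproduced in Python, where re.escape plays the role of perl's \Q...\E. The function names and the exact split pattern are assumptions; this is not minc's code.

```python
import re

def split_header(line, fieldname):
    """Return the value part of `line`, splitting on `fieldname:`."""
    # re.escape disables pattern meta-characters in the field name.
    pattern = re.escape(fieldname) + r":\s*"
    _, value = re.split(pattern, line, maxsplit=1)
    return value

def unescaped_breaks(fieldname):
    """Show that the raw field name can be an invalid pattern."""
    try:
        re.compile(fieldname + r":\s*")
    except re.error:
        return True
    return False

print(unescaped_breaks("(http"))                            # raw name fails
print(split_header("(http: //www.gsm.com.my)", "(http"))    # escaped name works
```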