Implement custom scan of SPAM messages instead of using scan(1).
If the header of a message contains garbage characters (actual multi-byte
garbage, not RFC 2047-encoded text, as in Chinese or Russian spam),
scan(1) will happily print them to the tty, which hurts.
(%SPAM): Replace @SPAM global list with a hash.
(store_message): Drop unused $status variable. Don't update @SPAM.
(scan_line): Factor out the actual scan line formatting from filter_mail to
this new function.
(filter_mail): Append message number and header hash to %SPAM.
(maildir_spam): Append message number and spam maildir file name to %SPAM.
(scan_spam): Add function to scan spam messages with scan_line. Deal with 3
cases: spam filtered via filter_mail and spam filtered via maildir_spam in
either run or -n mode. For the latter two, open the message file from the
mh SPAM folder or from the spam maildir, respectively, to load the header.
epg [Sat, 25 Mar 2006 01:39:55 +0000 (01:39 +0000)]
(get_highest_msgnum): Fix minor bug: it returned one less than it should
on the first call for a folder. This was only noticeable in -n mode
because store_message loops to find a fresh number.
epg [Thu, 19 Jan 2006 19:48:46 +0000 (19:48 +0000)]
(mark): new.c uses the message lists from .mh_sequences as they are,
without parsing them, so simply appending here really hurts. Bite the
bullet and properly merge new messages into the existing .mh_sequences
message lists (i.e. 1-39 becomes 1-40).
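
The merge described above can be sketched roughly as follows. This is
only an illustration, not the code in mark: the name merge_into_sequence
is hypothetical, and expanding every range into individual numbers is a
simplification that trades speed for clarity.

```perl
use strict;
use warnings;

# Hypothetical helper (not minc's mark): add message number $new to an MH
# sequence list such as "1-39 42" and return the merged list.
sub merge_into_sequence {
    my ($list, $new) = @_;

    # Expand items like "1-39" or "42" into single numbers; the hash
    # drops duplicates.
    my %seen = ($new => 1);
    for my $item (split ' ', $list) {
        my ($lo, $hi) = $item =~ /^(\d+)(?:-(\d+))?$/
            or die "bad sequence item: $item\n";
        $hi = $lo unless defined $hi;
        $seen{$_} = 1 for $lo .. $hi;
    }

    # Collapse runs of consecutive numbers back into "start-end" ranges.
    my @nums = sort { $a <=> $b } keys %seen;
    my @ranges;
    while (@nums) {
        my $start = my $end = shift @nums;
        $end = shift @nums while @nums && $nums[0] == $end + 1;
        push @ranges, $start == $end ? $start : "$start-$end";
    }
    return join ' ', @ranges;
}
```

With this sketch, merge_into_sequence("1-39", 40) returns "1-40", whereas
plain appending would have left "1-39 40" in the file.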
epg [Wed, 18 Jan 2006 21:25:45 +0000 (21:25 +0000)]
This 35k message folder at work is getting burdensome. minc was
reading that entire folder every time it stored a message there! Twice
because mark(1) does it too. The first is my fault, the second mh's.
I still haven't gone through mh to get rid of unnecessary readdirs, so
work around the problem here.
(get_highest_msgnum): Only read a directory once, saving the highest
message number and incrementing it on each call.
(mark): New function to add a message to a sequence.
(store_message): Use mark instead of running mark(1).
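
A minimal sketch of the read-once idea, assuming a per-folder cache
hash. The names %highest and next_msgnum are made up for illustration;
the real get_highest_msgnum may differ in both interface and details.

```perl
use strict;
use warnings;

my %highest;   # folder directory => highest message number seen or handed out

# Hypothetical sketch: return the next free message number in $dir.
# The directory is read only on the first call for that folder; later
# calls just increment the cached number.
sub next_msgnum {
    my ($dir) = @_;
    unless (exists $highest{$dir}) {
        opendir(my $dh, $dir) or die "opendir $dir: $!\n";
        my $max = 0;
        for my $name (readdir $dh) {
            next unless $name =~ /^\d+$/;   # MH messages have all-digit names
            $max = $name if $name > $max;
        }
        closedir $dh;
        $highest{$dir} = $max;
    }
    return ++$highest{$dir};
}
```

The catch with any cache like this is that it goes stale if something
else delivers into the folder while minc is running, so the caller
still has to cope with a number that turns out to be taken.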
epg [Tue, 22 Nov 2005 00:59:40 +0000 (00:59 +0000)]
(find_mh_folder): Require $regexp to be a regular expression object
(qr//) instead of a string. Now it's up to the user whether it
matches case-sensitively, or uses any other options.
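
To illustrate what passing a qr// object buys the user, here is a small
sketch. match_folder, the lists/ prefix, and the example header value
are all hypothetical; this is not find_mh_folder's actual interface.

```perl
use strict;
use warnings;

# Hypothetical sketch: the caller compiles the regexp, flags included,
# and we only apply it. Returns a folder name built from the first
# capture group, or nothing if the header value does not match.
sub match_folder {
    my ($value, $regexp) = @_;
    die "expected a qr// object\n" unless ref($regexp) eq 'Regexp';
    my ($list) = $value =~ $regexp or return;
    return "lists/$list";
}

my $header = '<Perl-Dev.lists.example.org>';

# Case-insensitive match, chosen by the caller with /i.
my $loose  = match_folder($header, qr/<([a-z-]+)\.lists\./i);

# Same pattern without /i: the capital letters no longer match.
my $strict = match_folder($header, qr/<([a-z-]+)\.lists\./);
```

Here $loose ends up as "lists/Perl-Dev" and $strict is undefined; the
function itself never has to know which flags were wanted.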
epg [Sat, 19 Nov 2005 00:46:01 +0000 (00:46 +0000)]
Even easier .folders sorting; now all you have to do is provide a list
of regexps, and the sorting will be according to their order. You can
still define your own sorter, if necessary.
Get rid of dryrun.log: in -n mode, don't log at all. I don't think
anyone ever actually used the dryrun log, and if we get rid of it,
then minc -n becomes a better mdfrm.
(find_mh_folder): Don't know why, but the MAGIC_TO crap wasn't doing
the same kind of match as the other filters, and you couldn't use
parts of the match in the folder name. Fix.
($SCAN_P_FOLDER, $SCAN_P_MESSAGE, $SCAN_P_FROM): New configurable
globals, the proportion each field should take in the scan lines.
(filter_mail): Use these new globals instead of hard-coded
defaults, and move the loading of .minc before the use of these
globals so the user can change them.
(filter_mail): Need to print a \r before printing the status line,
as nothing is printed for spam messages, so these would just stack
up on the same line with two or more spams in a row.
(filter_mail): Stop popping items from header lists and just read the
last header value. Also get rid of control characters in the subject
(stupid spam!).
Pull in COLUMNS environment variable and default it to 80.
(filter_mail): Instead of printing which folders have been added to,
print one line per saved non-spam message: its folder, message number,
and from and subject headers. Use COLUMNS and hard-coded proportions
to determine how much of the line to allocate to each field printed.
(find_mh_folder): Take %headers as an argument; don't call get_headers
or log_headers here.
(filter_mail): .mincfilters, mincspam, and .minchooks are now just one
file, .minc. We no longer magically know the names of
spam_start_hook, spam_stop_hook, spam_check, and post_store_hook, but
instead run all the hooks in the @start_hooks, @stop_hooks,
@filter_hooks, and @post_store_hooks lists. Call get_headers and
log_headers here so we can pass them to the hooks.
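
The scan line layout from this change can be sketched as below. The
function name format_scan_line, the proportion values, and the choice
to give the subject whatever width is left over are assumptions made
for the example; only the $SCAN_P_* names and the use of COLUMNS come
from the entries above.

```perl
use strict;
use warnings;

# Assumed example values: the fraction of the line given to each field.
our ($SCAN_P_FOLDER, $SCAN_P_MESSAGE, $SCAN_P_FROM) = (0.125, 0.0625, 0.25);

# Hypothetical sketch of one scan line: folder, message number, from,
# subject. $columns would normally be $ENV{COLUMNS} || 80.
sub format_scan_line {
    my ($columns, $folder, $msgnum, $from, $subject) = @_;

    # Replace control characters so spam cannot mangle the terminal.
    s/[[:cntrl:]]+/ /g for $from, $subject;

    my $wf = int($columns * $SCAN_P_FOLDER);
    my $wm = int($columns * $SCAN_P_MESSAGE);
    my $wr = int($columns * $SCAN_P_FROM);
    my $ws = $columns - $wf - $wm - $wr - 3;   # 3 separating spaces

    # %-*.*s pads and truncates a string to the given width; the subject
    # is only truncated, never padded.
    return sprintf '%-*.*s %*d %-*.*s %.*s',
        $wf, $wf, $folder, $wm, $msgnum, $wr, $wr, $from, $ws, $subject;
}
```

A call such as format_scan_line($ENV{COLUMNS} || 80, 'lists/minc', 40,
$from, $subject) then yields a line no wider than the terminal, as long
as the message number fits in its field.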
epg [Thu, 25 Nov 2004 05:09:31 +0000 (05:09 +0000)]
(filter_mail): Support a post_store_hook, in case someone wants to do
post-processing for a message. This is getting out of hand; all these
hooks should be folded into a single .mincrc or something...
(getfiles): Just return readdir results, don't strip . and .. and
don't return absolute paths. We walk the file list enough times
already, there's no need to walk it for this.
(filter_mail): Skip . and .. in the @msglist.
(MAIN): Instead of passing getfiles $MAILDIR/new, chdir there and pass '.'.