=head1 SYNOPSIS
-B<minc> [B<-m> I<MAX>] [B<-n>] [B<-p>]
+B<minc> [B<-m> I<MAX>] [B<-n>]
B<minc> B<-d>
-B<minc> B<-h>
+B<minc> B<-r>
=head1 DESCRIPTION
B<minc> incorporates mail from a maildir to a mh folder hierarchy. It
takes mail from a maildir folder (not a maildir folder hierarchy),
-optionally checks for spam with a user-defined spam-checking function,
-and optionally filters mail into separate mh folders.
-
-The filtering is quite sophisticated, as it is done using real Perl
-matching (m//) commands.
+running each message through regular expression based filter and hook
+functions to determine in which folder to store it or whether it's
+spam. Post-processing hooks may be applied to each message.
+
+As it processes each message, B<minc> prints a line for each message
+similar to B<inc(1)> and B<scan(1)>. This line includes the folder
+and message number in which the message was stored, the last 'From'
+header, and the last 'Subject' header. These fields are truncated to
+fit in the user's terminal (see COLUMNS in B<ENVIRONMENT> below) in
+the following proportions: folder (0.1), message number (0.0625), from
+header (0.175). Any of these may be overridden with $SCAN_P_FOLDER,
+$SCAN_P_MESSAGE, or $SCAN_P_FROM. The subject always fills out the
+rest of the line.
=cut
use Errno;
use Fcntl qw(O_WRONLY O_EXCL O_CREAT);
use FileHandle;
+use File::Temp qw(tempfile);
use File::stat;
use Getopt::Long qw(:config gnu_getopt no_ignore_case);
use POSIX qw(strftime WEXITSTATUS WIFEXITED);
=item B<-d>
Dump (using Data::Dumper) the FILTERS list and exit. This is useful
-for testing the syntax of .mincfilter.
+for testing the syntax of .minc.
=item B<-h>
Dry run; do not actually incorporate the mail, but log and report to
stdout/stderr as normal.
-=item B<-p>
+=item B<-r>
-Print the filename of each message before checking it for spam. This
-can be handy if a particular message is giving the spam checker a
-problem.
+Rebuild `mhpath +`/.folders from scratch, processing no mail.
=back
my $help;
my $maxmsgs;
my $norun;
-my $printfilenames;
+my $rebuild_dot_folders;
GetOptions(
'd' => \$dumpfilters,
'h|help' => \$help,
'm=i' => \$maxmsgs,
'n' => \$norun,
- 'p' => \$printfilenames,
+ 'r' => \$rebuild_dot_folders,
) or pod2usage();
$help and pod2usage(-exitstatus=>0, -verbose=>1);
@ARGV == 0 or pod2usage();
=over 4
+=item COLUMNS
+
+How many columns the user's terminal can hold, used to print scan
+lines for each processed message. Defaults to 80.
+
=item HOME
-Where configuration files (.mincfilter) are found. Also,
+Where the configuration file (.minc) is found. Also,
$HOME/Maildir is used for the maildir if MAILDIR is not set.
=item MAILDIR
=over 4
-=item $HOME/.mincfilter
-
-This file is Perl code (included via the 'require' directive) which is
-expected to define the FILTERS list.
+=item $HOME/.minc
-=item $HOME/.mincspam
+This file is Perl code (included via the 'require' directive) which
+may define the FILTERS list, @start_hooks, @filter_hooks,
+@post_store_hooks, and @stop_hooks.
-If this file exists, B<minc> will include it with the expectation that
-it will define a B<spam_check> function. This function takes a
-message filename as an argument and returns 1 if the message is spam,
-else 0. If this file does not exist, B<minc> will define a simple
-function that always returns 0.
+=item `mhpath +`/.folders
-One of B<minc>'s global variables is available to the user-defined
-B<spam_check> function: $run. This boolean should be honored;
-B<spam_check> should only take real action (i.e. removing or creating
-files, running external programs, etc.) if $run is non-zero.
-
-This file may also declare two other functions: B<spam_start_hook> and
-B<spam_stop_hook>. The former is passed no arguments and is expected
-to return a list. This list is a "baton" that will also be passed to
-B<spam_check> and B<spam_stop_hook>. It can hold anything
-B<spam_check> will need to do its job, whether network sockets, pipes,
-or whatever.
-
-XXX: need more details about the spam-handling process; for now read
-the code.
+B<minc> adds all folders it filters into to this file, which is used
+by lukem's B<new(1)> (XXX need a link).
=item `mhpath +`/logs/minc.log
Where B<minc> logs what it does, unless in -n mode.
-=item `mhpath +`/logs/dryrun.log
-
-Where B<minc> logs what it would do; used in -n mode.
-
=item `mhpath +`/.minc.context
B<minc> uses this file for context (i.e. current folder) instead of
our @FILTERS;
our (@start_hooks, @stop_hooks, @filter_hooks, @post_store_hooks);
+our $SCAN_P_FOLDER = 0.1;
+our $SCAN_P_MESSAGE = 0.0625;
+our $SCAN_P_FROM = 0.175;
+our @folder_sort_list = (qr/^inbox$/);
+our $folder_sorter = sub { sort_by_list(@_, @folder_sort_list) };
my $mh;
-my $logfile;
$mh = `mhpath +`;
chomp($mh);
-if ($run) {
- $logfile = $mh . '/logs/minc.log';
-} else {
- $logfile = $mh . '/logs/dryrun.log';
-}
+my $logfile = $mh . '/logs/minc.log';
$ENV{"MHCONTEXT"} = $mh . '/.minc.context';
# Logging
sub mylog {
+ $run or return;
+
my $timestamp = strftime('%b %e %H:%M:%S', localtime);
my $msg;
foreach my $part (@_) {
###############################################################################
# Utility procedures
+sub sort_by_list {
+ my $a = shift;
+ my $b = shift;
+
+ for my $i (@_) {
+ my $am = $a =~ $i;
+ my $bm = $b =~ $i;
+ if ($am) {
+ if ($bm) {
+ last;
+ }
+ return -1;
+ } elsif ($bm) {
+ return 1;
+ }
+ }
+
+ return $a cmp $b;
+}
+
sub mkfolder {
my $folder = shift;
my $target;
#sleep(2);
}
- if ($mhfolder ne 'SPAM') {
- mylog('+', $mhfolder);
- }
+ mylog('+', $mhfolder);
if ($run) {
if (not rename($msg, $mhmsg)) {
my $header;
my $contents;
my $pair;
- my $match;
+ my $regexp;
my $expression;
my $result;
if ($header =~ /$MAGIC_TO_REGEX/i) {
foreach $contents (@{$headers{lc($header)}}) {
foreach $pair (@filter) {
- ($match, $expression) = @$pair;
- if ($contents =~ /$match/) {
- return $expression;
+ ($regexp, $expression) = @$pair;
+ if ($contents =~ $regexp) {
+ if (eval "\$result = \"$expression\"") {
+ return $result;
+ }
}
}
}
# Walk the list of message headers matching the filter's
# specified header.
foreach $contents (@{$headers{lc($header)}}) {
- # Walk the filter's list of match/expression pairs.
+ # Walk the filter's list of regexp/expression pairs.
foreach $pair (@filter) {
- ($match, $expression) = @$pair;
- if ($contents =~ /$match/i) {
+ ($regexp, $expression) = @$pair;
+ if ($contents =~ $regexp) {
if (eval "\$result = \"$expression\"") {
return $result;
}
}
sub filter_mail {
- my @msglist = @_;
- my $msgcount = @msglist - 2; # don't count . and ..
+ my $msgcount = @_ - 2; # don't count . and ..
my $len = length($msgcount);
my @baton;
my $msg;
my $spam = 0;
my $saved = 0;
my $msgnum;
- my %FOLDERS = ('SPAM'=>1);
-
- # XXX lame names and hard-coded proportions.
- my $nf = int($COLUMNS * 0.1);
- my $nm = int($COLUMNS * 0.0625);
- my $nF = int($COLUMNS * 0.175);
- my $ns = $COLUMNS - $nf - $nm - $nF - 3;
+ my %folders;
if (-f "$HOME/.minc") {
require "$HOME/.minc";
}
+ # XXX lame names
+ my $nf = int($COLUMNS * $SCAN_P_FOLDER);
+ my $nm = int($COLUMNS * $SCAN_P_MESSAGE);
+ my $nF = int($COLUMNS * $SCAN_P_FROM);
+ my $ns = $COLUMNS - $nf - $nm - $nF - 3;
+
my %batons;
for my $hook (@start_hooks) {
my ($handle, @baton) = $hook->();
}
}
- foreach $msg (@msglist) {
+ for $msg (@_) {
($msg eq '.' or $msg eq '..') and next;
- if ($printfilenames) {
- print("$msg\n");
- }
-
my %headers = get_headers($msg);
log_headers(%headers);
defined($mhfolder) or ($mhfolder = find_mh_folder($msg, %headers));
$msgnum = store_message($msg, $mhfolder);
+ $folders{$mhfolder}++;
if ($mhfolder eq 'SPAM') {
$spam++;
} else {
$saved++;
- my $from = [@{$headers{'from'}}]->[-1];
- my $subject = [@{$headers{'subject'}}]->[-1];
+ my $from = '';
+ my $subject = '';
+ # Sometimes these headers are missing...
+ eval { $from = [@{$headers{'from'}}]->[-1] };
+ eval { $subject = [@{$headers{'subject'}}]->[-1] };
for ($from, $subject) {
tr/\x00-\x1f\x80-\xff/ /;
}
print(' ' x $COLUMNS);
printf("\r\%-${nf}s \%${nm}d \%-${nF}s \%s\n",
substr($mhfolder, 0, $nf), substr($msgnum, 0, $nm),
- # XXX shouldn't pop, as these are about to be
- # passed to post_store_hooks
substr($from, 0, $nF),
substr($subject, 0, $ns));
}
for my $hook (@stop_hooks) {
$hook->(\%batons);
}
+
+ return %folders;
+}
+
+sub build_dot_folders {
+ my $folders = shift;
+ my $fh = shift;
+ my $fn;
+
+ if (defined($fh)) {
+ while (<$fh>) {
+ chomp;
+ $folders->{$_}++;
+ }
+ }
+
+ eval { ($fh, $fn) = tempfile("$mh/.folders.XXXXX") };
+ if ($@) {
+ warn("$@");
+ return;
+ }
+
+ for my $folder (sort { $folder_sorter->($a,$b) } keys(%$folders)) {
+ print($fh "$folder\n");
+ }
+
+ if (not close($fh)) {
+ warn("close($fn): $!");
+ unlink($fn) or warn("unlink($fn): $!");
+ return;
+ }
+
+ rename($fn, "$mh/.folders") or warn("rename($fn, $mh/.folders): $!");
+}
+
+sub create_dot_folders {
+ if (-f "$HOME/.minc") {
+ require "$HOME/.minc";
+ }
+
+ my %folders;
+ open(my $fh, '-|', 'folders', '-fast', '-recur')
+ or die("open(folders|): $!");
+ build_dot_folders(\%folders, $fh);
+ return 0;
+}
+
+sub update_dot_folders {
+ my $folders = shift;
+ my $fh;
+
+ if (not open($fh, "$mh/.folders") and not $!{ENOENT}) {
+ # For ENOENT, we go ahead and create it, else we error and
+ # don't clobber it.
+ warn("open($mh/.folders): $!");
+ return;
+ }
+
+ build_dot_folders($folders, $fh);
}
\f
exit;
}
+ $rebuild_dot_folders and exit(create_dot_folders);
+
chdir("$MAILDIR/new") or die("chdir($MAILDIR/new): $!");
- filter_mail(map { $_->[1] }
+ my %folders = filter_mail(map { $_->[1] }
sort { $a->[0] <=> $b->[0] }
map {
if (not ($st = stat($_))) {
}
getfiles('.'));
+ $run and update_dot_folders(\%folders);
+
@SPAM and (exec('scan', '+SPAM', @SPAM) or die);
}
=head1 THE FILTERS STRUCTURE
-The user's .mincfilter file must define the @FILTERS structure. This
-structure is an array. Each element of @FILTERS is a filter. A
-filter is itself an array. The first element of a filter is a string,
-the name of the header this filter acts upon. The header name is not
-case-sensitive. Each subsequent element of a filter is a pair (i.e. a
-two-element array): first, a regular expression B<minc> uses to
-determine whether this filter matches or not, and second, an
-expression which B<minc> evaluates to get the folder name.
+Each element of @FILTERS is a filter. A filter is itself an array.
+The first element of a filter is a string, the name of the header this
+filter acts upon. The header name is not case-sensitive. Each
+subsequent element of a filter is a pair (i.e. a two-element array):
+first, a regular expression B<minc> uses to determine whether this
+filter matches or not, and second, an expression which B<minc>
+evaluates to get the folder name.
B<minc> decides where to store a message by iterating over the
@FILTERS array. It tests each regexp of each filter against all
including) the To. Which headers to use is determined by a regular
expression borrowed from procmail.
+=head1 HOOKS
+
+Filter hooks take a reference to a hash of batons, a reference to a
+hash of headers, and the message filename as arguments. It returns
+undef to decline filtering of this message (thus falling back to
+subsequent filter hooks, and finally @FILTERS), or the name of the
+folder to store this message into.
+
+One of B<minc>'s global variables is available to the user-defined
+hooks: $run. This boolean should be honored; hooks should only take
+real action (i.e. removing or creating files, running external
+programs, etc.) if $run is non-zero.
+
+The baton hash is created simply from the start hooks; if the hook
+returns at least one defined value, this value is used as the key and
+all other return values are put into a list reference as the value.
+This hash is then passed by reference to all filter, post-store, and
+stop hooks.
+
+Post store hooks take a reference to a hash of batons, a reference to
+a hash of headers, the folder this message was stored in, and its new
+message number.
+
+XXX: need more details about the hook process; for now read the code.
+
=head1 EXAMPLES
@FILTERS = (
list such as dev@httpd.apache.org, this filter will create the folder
name l/apache/httpd/dev.
-For an example B<spam_check> function, see
-L<http://pretzelnet.org/cvs/dotfiles/.mincspam>
+XXX Need hook examples.
=head1 AUTHORS