Quoting Kenn Humborg <kenn at bluetree.ie>:
> >
> > > Justin Mason writes
> > >
> > > --mbox switch is required when working on mboxes.
> >
> > sorry that was a typo above, I forgot to add in that I was using that option
> > as well, but it's more to do with the fact that spamassassin currently can't
> > handle imap folders at the moment.
>
> You're mixing your terminology here. When you're looking at
> the raw file(s) that make up a mail "folder", IMAP doesn't
> come into the picture. You've got mbox folders, Maildir folders
> and multiple others that I can't remember the names of right
> now.
>
> You'll recognize Maildir folders by the cur/, new/ and tmp/
> subdirs.
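As a rough illustration of that distinction, the Python sketch below checks whether a path looks like a Maildir (a directory with cur/, new/ and tmp/) or like an mbox (a single file starting with a "From " line). The function names are made up for this example, and the mbox check is only a heuristic.

```python
import os


def looks_like_maildir(path):
    """True if path is a directory with cur/, new/ and tmp/ subdirectories."""
    return all(os.path.isdir(os.path.join(path, sub))
               for sub in ("cur", "new", "tmp"))


def looks_like_mbox(path):
    """True if path is a regular file whose first line starts with 'From '.

    Heuristic only: an empty mbox file has no 'From ' line at all.
    """
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as folder:
        return folder.read(5) == b"From "
```

Something like looks_like_mbox(os.path.expanduser("~/mail/inbox")) would be the check for a single folder file; the path is only an example.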

What I have is a number of mail folders under ~/mail/. Each mail folder is
basically a single file that contains all the mails, and the problem that
spamassassin has in reading these directly is the following folder header:

From MAILER-DAEMON Wed Mar 3 10:38:52 2004
Date: 03 Mar 2004 10:38:52 +0000
From: Mail System Internal Data <MAILER-DAEMON at compsoc.nuigalway.ie>
Subject: DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA
Message-ID: <1078310332 at compsoc.nuigalway.ie>
X-IMAP: 1077182741 0000000428
Status: RO

This text is part of the internal format of your mail folder, and is not
a real message. It is created automatically by the mail system software.
If deleted, important folder data will be lost, and it will be re-created
with the data reset to initial values.

Once it reads this part, it sees all the mails in the folder as one huge mail.
As far as I can tell, icat basically just strips this header off the file
(mail folder), and the result is in standard mbox format, which can then be fed
into spamassassin.
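A minimal Python sketch of that stripping step, for illustration only. It is not icat, and the name strip_internal_data is hypothetical. It assumes the internal-data message is always the first one in the file and can be recognised by its X-IMAP: header together with the subject line shown in the folder header.

```python
# Subject line of the placeholder message, copied from the folder header.
INTERNAL_SUBJECT = "DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA"


def strip_internal_data(src, dst):
    """Copy mbox text from src to dst, dropping a leading internal-data message.

    src and dst are text file objects. Returns True if a message was dropped.
    """
    lines = iter(src)
    first = []          # lines of the first message in the folder
    next_from = None    # "From " line that starts the second message, if any
    for line in lines:
        if first and line.startswith("From "):
            next_from = line
            break
        first.append(line)

    # Only look at the header block of the first message (up to a blank line).
    headers = []
    for line in first:
        if line.strip() == "":
            break
        headers.append(line)

    internal = (any(h.startswith("X-IMAP:") for h in headers)
                and any(INTERNAL_SUBJECT in h for h in headers))
    if not internal:
        dst.writelines(first)       # ordinary first message: keep it
    if next_from is not None:
        dst.write(next_from)
        dst.writelines(lines)       # everything after the first message
    return internal
```

A folder that does not start with the placeholder message is copied through unchanged, so running this on an ordinary mbox should be harmless.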

> Are you saying that spamassassin doesn't support Maildirs?
> Maybe not directly, but it's just a matter of something like:
>
>   find ~/Maildir -type f | xargs spamassassin --ham
>
> "This is Unix - stop being so helpless" :-)
>
> Later,
> Kenn
>

We're not using Maildirs, but it does appear to be a non-standard mbox folder,
i.e. one with additional data added to support IMAP folder access.

From some of the replies I've received, it appears that the --mbox switch
requires its input to be in a file, because the functions it uses don't
work with pipes.

I could use formail to separate the mails out and call sa-learn on each mail,
but that's really inefficient.
(I just took the next part from a later mail instead of replying separately)
Quoting John Allen <john.allen at dublinux.net>:
> On Tuesday 02 March 2004 20:41, Paul Jakma wrote:
> > On Tue, 2 Mar 2004, John Allen wrote:
> > > Q: What is an *IMAP* folder.
> >
> > A: It's a folder accessible through IMAP (surely?)
> >
> Funny. But this was in the context that this guy was trying to pass the
> folder directly to spamassassin with the --mbox switch, which means that it
> must be an mbox, which many IMAP folders are *NOT*

Had a chat with one of the admins, who I also work with on the website, and I
think we both agree that it appears to be due to the extra data at the start
of the folder, as described above. Now, we do use horde & imp to provide a
webmail frontend. I use it in combination with pine/mutt when I ssh into the
server to read mail: pine/mutt when there's a lot to be read and I'm on a slow
connection, since it's so much faster than using the web interface.

Now the extra data in the mbox folder does appear to be due to the use of IMAP
on the server, since it provides a standard interface for all mail clients to
access the mail, but it's also what is causing sa-learn to read the folder
incorrectly.

Since it's an mbox, I need to use the --mbox option, but the extra data is
throwing sa-learn off. Beyond that, since with --mbox I can't just pipe the
mail in, I think the way I'm doing it at the moment is the only workable
method until either we change our mail folder storage system or sa-learn is
modified to be able to use the IMAP system so that the extra data is no longer
a problem.

Am I missing something? Can sa-learn handle the extra IMAP folder data
correctly, or will I just have to stick with creating a temporary file
that contains the mbox without the extra IMAP folder data, feeding the temp
mbox into sa-learn and then deleting it?
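The temporary-file approach described above could be sketched roughly as follows in Python. The helper name learn_from_mbox_bytes and its run parameter are hypothetical. The only parts taken from the real tool are sa-learn's --spam, --ham and --mbox switches, and the code assumes sa-learn is on the PATH.

```python
import os
import subprocess
import tempfile


def learn_from_mbox_bytes(mbox_bytes, kind="spam", run=subprocess.run):
    """Write mbox data to a temporary file, run sa-learn on it, then delete it.

    mbox_bytes is the cleaned folder content (without the IMAP internal data).
    kind is "spam" or "ham". run is only a hook so the call can be replaced
    in tests. Returns the command that was run.
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    fd, path = tempfile.mkstemp(suffix=".mbox")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(mbox_bytes)
        # Pass a real file path, since --mbox reportedly does not work on pipes.
        cmd = ["sa-learn", "--" + kind, "--mbox", path]
        run(cmd, check=True)
    finally:
        os.remove(path)     # clean up even if sa-learn fails
    return cmd
```

This still costs one extra copy of each folder on disk, but it makes a single sa-learn call per folder instead of one per message.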
--
Darragh
"Nothing's foolproof to a sufficiently talented fool"