On Thu, 01 Nov 2001, Ken Guest wrote:
> I've tried google and freshmeat, searching for a tool to remove
> duplicate mails from mbox files to no avail.
> Does anybody know of such a utility, or do I need to cobble one
> together?
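formail (part of the procmail package) should do it. With the -D option,
formail keeps a cache of the Message-IDs it has seen and uses it to
detect duplicate messages. When combined with the -s option (which splits
the mbox into individual messages), formail won't output any message
whose Message-ID is already in the cache. The -D option takes two
arguments: the approximate maximum size of the Message-ID cache and its
filename. The cache is limited to that size, so make it large enough to
hold the IDs of all the messages in the mbox.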
For example (with an ID cache called msgid.cache of size 8192):

    formail -D 8192 msgid.cache -s < oldmboxfile > newmboxfile
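
If you do end up cobbling one together, here is a rough sketch of the same idea in Python, using the standard mailbox module. This is not how formail is implemented; the function name dedupe_mbox is made up for illustration, and keeping messages that have no Message-ID header is my own assumption. Unlike formail's fixed-size cache file, it holds every ID in memory.

```python
import mailbox


def dedupe_mbox(src_path, dst_path):
    """Copy src_path to dst_path, keeping the first message per Message-ID.

    Returns a (kept, dropped) tuple. Hypothetical helper, not part of
    formail or procmail.
    """
    seen = set()          # Message-IDs encountered so far
    kept = dropped = 0
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)   # created if missing, appended to otherwise
    try:
        for msg in src:
            msg_id = msg.get("Message-ID")
            if msg_id is not None:
                msg_id = msg_id.strip()
                if msg_id in seen:
                    dropped += 1
                    continue
                seen.add(msg_id)
            # Assumption: messages without a Message-ID are always kept.
            dst.add(msg)
            kept += 1
        dst.flush()
    finally:
        dst.close()
        src.close()
    return kept, dropped
```

Calling dedupe_mbox("oldmboxfile", "newmboxfile") would then play the role of the formail command above. Write to a fresh output file, since an existing one is appended to.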
John.
--
John Gaughan, Systems Administrator
Irish Times New Media - http://www.ireland.com/