LINUX.IE, website of the Irish Linux Users' Group

[ILUG] Re: cohosh covert


Paul Jakma paul at clubi.ie
Wed Apr 14 19:45:36 IST 2004


On Wed, 14 Apr 2004, Ronan Cunniffe wrote:

> The whole point of their trick is to provide a statistically
> useless message body.  

It doesn't really matter... the point is to detect statistically
_meaningful_ words that indicate the spamminess or non-spamminess of a
mail. The fluff doesn't (or at least shouldn't) matter.

If the spammers 'stuff' their spam with random text, then all that
happens is that a Bayesian filter will tend to score the random text as
neutral, i.e. a probability of about 0.5. A decent Bayesian filter will
only use phrases with indicative probabilities (i.e. clearly high or
clearly low ones) to construct the Bayesian probability for the mail,
and discard the neutral ones.
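
To make that concrete, here is a minimal sketch of the idea, not the code
of any particular filter. The combining rule is the naive-Bayes style
product used by several filters of this kind; the function name
`combine`, the `min_distance` cut-off and all the probabilities below are
assumptions made up for illustration.

```python
from math import prod

NEUTRAL = 0.5  # assumed score for tokens that say nothing either way


def combine(token_probs, min_distance=0.1):
    """Combine per-token spam probabilities into one score for the mail.

    Tokens whose probability lies within min_distance of NEUTRAL are
    discarded before combining, so random stuffing has no effect.
    """
    indicative = [p for p in token_probs if abs(p - NEUTRAL) >= min_distance]
    if not indicative:
        return NEUTRAL  # nothing to go on
    spam = prod(indicative)
    ham = prod(1.0 - p for p in indicative)
    return spam / (spam + ham)


# Illustrative numbers only: three spammy tokens, then the same three
# tokens buried in a pile of near-neutral random words.
payload = [0.99, 0.95, 0.9]
stuffing = [0.5] * 200 + [0.48, 0.53, 0.45]

print(combine(payload))
print(combine(payload + stuffing))
```

Both calls give the same score, because the stuffing never makes it past
the cut-off. A real filter would also clamp probabilities away from
exactly 0 and 1 so the division cannot blow up.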

So text-stuffing won't really affect things much, well, not when every
spammer does it. What _will_ hurt Bayesian filtering is if the
spammers include the most minimal of spam payloads, e.g. just one URL,
especially if they do not reuse URLs (and spammers register lots of
throwaway domains).
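
A rough sketch of why the minimal payload is the harder case, assuming a
filter that treats never-seen tokens as neutral. The table `learned`, the
helper `indicative_tokens` and the example domains are all hypothetical.

```python
NEUTRAL = 0.5  # assumed score for a token the filter has never seen

# Hypothetical per-token spam probabilities learned from earlier mail.
learned = {
    "viagra": 0.99,
    "unsubscribe": 0.9,
    "meeting": 0.05,
    "http://old-spam-domain.example/": 0.99,
}


def indicative_tokens(tokens, table, min_distance=0.1):
    """Return the tokens whose learned probability is clearly non-neutral."""
    return [
        t for t in tokens
        if abs(table.get(t, NEUTRAL) - NEUTRAL) >= min_distance
    ]


wordy_spam = "buy viagra now unsubscribe http://old-spam-domain.example/".split()
minimal_spam = ["http://fresh-throwaway.example/"]  # URL never seen before

print(indicative_tokens(wordy_spam, learned))
print(indicative_tokens(minimal_spam, learned))
```

The wordy spam leaves three indicative tokens to score on; the one-URL
mail leaves none, so the filter has no evidence either way.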

regards,
-- 
Paul Jakma	paul at clubi.ie	paul at jakma.org	Key ID: 64A2FF6A
	warning: do not ever send email to spam at dishone.st
Fortune:
No wonder Clairol makes so much money selling shampoo.
Lather, Rinse, Repeat is an infinite loop!


