-----BEGIN PGP SIGNED MESSAGE-----
Des Keane writes:
> On Fri, 4 Feb 2005 01:06:09 +0000, Niall O Broin <niall at linux.ie> wrote:
> > sa-learn can give info about its db with --dump and --backup, but
> > neither of those are very understandable (to me - I'm sure they speak
> > volumes to Jason :-) ). Is there any way to get SA to dump information
> > about the Bayesian db e.g. how many 'good' and 'bad' tokens there are,
> > how many messages have been fed to it as ham/spam?
>
> You could start with "sa-learn --dump magic". The line with "non-token
> data: nspam" gives you the number of spam messages learned from,
> "non-token data: nham" the ham messages. "non-token data: ntokens"
> gives you the total number of tokens, though I'm not sure you can
> derive goodness from it.
that's pretty much it -- then the first column in the non-"magic" token
lines is the probability; < 0.5 and it's a hammy token, > 0.5 and it's
spammy. working the magic of wc -l et al is left to the reader. ;)
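
For anyone who would rather not do the counting by hand, here is a rough
Python sketch of the tally described above. It is not part of SpamAssassin:
the name summarize_dump and the sample lines are made up for illustration.
It assumes each dump line has five whitespace-separated columns, with the
probability first and, on the "non-token data" lines, the value in the third
column, so check that against your own "sa-learn --dump" output first.

```python
# Made-up sample in the assumed layout: probability, two count columns,
# a timestamp, then either "non-token data: <name>" or the token itself.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        120          0  non-token data: nspam
0.000          0        340          0  non-token data: nham
0.000          0          5          0  non-token data: ntokens
0.987         14          0 1107475569  a1b2c3d4e5
0.012          0         22 1107475570  f6a7b8c9d0
0.500          3          3 1107475571  0011223344
0.731          5          1 1107475572  5566778899
0.204          1          6 1107475573  aabbccddee
"""

MAGIC_PREFIX = "non-token data:"
WANTED = ("nspam", "nham", "ntokens")


def summarize_dump(lines):
    """Tally message counts and hammy/spammy tokens from dump-style lines."""
    summary = {"nspam": None, "nham": None, "ntokens": None,
               "hammy": 0, "spammy": 0, "neutral": 0}
    for line in lines:
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue  # blank or unexpected line, skip it
        label = fields[4].strip()
        if label.startswith(MAGIC_PREFIX):
            # "Magic" line: assumed to carry its value in the third column.
            name = label[len(MAGIC_PREFIX):].strip()
            if name in WANTED:
                summary[name] = int(fields[2])
            continue
        try:
            prob = float(fields[0])
        except ValueError:
            continue  # first column is not a number, skip it
        # < 0.5 is a hammy token, > 0.5 is a spammy one.
        if prob < 0.5:
            summary["hammy"] += 1
        elif prob > 0.5:
            summary["spammy"] += 1
        else:
            summary["neutral"] += 1
    return summary


if __name__ == "__main__":
    print(summarize_dump(SAMPLE.splitlines()))
```

To run it on a real database, pass the lines of the actual dump (for
example sys.stdin, with the sa-learn output piped in) instead of the sample.
Tokens at exactly 0.5 are counted separately since they lean neither way.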
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS
-----END PGP SIGNATURE-----