-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
alright, I'll bite...
it looks like the locking somehow failed, and it allowed one process to
overwrite parts of the db while another process was writing to it at the same
time; obv. this isn't supposed to be possible. ;) (is this in 3.0.x, or 2.6x?)
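
fwiw, a clobbered counter like this could be caught before the next learning
run by comparing the magic values against a saved snapshot. A rough sketch,
assuming the dump layout shown in the quoted output below (value in the third
column, name at the end of the line); magic_value and check_counter are
made-up helper names, not part of sa-learn, and a drop is only suspicious if
nobody ran "sa-learn --forget" in between:

```shell
# Hypothetical helpers: they only parse text in the layout printed by
# `sa-learn --dump magic` (value in column 3, name as the last field).

# magic_value NAME: read a magic dump on stdin, print the value for NAME.
magic_value() {
    awk -v name="$1" '/non-token data:/ && $NF == name { print $3 }'
}

# check_counter NAME OLD NEW: complain if a learnt-message counter went down.
check_counter() {
    if [ "$3" -lt "$2" ]; then
        echo "WARNING: $1 dropped from $2 to $3"
        return 1
    fi
    echo "ok: $1 $2 -> $3"
}

# Usage (magic.yesterday is an assumed snapshot file saved by an earlier run):
#   old=$(magic_value nspam < magic.yesterday)
#   new=$(sa-learn --dump magic | magic_value nspam)
#   check_counter nspam "$old" "$new"
```

With the numbers from the two dumps quoted below, nspam going from 1342 to 0
would trip the warning, while nham going from 2096 to 2850 would not.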
- --j.
Niall O Broin writes:
> I sent this the other day, but nobody bit. In the hopes that it was the
> weekend lull, I'm repeating myself :-)
>
> Thanks to contributions from a couple of people the other day I came up with
> this little script to produce a small report on the Bayes DB:
>
> echo Spam Assassin Bayes Statistics
> echo ""
> echo Bayes Token Count
> echo "Total Ham Spam"
> sa-learn --dump |awk '{count += 1; if ($0 > 0.5) spam+=1; \
> if ($0 < 0.5) ham+=1} END {print count "\t" ham "\t" spam}'
> echo ""
> echo -n "Number of ham messages learnt from: "
> sa-learn --dump magic |awk '/nham/ {print $3}'
> echo -n "Number of spam messages learnt from: "
> sa-learn --dump magic |awk '/nspam/ {print $3}'
>
> which runs at the end of a script which sa-learns spam placed in folders by
> humans during the day. After doing its nightly run, it reported as follows:
>
> Spam Assassin Bayes Statistics
>
> Bayes Token Count
> Total   Ham     Spam
> 140114  78443   61671
>
> Number of ham messages learnt from: 2109
> Number of spam messages learnt from: 1387
>
> I then fed sa-learn something over 1000 pieces of ham, and now the same script
> gives me:
>
> Spam Assassin Bayes Statistics
>
> Bayes Token Count
> Total   Ham     Spam
> 153518  10      153508
>
> Number of ham messages learnt from: 2850
> Number of spam messages learnt from: 0
>
> AARGH! What the hell has happened there? It has apparently forgotten about
> ALL the spam messages it ever learnt from, and conversely, 78000 ham tokens
> have become spam tokens. Did SA somehow choke on all that ham?
>
> Straight sa-learn --dump magic now gives
>
> 0.000 0          3 0 non-token data: bayes db version
> 0.000 0          0 0 non-token data: nspam
> 0.000 0       2850 0 non-token data: nham
> 0.000 0     153508 0 non-token data: ntokens
> 0.000 0 1091609393 0 non-token data: oldest atime
> 0.000 0 1107564300 0 non-token data: newest atime
> 0.000 0 1107564852 0 non-token data: last journal sync atime
> 0.000 0 1107564590 0 non-token data: last expiry atime
> 0.000 0    1382400 0 non-token data: last expire atime delta
> 0.000 0      17827 0 non-token data: last expire reduction count
>
> whereas sa-learn --dump magic from the databases as of 19:00 last night
> (retrieved from the warm standby box) gives
>
> 0.000 0          3 0 non-token data: bayes db version
> 0.000 0       1342 0 non-token data: nspam
> 0.000 0       2096 0 non-token data: nham
> 0.000 0     138010 0 non-token data: ntokens
> 0.000 0 1106096390 0 non-token data: oldest atime
> 0.000 0 1107544172 0 non-token data: newest atime
> 0.000 0 1107538029 0 non-token data: last journal sync atime
> 0.000 0 1107478750 0 non-token data: last expiry atime
> 0.000 0    1382400 0 non-token data: last expire atime delta
> 0.000 0       5589 0 non-token data: last expire reduction count
>
> Can anyone shed any light on this?
>
> --
> Niall
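
On the report script itself: a plain "sa-learn --dump" prints the magic lines
as well as the tokens, so they get counted as tokens too. A possible variant,
sketched under the assumption that token lines carry the probability in the
first column and that magic lines contain the text "non-token data:" (as in the
dumps quoted above); tally_tokens is an invented name:

```shell
# Hypothetical helper: tally tokens from `sa-learn --dump` output on stdin.
# Prints "total<TAB>ham<TAB>spam"; tokens at exactly 0.5 count in the total only.
tally_tokens() {
    awk '
        /non-token data:/ { next }   # skip the magic lines
        { total++ }                  # every remaining line is one token
        $1 > 0.5 { spam++ }          # numeric comparison on the probability field
        $1 < 0.5 { ham++ }
        END { printf "%d\t%d\t%d\n", total, ham, spam }
    '
}

# Usage:
#   sa-learn --dump | tally_tokens
```

Comparing $1 rather than $0 makes the test numeric on the probability alone
instead of relying on how awk happens to compare the whole line.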
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS
iD8DBQFCCCHzMJF5cimLx9ARAkU2AKCbjvM2dcQV8hlI+gcyAsluwsBosgCguDl2
9o6oDPnBhL7/SEdlgQrw8ME=
=vyx2
-----END PGP SIGNATURE-----