SpamAssassin does not seem to be effective - did I do something wrong?


I recently installed and enabled SpamAssassin. I have been feeding it messages to learn from (via sa-learn), but it still catches, at best, 25% of the spam. Here are some recent ‘outputs’:

[root@mysystem ~]# sa-learn --sync
expired old bayes database entries in 10 seconds
153966 entries kept, 137122 deleted
token frequency: 1-occurrence tokens: 59.15%
token frequency: less than 8 occurrences: 19.00%

[root@mysystem ~]# sa-learn --dump magic
0.000 0 3 0 non-token data: bayes db version
0.000 0 8903 0 non-token data: nspam
0.000 0 2016 0 non-token data: nham
0.000 0 153966 0 non-token data: ntokens
0.000 0 1452365227 0 non-token data: oldest atime
0.000 0 1452537695 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 1452539293 0 non-token data: last expiry atime
0.000 0 172800 0 non-token data: last expire atime delta
0.000 0 137122 0 non-token data: last expire reduction count

I am wondering why it seems to be deleting almost as many entries as it is keeping (137122 deleted versus 153966 kept). Is this normal? Are there any settings I should adjust? In my reading, I saw a mention that the local setting should be turned off, but that does not appear to be set on my system.
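
To put the numbers above in context, here is a small Python sketch that parses the `sa-learn --dump magic` output quoted in this post and derives a few ratios from it. The function name `parse_magic` and the variable names are made up for illustration; the only assumption about the format is that the value sits in the third column, as it does in the output above. It is a reading aid, not a diagnosis.

```python
from datetime import datetime, timezone

# Output of `sa-learn --dump magic`, copied from the post above.
DUMP = """\
0.000 0 3 0 non-token data: bayes db version
0.000 0 8903 0 non-token data: nspam
0.000 0 2016 0 non-token data: nham
0.000 0 153966 0 non-token data: ntokens
0.000 0 1452365227 0 non-token data: oldest atime
0.000 0 1452537695 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 1452539293 0 non-token data: last expiry atime
0.000 0 172800 0 non-token data: last expire atime delta
0.000 0 137122 0 non-token data: last expire reduction count
"""


def parse_magic(text):
    """Map each 'non-token data' label to its integer value (third column)."""
    values = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue
        fields, label = line.split("non-token data:", 1)
        values[label.strip()] = int(fields.split()[2])
    return values


magic = parse_magic(DUMP)

# How many spam messages were learned for every ham message.
spam_to_ham = magic["nspam"] / magic["nham"]

# Seconds between the oldest and newest token access times still in the db.
atime_span = magic["newest atime"] - magic["oldest atime"]

# Share of tokens removed by the last expiry run.
deleted = magic["last expire reduction count"]
deleted_share = deleted / (deleted + magic["ntokens"])

oldest = datetime.fromtimestamp(magic["oldest atime"], tz=timezone.utc)
newest = datetime.fromtimestamp(magic["newest atime"], tz=timezone.utc)

print(f"spam:ham learned      {spam_to_ham:.1f} : 1")
print(f"token atime range     {oldest:%Y-%m-%d %H:%M} to {newest:%Y-%m-%d %H:%M} UTC")
print(f"atime span            {atime_span / 86400:.2f} days")
print(f"last expire delta     {magic['last expire atime delta'] / 86400:.2f} days")
print(f"tokens deleted        {deleted_share:.1%}")
```

Two things stand out from these figures: far more spam than ham has been learned, and the surviving tokens cover just under two days, which is about the same as the last expire atime delta of 172800 seconds. Whether either of these explains the low hit rate is a guess on my part.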

Thanks for any suggestions you can provide.

I don’t know if this is a bug or just something I am not doing correctly, but I run SpamAssassin as a daemon (spamd) and cannot seem to get it to use user-specific configurations. It seems to use only the server config (set from the Webmin->Servers->Spamassassin Mail Filter page). I set up Bayesian filtering to auto-learn with a threshold of 10 for definite spam and -3.4 for non-spam, which works OK.
I had to add my own filters to get a decent hit rate. They’re a bit ‘R’ rated, so I won’t paste them here.
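
For anyone unsure what the two auto-learn thresholds mentioned above mean in practice, here is a simplified Python sketch of the decision they imply. It is not SpamAssassin's actual code: the function `autolearn_decision` is hypothetical, the strict comparisons at the boundaries are an assumption, and the real auto-learner applies further checks that are ignored here. In the SpamAssassin config these values correspond, as far as I know, to `bayes_auto_learn_threshold_spam` and `bayes_auto_learn_threshold_nonspam`.

```python
# Thresholds taken from the post above.
SPAM_THRESHOLD = 10.0      # score above which a message is learned as spam
NONSPAM_THRESHOLD = -3.4   # score below which a message is learned as ham


def autolearn_decision(score,
                       spam_threshold=SPAM_THRESHOLD,
                       nonspam_threshold=NONSPAM_THRESHOLD):
    """Return 'spam', 'ham', or 'none' for a message score.

    Simplified: only the score is considered, and the boundary
    handling (strict > and <) is an assumption.
    """
    if score > spam_threshold:
        return "spam"
    if score < nonspam_threshold:
        return "ham"
    return "none"


for score in (12.5, 6.0, 0.3, -5.0):
    print(f"{score:6.1f} -> {autolearn_decision(score)}")
```

With a non-spam threshold as low as -3.4, only messages that score strongly negative would be learned as ham under this sketch, so most ordinary mail would fall into the "none" band and not be learned at all.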