Thank you SpamAssassin, again!

Every Monday morning it’s the same: a pile of spam numbering in the hundreds. Thankfully, SpamAssassin caught almost all of it.
For those of you interested, here are this morning’s numbers:

  1. Regular email: 1.3MB
  2. Spam email: 4.2MB
  3. Total spam: 507 emails
  4. Spam that auto-trained: 400
  5. Spam to my inbox: 3
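
For the curious, those figures work out to a catch rate of a little over 99%. Here is a minimal sketch of the arithmetic. It assumes the 507 total includes the 3 messages that reached the inbox, which the list doesn’t actually say, and the variable names are made up for illustration.

```python
# Figures taken from the list above. Whether the 3 inbox spams are
# counted inside the 507 total is an assumption.
total_spam = 507
auto_trained = 400
reached_inbox = 3
regular_mb = 1.3
spam_mb = 4.2

caught = total_spam - reached_inbox
catch_rate = 100 * caught / total_spam
auto_train_rate = 100 * auto_trained / total_spam
spam_share_by_size = 100 * spam_mb / (regular_mb + spam_mb)

print(f"Caught: {caught} of {total_spam} ({catch_rate:.1f}%)")
print(f"Auto-trained: {auto_train_rate:.1f}% of all spam")
print(f"Spam share of mail volume: {spam_share_by_size:.1f}%")
```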

It’s a good feeling to be winning the battle day to day. The war, unfortunately, is another matter.

SpamAssassin – scoring on DSL lines

Since upgrading to SpamAssassin 2.60 yesterday I’ve noticed a (small) increase in false positives. Only 4 of the 132 messages caught as spam overnight were false positives, but almost all of them came from DSL or dynamic IP addresses. The default score for this test is 2.5, but you can change it by adding the following to /etc/mail/spamassassin/local.cf:

score RCVD_IN_DYNABLOCK 0 1 0 1

That’ll give it a score of 1 instead of 2.5, which is probably more reasonable. (Ironically, most of the emails caught were from “Karsten M. Self”, a critic of TMDA, who posts directly from his dial-up machine!)
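
Why four numbers? As far as I understand the SpamAssassin configuration docs, a score line with four values sets one score per “score set”, depending on whether Bayes and network tests are enabled. The sketch below just splits the line up and labels each value. The function name and the labels are my own, not anything SpamAssassin ships.

```python
# Labels for SpamAssassin's four score sets, in the order they appear
# on a four-value "score" line (labels are illustrative wording).
SCORE_SETS = (
    "no Bayes, no network tests",
    "no Bayes, network tests",
    "Bayes, no network tests",
    "Bayes, network tests",
)

def parse_score_line(line):
    """Split a 'score RULE v1 [v2 v3 v4]' line into (rule, {set: score})."""
    parts = line.split()
    if len(parts) < 3 or parts[0] != "score":
        raise ValueError(f"not a score line: {line!r}")
    rule = parts[1]
    values = [float(v) for v in parts[2:]]
    if len(values) == 1:
        values = values * 4  # a single value applies to every score set
    if len(values) != 4:
        raise ValueError("expected one or four score values")
    return rule, dict(zip(SCORE_SETS, values))

rule, scores = parse_score_line("score RCVD_IN_DYNABLOCK 0 1 0 1")
for name, value in scores.items():
    print(f"{rule} [{name}]: {value}")
```

Read that way, the line above scores the rule at 1 whenever network tests are on and 0 when they are off, which makes sense for a test that needs a DNS blocklist lookup.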

New SpamAssassin Out!

Version 2.55 of SA is out. The release notes are a bit terse, but the notes for 2.54 indicate this is a release worth installing. It adjusts some rules that spammers were exploiting to get past SA. From the notes:

spammers have been targeting our nice rules to get themselves negative overall scores, so those rules are now much less strongly-scored. also added a “TOO_MANY_MUA” rule that will catch multiple user agent headers.

Go download it now!

SpamAssassin 2.51

Note to self, install this at work on Monday: SpamAssassin 2.51 (via Dangerous Meta)
Over the last few days a lot more spam has got through to my inbox, and it seems to have started after I installed version 2.50 of SA. This could be because I was still training the Bayesian filters. I also blacklisted *@artist-server.com as they were very persistent in spamming me. That helped, and lowering the threshold to 4.5 caught another 2-3 spams. Today was better. Perhaps the Bayesian filters are working now!
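
To show how the blacklist and the lower threshold work together, here is a rough sketch. It is not SpamAssassin’s own code: the function names are invented, and the 100-point penalty for a blacklisted sender is an assumption. The pattern and the 4.5 threshold are the ones mentioned above.

```python
from fnmatch import fnmatchcase

BLACKLIST = ["*@artist-server.com"]  # pattern from the post
THRESHOLD = 4.5                      # lowered threshold from the post
BLACKLIST_SCORE = 100.0              # assumed penalty for a blacklisted sender

def is_blacklisted(sender, patterns=BLACKLIST):
    """Return True if the sender matches any glob-style pattern."""
    sender = sender.lower()
    return any(fnmatchcase(sender, p.lower()) for p in patterns)

def is_spam(sender, rule_score, threshold=THRESHOLD):
    """Add the blacklist penalty, then compare the total to the threshold."""
    total = rule_score + (BLACKLIST_SCORE if is_blacklisted(sender) else 0.0)
    return total >= threshold

print(is_spam("promo@artist-server.com", 0.0))
print(is_spam("friend@example.org", 4.7))
print(is_spam("friend@example.org", 4.2))
```

The second call is the interesting one: a message scoring 4.7 would slip under a threshold of 5 but is caught at 4.5.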
On a related matter, I configured Goldmine to filter out spam, but it’s unusable. Goldmine has to create a new identity for each new email address, so it’s easier to delete the email “online” before it’s downloaded. (If you knew Goldmine you’d know what I mean, it sucks!)

SpamAssassin – web based Bayesian training?

The latest release of SpamAssassin has support for Bayesian analysis. You have to train it, and it gets better the more mail you feed it.
The only problem is that SpamAssassin uses a command line app, sa-learn, to learn about your mail. Who’ll volunteer to create a web-based form where you can copy and paste spam or legitimate mail to train it? An “upload file” button would be great too, for those mass-mail learning situations when you come into work in the morning.
It should be easy enough, although you’ll have to use su-exec or something to add rules to different users’ accounts. So, who’s up for it then?
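
To get the ball rolling, here is a minimal sketch of the back end such a form might call. It assumes a sa-learn that accepts the --spam and --ham options, and the function names are hypothetical. The web form itself and the run-as-another-user part are left out entirely.

```python
import os
import subprocess
import tempfile

def build_sa_learn_command(kind, path):
    """Build the sa-learn argument list for one message file."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", f"--{kind}", path]

def learn_from_text(kind, message_text):
    """Write pasted message text to a temp file and run sa-learn on it."""
    with tempfile.NamedTemporaryFile("w", suffix=".eml", delete=False) as tmp:
        tmp.write(message_text)
        path = tmp.name
    try:
        # Runs as whoever owns this process; per-user training would
        # need something extra (su-exec or similar), not shown here.
        result = subprocess.run(
            build_sa_learn_command(kind, path),
            capture_output=True, text=True, check=False,
        )
        return result.returncode, result.stdout.strip()
    finally:
        os.unlink(path)  # always remove the temporary copy

print(build_sa_learn_command("spam", "/tmp/pasted-message.eml"))
```

A form handler would pass the pasted text to learn_from_text("spam", ...) or learn_from_text("ham", ...) and show the user whatever sa-learn reports back.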