POP3 and Spam Training

I’m in the process of testing a Virtualmin Pro installation for the eventual migration of my existing server. I don’t have many “clients” - just some family and friends. Some of my users use IMAP, and so the spam training via Usermin works great. Log into Usermin, tell it what’s spam, and Bob’s your uncle.

However, about half of my users use POP3. On my current server, I use a program called Maia Mailguard (an older version, but the current version works much the same way) to do my spam/virus filtering. The one feature that really helps is that it keeps a copy of all email a box receives in a database: spam is stopped there, while ham is passed through with a copy kept. When the spam/ham in the DB reaches a certain point, it starts to send daily reminders that “Hey, you’ve got spam, you should come take a look and tell me if any of the spam I’ve kept is actually ham, and vice versa.” The user goes in, and with a few clicks, the mail is purged, the Bayesian filter is updated, and spam is reported back to SpamAssassin. Also, the spam that it’s storing doesn’t come out of the user’s mail quota because it never touches their box.

Now, I like Usermin for what it does, but one thing that I feel might become an issue for my POP3 users is that unless they go through their box using Usermin first, they’ll never have the opportunity to train their personal spam filter.

What are other users doing in this situation?

What would you want users to do in such a circumstance? By that, I mean, what does Mailguard provide that Usermin doesn’t? (I’m wholly unfamiliar with Mailguard, so you’ll have to fill me in.)

I would assume that a mail browsing interface, like Usermin, would be the intuitive way to flag messages. You want them to be able to select messages that are spam and report them as such, right? I’m not sure how else one would do it, without looking at the mailbox in such a tool.

It’s also possible to set up an address that folks can forward spam to and have it added to the Bayesian database. This requires some care to be sure non-spammy characteristics don’t get pulled in as being spammy: forwards from your users could begin to look very spammy if it isn’t handled correctly. But the fact is, whenever I’ve implemented such a system in the past, no one has used it. It’s just too unintuitive. Users are much more likely to use a web-based tool.
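To illustrate the "care" involved with a forwarding address, here is a minimal Python sketch that pulls the original message out of a forward before it goes anywhere near the Bayesian database, so the forwarding user's own headers are not learned as spammy. It assumes the user forwarded the spam as an attachment (a message/rfc822 part); the function name extract_forwarded is hypothetical, and an inline forward cannot be recovered this way.

```python
import email
from email import policy
from typing import Optional


def extract_forwarded(raw_bytes: bytes) -> Optional[bytes]:
    """Return the first attached original message, or None if there is none.

    Assumption: the spam was forwarded "as attachment", so the original
    arrives intact as a message/rfc822 part inside the forward.
    """
    outer = email.message_from_bytes(raw_bytes, policy=policy.default)
    for part in outer.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-item list
            # holding the embedded message object.
            inner = part.get_payload(0)
            return inner.as_bytes()
    # Inline forward or plain mail: nothing safe to train on.
    return None
```

The extracted bytes could then be written to a file and handed to SpamAssassin's trainer (for example with sa-learn --spam); anything that returns None is better skipped than learned.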

Here’s the process flow for how Maia Mailguard works:

  1. Mail received at server for user.
  2. Mail is sent to a forked amavisd-new daemon where it is scanned for viruses (ClamAV) and spam content (SpamAssassin).
  3. If a mail is found to have a virus, it is quarantined in a database, and not delivered to the user’s mailbox.
  4. If a mail is found to have spam content, the rating is compared against the minimum rating set by the user; if it meets that rating, the mail is quarantined in a database, and not delivered to the user’s mailbox.
  5. If a mail is found to be ham, the original message is delivered to the users mailbox, and a copy is kept in the database.
  6. Once the quarantine reaches a certain message count, an email is sent to the user advising them to check their quarantine - which is web based. The spam quarantine is examined, and anything found to be ham is immediately delivered to the mailbox. Otherwise, it’s cataloged, reported, and deleted. Also, while there, the user can report any ham that should have been classified as spam, and the database is purged of their mail.
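For anyone who finds it easier to read as code, the flow above can be sketched roughly as follows. This is only a toy model of the steps as I described them, not Maia Mailguard's actual code: the class and field names, the threshold of 5.0, and the reminder count of 3 are all made up for illustration, and the virus/spam verdicts are plain fields standing in for ClamAV and SpamAssassin.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    recipient: str
    subject: str
    has_virus: bool = False   # stand-in for a ClamAV verdict
    spam_score: float = 0.0   # stand-in for a SpamAssassin rating


@dataclass
class Triage:
    spam_threshold: float = 5.0  # hypothetical per-user minimum rating
    notify_at: int = 3           # hypothetical quarantine size for a reminder
    quarantine: List[Message] = field(default_factory=list)
    ham_copies: List[Message] = field(default_factory=list)
    mailbox: List[Message] = field(default_factory=list)

    def receive(self, msg: Message) -> str:
        # Step 3: viruses are quarantined and never delivered.
        if msg.has_virus:
            self.quarantine.append(msg)
            return "virus"
        # Step 4: spam at or above the user's rating is quarantined.
        if msg.spam_score >= self.spam_threshold:
            self.quarantine.append(msg)
            return "spam"
        # Step 5: ham is delivered, and a copy is kept for later review.
        self.mailbox.append(msg)
        self.ham_copies.append(msg)
        return "ham"

    def reminder_due(self) -> bool:
        # Step 6: nag the user once the quarantine is big enough.
        return len(self.quarantine) >= self.notify_at

    def release(self, msg: Message) -> None:
        # Step 6: quarantined mail confirmed as ham goes to the mailbox.
        self.quarantine.remove(msg)
        self.mailbox.append(msg)
```

The point of the model is that quarantined mail lives outside the mailbox list entirely, which is why it never counts against the user's quota and why a POP3 user can still review it.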

For me, I use IMAP, so the way Usermin works is great, with the possible exception that spam is using some of my mailbox quota. But today, for example, I had 4 messages come in that should have been caught as spam but were not. If I were using POP3, I would have no easy way using Usermin to classify those messages as spam so that next time I received a similar message it would be caught. I can verify that stuff that’s already been classified as spam continues to be so, and that’s about it.

I know you’re probably tired of hearing about this, but for at least one of my users, it’s something of a sizable issue. She has a very common name, and so gets a lot of automatically generated spam, as well as a lot of regular mail. In 3-4 days I’ve seen her collect over 200MB of mail, with perhaps 25-50MB of it being spam.

In the end, I love how Virtualmin is working for just about everything on the server. Thus far, with the possible exception of the Turba install script problem, everything that I’ve needed to do could be done quickly, and pretty close to the way I’m used to doing things. But what I have to decide is: do I try to disable the virus/spam filtering that Virtualmin does and integrate Maia Mailguard, or can I make things work with the stock software?

Thank you for listening to me ramble on. If you want a test account on my live server (the one I’m replacing with the Virtualmin server) so that you can see a slightly older version of Maia Mailguard running, I’d be happy to set you up with one. Just email me.