Wednesday, October 21, 2009

SpamBayes performance on Outlook 2007

I installed SpamBayes for Outlook 2007 eight days ago and have been using it ever since. I can understand why users are reluctant to install this software: it's not trivial to understand how it works, or exactly why it would be better than the spam filter built into Outlook.
Before I installed SpamBayes, Outlook was catching about half of the spam emails that I received and putting about 5 or 6 legitimate emails into the Junk Mail folder. When SpamBayes first took over it got a few wrong and put a number of emails into the Junk Suspects folder, but once it had finished training, after about 3 or 4 days, it was getting almost everything right. Since then it has averaged about 1 email a day wrong or unsure, and the gap between those is getting longer.
If you're an Outlook user and you haven't installed SpamBayes yet, I highly recommend that you do so. It's free, it's available from SourceForge, and it will repay your time investment within the first week.
The key difference between SpamBayes and Outlook's default spam filter is that SpamBayes learns what spam means to you without you having to blacklist subject lines or sender addresses. You don't have to set up rules. For example, I send myself legit emails about things I want in my inbox, so I won't block my own email address. Spammers know this, and that's why I get a lot of spam that purports to be from myself: they spoof my address. SpamBayes learns that it should place zero significance on who the email is from and instead focuses on other markers in the email to learn what spam and ham (good email) mean to you.
If, for example, you work in the porn (or anti-porn) industry, then emails containing the word "porn" may be legitimate emails that you want and need to read. They may be from your co-workers. However, there may also be plenty of spam about porn that you don't want to read. SpamBayes takes care of that for you: it learns that, for your email, the word "porn" is useless for distinguishing spam from ham, so it automatically ignores this word and focuses on other words (and markers) in your email to make the distinction. In my personal email, however, it would probably treat the word "porn" as a red flag that the email is spam.
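
To make the idea concrete, here is a rough Python sketch of how a filter can learn that a word is useless in one mailbox and a red flag in another. It is my own simplified illustration, not SpamBayes's actual code: the function names (train, token_spam_probability) and the tiny sample mailboxes are made up, and the real program tokenizes and combines its evidence in a more sophisticated way.

```python
from collections import Counter


def train(messages):
    """Count, for each word, how many spam and ham messages contain it.

    `messages` is a list of (text, is_spam) pairs that the user has classified.
    """
    spam_counts, ham_counts = Counter(), Counter()
    n_spam = n_ham = 0
    for text, is_spam in messages:
        tokens = set(text.lower().split())  # count each word once per message
        if is_spam:
            spam_counts.update(tokens)
            n_spam += 1
        else:
            ham_counts.update(tokens)
            n_ham += 1
    return spam_counts, ham_counts, n_spam, n_ham


def token_spam_probability(token, model):
    """Estimate how strongly one word points to spam (1.0) or ham (0.0).

    A value near 0.5 means the word carries no useful signal.
    """
    spam_counts, ham_counts, n_spam, n_ham = model
    spam_rate = spam_counts[token] / max(n_spam, 1)
    ham_rate = ham_counts[token] / max(n_ham, 1)
    if spam_rate + ham_rate == 0:
        return 0.5  # never seen before: treat as neutral
    return spam_rate / (spam_rate + ham_rate)


# Hypothetical mailbox of someone who works in the industry:
# "porn" shows up in both wanted and unwanted mail.
work_model = train([
    ("porn industry report attached", False),
    ("porn newsletter meeting friday", False),
    ("porn pills cheap buy now", True),
    ("porn casino cheap offer", True),
])

# Hypothetical personal mailbox: "porn" only ever shows up in spam.
personal_model = train([
    ("lunch meeting friday", False),
    ("holiday photos attached", False),
    ("porn pills cheap buy now", True),
    ("porn casino cheap offer", True),
])

work_score = token_spam_probability("porn", work_model)
personal_score = token_spam_probability("porn", personal_model)
```

In the first mailbox the word lands exactly in the middle, so it tells the filter nothing and words like "cheap" have to do the work. In the second mailbox the same word scores as a strong spam signal. A spoofed sender address that appears in both your spam and your ham ends up in the middle in the same way.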
