I've been using POPFile as my new spam filter. It applies Bayes' theorem to sort email into buckets, one per message type. I set up two buckets, mail and spam, and trained POPFile on 2,000 legitimate and 2,000 spam messages.
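To give a feel for how bucket-based Bayesian classification works, here is a toy sketch in Python. It is not POPFile's code (POPFile is written in Perl and is far more sophisticated); the class name `BucketClassifier`, its methods, and the whitespace tokenizer are all assumptions made up for illustration. It implements a plain multinomial naive Bayes classifier with add-one smoothing.

```python
import math
from collections import Counter, defaultdict


class BucketClassifier:
    """Toy naive Bayes classifier with named buckets (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.message_counts = Counter()          # bucket -> messages trained

    def train(self, bucket, text):
        """Add one message's words to the given bucket."""
        self.word_counts[bucket].update(text.lower().split())
        self.message_counts[bucket] += 1

    def classify(self, text):
        """Return the most probable bucket, or None if nothing was trained."""
        vocabulary = set()
        for counts in self.word_counts.values():
            vocabulary.update(counts)
        total_messages = sum(self.message_counts.values())

        best_bucket, best_score = None, -math.inf
        for bucket, counts in self.word_counts.items():
            # Log prior: share of training messages that fell in this bucket.
            score = math.log(self.message_counts[bucket] / total_messages)
            total_words = sum(counts.values())
            for word in text.lower().split():
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log(
                    (counts[word] + 1) / (total_words + len(vocabulary))
                )
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket


clf = BucketClassifier()
clf.train("mail", "meeting agenda for tomorrow")
clf.train("mail", "lunch on friday")
clf.train("spam", "cheap pills buy now")
clf.train("spam", "win money now")
print(clf.classify("buy cheap pills"))
```

Working in log space avoids multiplying many tiny probabilities together, which would underflow on long messages.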

POPFile is a Perl script that works as a POP3 proxy. It uses probabilities derived from the training set to classify each new message as mail or spam, then tags the message either by altering its subject line or by adding an X-Text-Classification header. I use the header method, since Mozilla (my mail client) can filter on mail headers.
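Mozilla does the header filtering for me, but the same check is easy to sketch by hand. The snippet below uses Python's standard `email` module to read the X-Text-Classification header from a raw message. The function name `bucket_of`, the "unclassified" fallback, and the sample message are my own assumptions for illustration, not anything POPFile or Mozilla provides.

```python
from email import message_from_string


def bucket_of(raw_message, default="unclassified"):
    """Return the bucket named in X-Text-Classification, or a fallback."""
    msg = message_from_string(raw_message)
    # Header lookup is case-insensitive; get() returns the default if absent.
    return msg.get("X-Text-Classification", default).strip()


# Hypothetical message as it might look after passing through the proxy.
sample = (
    "From: someone@example.com\n"
    "Subject: Cheap pills\n"
    "X-Text-Classification: spam\n"
    "\n"
    "Buy now!\n"
)
print(bucket_of(sample))
```

Filtering on a header rather than on an altered subject leaves the subject line untouched, so replies and threading are not affected.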

So far, out of about 250 emails, I've had four false positives and one false negative. I'd rather have it the other way around, but I collect each misclassified message and feed it back into the proper training set.
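For a rough sense of scale, the arithmetic behind those numbers can be sketched as below. The total of 250 is approximate, and the function name `error_rate` is just an illustrative assumption.

```python
def error_rate(false_positives, false_negatives, total):
    """Fraction of messages that landed in the wrong bucket."""
    return (false_positives + false_negatives) / total


rate = error_rate(false_positives=4, false_negatives=1, total=250)
print(f"misclassified: {rate:.1%}, correct: {1 - rate:.1%}")
```

This gives only the overall rate. Separate false-positive and false-negative rates would need the split between legitimate and spam messages, which I haven't tracked.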
