Bayes filtering, invented by...
Thomas Bayes, shorely.
Actually, this publication from Microsoft Research predates Paul Graham's "A Plan for Spam":
Jason Rennie published a paper on Bayesian filtering of e-mail in 1998, and has been working on his implementation of it (called ifile) since at least 1997.
Paul Graham himself mentions that there were others (specifically at MS) who worked on Bayesian filtering before he did. The problem wasn't the idea -- it was the implementation.
How sad that Microsoft was sitting on this idea for years, but couldn't develop a halfway decent spam filter for Outlook until Outlook 2003.
> The problem wasn't the idea -- it was the implementation.
Invented? No. Popularized? Yes, sure.
"Thomas Bayes, shorely."
It's always puzzled me why it's called Bayesian filtering when Paul Graham's article doesn't contain anything that even looks like an example of Bayes' theorem.
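For reference, a textbook application of Bayes' theorem to a single word would look roughly like the sketch below. This is only an illustration: the function name `posterior_spam` and all the numbers are made up, and it is not taken from Graham's article.

```python
def posterior_spam(prior_spam, p_word_given_spam, p_word_given_ham):
    """Bayes' theorem: P(spam | word) from a prior and two likelihoods."""
    # P(word) = P(word | spam) P(spam) + P(word | ham) P(ham)
    evidence = (p_word_given_spam * prior_spam
                + p_word_given_ham * (1.0 - prior_spam))
    return p_word_given_spam * prior_spam / evidence


# Hypothetical numbers: half of all mail is spam, and the word shows up
# in 20% of spam messages but only 1% of legitimate ones.
p = posterior_spam(prior_spam=0.5, p_word_given_spam=0.2, p_word_given_ham=0.01)
print(round(p, 3))
```

Seeing the word moves the estimate from the prior of 0.5 to a much higher posterior; feeding that posterior back in as the prior for the next word is the "update on new evidence" step.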
"bayesian" in this sense is just an adjective that means "uses subjective probabilities". any technique that builds off of the idea that you update the likelihood that a given statement is true (i.e. that a message is spam) based on new evidence can be said to fall under bayesianism.
Credit correctly goes to Paul Graham because after he published "A Plan for Spam" there was an explosion of open source efforts to control spam by statistical means.
Check citeseer -- there's a whole stack of papers that predated PG's work regarding the application of Bayesian classifiers to spam filtering.
Probably the main factors were the interesting-ness of Graham's writing and the timeliness of him releasing the paper - just about the time when spam became a major headache.
Don't filter spam, just legislate it away:
Naive Bayes for classifying text has been around for a while - I read about it in Tom Mitchell's Machine Learning textbook five years ago: http://www-2.cs.cmu.edu/afs/cs/project/theo-11/www/naive-bayes.html
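To give a feel for how little code the basic technique needs, here is a minimal sketch of a multinomial naive Bayes text classifier with add-one smoothing. It is not the code from the textbook or from any of the filters mentioned in this thread; the `train` and `classify` names and the toy messages are invented for illustration.

```python
import math
from collections import Counter


def train(labeled_docs):
    """Collect per-label word counts from (label, text) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training documents
    vocab = set()
    for label, text in labeled_docs:
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior for the class.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, made up for this example.
toy = [
    ("spam", "buy cheap pills now"),
    ("spam", "cheap pills cheap"),
    ("ham", "meeting notes attached"),
    ("ham", "lunch meeting tomorrow"),
]
model = train(toy)
print(classify("cheap pills", model))
print(classify("meeting tomorrow", model))
```

The "naive" part is the assumption that words are independent given the class, which is why the per-word log probabilities can simply be added up.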