Base Rates: A Cautionary Tale
The other day, I was reading a Wikipedia article related to a topic we had been discussing in one of my classes. One of the statements in the second section confused me, and after a bit of thought I was convinced that it was indeed a mistake. Looking at the history, I noticed that this mistake was the result of an edit that had been made the day before.
Naturally, I reverted the article to the previous version. Looking at the history again, I noticed that the mistake had come from someone with an IP address very similar to my own. A quick search revealed that this person was in Philadelphia.
I decided that I was about 60% sure that it was someone in my class. I immediately singled out one particular person with 30% confidence.
There are about 1.5 million people in Philadelphia. There are about 15 people in my class. Starting from those base rates, it would take a likelihood ratio of about 100,000 just to bring the hypothesis that it was someone in my class up to even odds, and a likelihood ratio of about 1.5 million to do the same for one particular person.
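For concreteness, here is a minimal sketch of that arithmetic in Python. It assumes a uniform prior over everyone in Philadelphia, which is the same simplification the numbers above rely on; the function name `required_likelihood_ratio` is made up for this illustration.

```python
def required_likelihood_ratio(prior, target):
    """Likelihood ratio needed to move a prior probability to a target posterior.

    Uses the odds form of Bayes' rule: posterior_odds = likelihood_ratio * prior_odds.
    """
    prior_odds = prior / (1 - prior)
    target_odds = target / (1 - target)
    return target_odds / prior_odds


# Assumption: every Philadelphia resident is equally likely a priori
# to have made the edit.
population = 1_500_000
class_size = 15

prior_class = class_size / population   # 1 in 100,000
prior_person = 1 / population           # 1 in 1.5 million

# Evidence needed just to reach even odds (50%).
lr_class_even = required_likelihood_ratio(prior_class, 0.5)
lr_person_even = required_likelihood_ratio(prior_person, 0.5)

# Evidence needed to justify the confidences I actually stated.
lr_class_60 = required_likelihood_ratio(prior_class, 0.6)
lr_person_30 = required_likelihood_ratio(prior_person, 0.3)

print(f"class, even odds:  {lr_class_even:,.0f}")
print(f"class, 60%:        {lr_class_60:,.0f}")
print(f"person, even odds: {lr_person_even:,.0f}")
print(f"person, 30%:       {lr_person_30:,.0f}")
```

Under these assumptions, a similar IP address would have to be roughly 150,000 times more likely if a classmate made the edit than if some other Philadelphian did in order to support a 60% posterior.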
In class the next day, when I asked if anyone had edited Wikipedia recently, they all said no.
And that’s how I lost 1.3 bits from my Bayes score.
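The 1.3 figure can be reproduced with a short sketch, assuming that "Bayes score" means the logarithmic scoring rule measured in bits and that only the 60% claim about my class is being scored. The helper name `log_score_bits` is hypothetical.

```python
import math


def log_score_bits(probability_assigned_to_truth):
    """Log score in bits: log2 of the probability given to what actually happened."""
    return math.log2(probability_assigned_to_truth)


# I said 60% that the editor was in my class, so I gave 40% to what turned out true.
stated_confidence = 0.60
score_stated = log_score_bits(1 - stated_confidence)

# Assumption: the base-rate answer would have been about 15 in 1.5 million.
base_rate = 15 / 1_500_000
score_base_rate = log_score_bits(1 - base_rate)

# Bits lost relative to simply sticking with the base rate.
bits_lost = score_base_rate - score_stated

print(f"score at 60% confidence: {score_stated:.2f} bits")
print(f"score at the base rate:  {score_base_rate:.5f} bits")
print(f"bits lost:               {bits_lost:.2f}")
```

Sticking with the base rate would have cost almost nothing, so nearly all of the loss comes from the 60% guess.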