Base-rate neglect solution

 

 

The correct answer is about 2%. An informal way of explaining this result is to think of a population of 10,000 people. With a disease probability of 0.001, we would expect just 10 people in this population to have the disease. If you test everybody in the population, the 5% false positive rate means that, in addition to the 10 people who do have the disease, about 500 of the 9,990 healthy people will be wrongly diagnosed as having it. In other words, only 10 of the roughly 510 people diagnosed positive, about 2%, actually have the disease. When people give a high answer like 95% they are ignoring the very low probability (i.e. rarity) of having the disease. In comparison, the probability of a false positive test is relatively high.
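The counting argument above can be reproduced in a few lines of Python. This is only an illustrative sketch: the variable names are made up for this example, and it assumes, as the informal explanation does, that everyone who has the disease tests positive.

```python
# Counting argument for a population of 10,000 people.
population = 10_000
prevalence = 0.001           # probability of having the disease
false_positive_rate = 0.05   # probability of a positive test when healthy
sensitivity = 1.0            # assumption: every diseased person tests positive

diseased = population * prevalence                # about 10 people
healthy = population - diseased                   # about 9,990 people
true_positives = diseased * sensitivity           # diseased people who test positive
false_positives = healthy * false_positive_rate   # healthy people who test positive

# Share of positive tests that belong to people who really have the disease.
share = true_positives / (true_positives + false_positives)
print(f"{share:.1%}")
```

Raising the prevalence while keeping the false positive rate fixed makes the share climb quickly, which is why the rarity of the disease dominates the result.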

A formal Bayesian explanation is as follows:

Let A be the event 'person has the disease'.

Let B be the event 'positive test'.

We wish to calculate p(A|B).

First we note that:

 

p(A|B) = 1 - p(NOT A|B)

 

In fact it is easier to calculate p(NOT A|B). By Bayes' theorem this is:

 

p(NOT A|B) = p(B|NOT A) p(NOT A) / [ p(B|NOT A) p(NOT A) + p(B|A) p(A) ]

 

Now, we know the following:

p(A)=0.001

p(NOT A)=0.999

p(B|NOT A)=0.05

p(B|A)=1

Hence:

 

p(NOT A|B) = (0.05 × 0.999) / (0.05 × 0.999 + 1 × 0.001) = 0.04995 / 0.05095 ≈ 0.98

 

Hence p(A|B) = 1 - p(NOT A|B) is approximately 0.02, i.e. about 2%.
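As a check on the formal calculation, the same numbers can be put through Bayes' theorem directly. The sketch below is illustrative only; the function name `posterior` and its parameter names are hypothetical and chosen for this example.

```python
def posterior(p_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Return p(A|B) computed with Bayes' theorem.

    p_a             -- p(A), the prior probability of the disease
    p_b_given_a     -- p(B|A), probability of a positive test given disease
    p_b_given_not_a -- p(B|NOT A), the false positive probability
    """
    p_not_a = 1.0 - p_a
    # Total probability of a positive test, p(B).
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a
    return p_b_given_a * p_a / p_b


p_a_given_b = posterior(p_a=0.001, p_b_given_a=1.0, p_b_given_not_a=0.05)
p_not_a_given_b = 1.0 - p_a_given_b

print(f"p(A|B)     = {p_a_given_b:.4f}")
print(f"p(NOT A|B) = {p_not_a_given_b:.4f}")
```

Computing p(A|B) directly, rather than via p(NOT A|B), gives the same answer, since the two always sum to 1.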