Why the basics of probability are not so hard
This is not a puzzle or a fallacy in the strict sense of the word, but
there is something unusual in the fact that so many people find it so
difficult to understand the basics of probability.
In fact, I am going to try to convince you now that it is all so
straightforward that you probably already know the basics.
What we are interested in is measuring uncertainty about unknown events. We have already discussed the difference between the frequentist and subjective approaches to this, but were careful in that discussion not to mention the word probability.
If you follow the lesson presented below you will find out exactly what
probability means. What you will learn is that, informally, probability
is a measure (between 0 and 1)
of the uncertainty of an event, but that this informal definition does
not satisfy mathematicians because it is too vague. Instead, we need to
describe some properties (called axioms) that any reasonable measure of
uncertainty should satisfy; then we define probability as any
measure that satisfies those properties. The nice thing about the way
the mathematicians define it is that not only does it avoid the problem
of vagueness, but it also means that we can have more than one measure
of probability. In particular both the frequentist and subjective
approaches satisfy the axioms and hence both are valid ways of
defining probability.
First let us recap. We have looked at statements like the following:
- "There is a 50% chance of tossing a Head on a fair coin"
- "There is a 0.0000001% chance of Martians landing on earth this year"
In fact, it turns out that whether we are using the frequentist approach or the subjective approach, the following is true:
|Point 1: Statements expressing our uncertainty
about some event can always be expressed as a percentage chance that
the event will happen.
Sometimes this important point can be missed because bookies use what
appears to be a very different way of measuring uncertainty, with
statements like:
- "The odds against Spurs winning the FA Cup this year are 4 to 1"
In fact, when bookies express their uncertainty in this way what they mean is that it is 4 times
more likely that the event will not happen than that it will happen. But
that is the same as saying there is a "1 in 5 chance" it will happen, or equivalently a "20% chance". Note
that the bookies would express the chance of tossing a Head on a fair
coin as "evens" which means 1 to 1, i.e. the chance of it not happening
is the same as the chance of it happening.
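The conversion from bookies' odds to a chance can be sketched in a few lines of Python. The function name `odds_against_to_probability` is a hypothetical helper introduced here purely for illustration; the only thing assumed is the reading of "4 to 1 against" given above.

```python
from fractions import Fraction


def odds_against_to_probability(against, in_favour):
    """Convert odds of 'against to in_favour' into the chance the event happens.

    Hypothetical helper: odds against of 4 to 1 mean the event fails
    4 times for every 1 time it happens, i.e. 1 chance in 4 + 1 = 5.
    """
    return Fraction(in_favour, against + in_favour)


spurs = odds_against_to_probability(4, 1)  # "4 to 1 against"
evens = odds_against_to_probability(1, 1)  # "evens", i.e. 1 to 1

print(spurs, "which is", spurs * 100, "percent")
print(evens, "which is", evens * 100, "percent")
```

Exact fractions are used rather than floating point so that "1 in 5" stays exactly 1/5.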
Thinking of the frequentist view of probability (the percentage of
times the event happens), the following point should also be clear:
|Point 2: The percentage that expresses our uncertainty about an event can never be more than 100.
But even for the subjective view this seems reasonable. Why? Because the most certain we can ever be about an event happening is 100%. And events that have a 100% chance of happening are called certain
events. For example, the chance of rolling a number between 1 and 6 on
a normal (6-sided) die is 100% because we are actually certain it will
happen and we cannot be more certain than that. It might not be certain
that it will rain in Manchester on at least one day next year, but it
is surely extremely close to 100%; what it cannot be is any more than 100%.
When people say things like "I am 110% sure of this" that is nothing more
than a figure of speech. Indeed it is no more meaningful than a football
manager who says he expects 110% effort from each of his players. This
does not mean that all statements containing percentages over 100% are
meaningless. It is correct to say that "our profits increased by 200%
this year" if the profits this year were three times what they were the
previous year. But statements like that are not quantifying uncertainty
about an event - they are making a comparison between two figures.
In much the same way that the uncertainty about an event can never be more than 100%, we also have:
|Point 3: The percentage that expresses our uncertainty about an event can never be less than 0.
Again this is clear for the frequentist view, but it is also reasonable
for the subjective view. Why? Because the least certain we
can ever be about an event is 0%; such events, like rolling the number
7 on a normal six-sided die, are certain not to happen, so the chance
they happen is 0%. The chance of English-speaking Martians landing on earth next year might not be 0% but must be very close to 0%; what it cannot be is any less than 0%.
So, having discovered that the uncertainty about any event can be
expressed as a percentage between 0 and 100 we can now make the
following simple observation.
|Point 4: If you divide the percentage uncertainty of an event by 100 you must end up with a number that lies between 0 and 1. The number you end up with is called the probability of the event.
So, for example, the probability of tossing a Head on a fair
coin is 0.5 and the probability of Spurs winning the cup next year
(according to the bookies mentioned above) is 0.2. The probability of
rolling a number between one and six on a normal die is 1, whereas the
probability of rolling a seven is 0.
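Points 2, 3 and 4 together amount to a very small calculation, sketched below. The helper name `percent_to_probability` is an assumption made for this illustration and is not part of the original discussion.

```python
def percent_to_probability(percent):
    """Turn a percentage chance into a probability (Point 4).

    Hypothetical helper: it rejects anything above 100 (Point 2)
    or below 0 (Point 3), since no chance can lie outside that range.
    """
    if not 0 <= percent <= 100:
        raise ValueError("a percentage chance must lie between 0 and 100")
    return percent / 100


print(percent_to_probability(50))   # Head on a fair coin
print(percent_to_probability(20))   # Spurs winning the cup, per the bookies
print(percent_to_probability(100))  # rolling a number between 1 and 6
print(percent_to_probability(0))    # rolling a 7 on a six-sided die
```

Calling `percent_to_probability(110)` raises an error, which is the code's way of saying that "110% sure" is not a measure of uncertainty.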
If you have followed the discussion so far, then you are already well
on the way to understanding what mathematicians refer to as the axioms
of probability. In fact the first axiom of probability is the
following statement that is essentially Point 4 above:
|Probability Axiom 1: The probability of
any event is a number between 0 and 1
(mathematicians write this as: "0 <= P(E) <= 1" where P(E) denotes the
probability of an event E)
It is worth pointing out that when mathematicians can prove something
they call it a theorem. If they cannot prove it, but need to assume it,
they call it an 'axiom'. We know that there are different ways of
defining probability (frequentist and subjective) and that in both
approaches you might end up with different numbers for the same event.
The nice thing about an axiom like axiom 1 is that what it really means
is that no matter how you define the probability this statement should
be true. It is 'true' for existing measures and needs to be true for other
measures of probability that have not yet been invented (otherwise we
will not call such measures probability measures).
The second axiom of probability is the following statement that follows directly from our discussion surrounding Point 2 above:
|Probability Axiom 2: The probability of
any certain event is 1
(mathematicians write this as: "P(E) = 1 where an event E is certain")
The only core concept left to explain is what happens with probabilities of more than one event.
If you roll a normal die once then the following are two uncertain events that can happen:
- You roll the number 1
- You roll the number 6
There are many other uncertain events that can happen of course (such
as "rolling the number 2", "rolling the number 3" etc, or "rolling a
number that is bigger than 2" etc). What we can say about the two
particular events listed above is that they cannot both happen. At most one of these events can happen. When two events cannot both happen we call them mutually exclusive. Compare the above two events with the following two:
- You roll a number more than 1
- You roll a number less than 4
These two events are not
mutually exclusive because it is possible for both of them to happen as
a result of one roll of the die - if you roll either a 2 or a 3 then
both of the events happen.
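One way to make this concrete is to model each event as the set of die faces for which it happens; two events are then mutually exclusive exactly when their sets share no face. This is a minimal sketch, and the variable and function names are chosen here for illustration only.

```python
# Each event is modelled as the set of die faces for which it happens.
roll_1 = {1}
roll_6 = {6}
more_than_1 = {face for face in range(1, 7) if face > 1}
less_than_4 = {face for face in range(1, 7) if face < 4}


def mutually_exclusive(event_a, event_b):
    """Two events are mutually exclusive when no single roll makes both happen."""
    return event_a.isdisjoint(event_b)


print(mutually_exclusive(roll_1, roll_6))            # True
print(mutually_exclusive(more_than_1, less_than_4))  # False
print(sorted(more_than_1 & less_than_4))             # [2, 3]
```

The last line lists the rolls on which both of the second pair of events happen.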
A special, but very important, case of two mutually exclusive events is where one event is the 'negation' (also referred to as the complement) of the other. For example:
- the negation of the event "roll the number 1" is the event "do not roll the number 1", or more explicitly "roll any number other than 1"
- the negation of the event "Spurs win the FA Cup next year" is "Spurs do not win the FA Cup next year"
- the negation of the event "toss Head" is "do not toss Head", or explicitly "toss Tail".
The nice thing about any two mutually exclusive events is that it is very easy to answer the following question:
What is the probability of either one of the events happening?
For example, what is the probability of rolling either a 1 or a 6 on a fair die?
Using the frequentist approach it is clear that the probability of
rolling a one or a six is 1/3, because the one should occur with a
frequency of 1 in 6 and the six should occur with a frequency of 1 in
6, so one or the other should occur with a frequency of 2 in 6. So it
turns out that the probability of the combined event is
the sum of the probabilities of the individual events. This will always
be the case for mutually exclusive events using the frequentist
approach. It therefore seems reasonable to expect this of any measure
of probability. Hence we have the third and final axiom of probability:
|Probability Axiom 3: For any two mutually exclusive events the probability of either event happening is the sum of the probabilities of the individual events.
(mathematicians write this as: "P(E1 or E2) = P(E1) + P(E2) where E1 and E2 are mutually exclusive events")
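The die example can be checked with exact fractions. The sketch below assumes a fair die in which every face is equally likely, and the `probability` function is a hypothetical helper that simply counts faces; it also shows why the axiom is restricted to mutually exclusive events.

```python
from fractions import Fraction

FACES = frozenset(range(1, 7))  # the six faces of a normal die


def probability(event):
    """Chance of an event (a set of faces), assuming all faces are equally likely."""
    return Fraction(len(event & FACES), len(FACES))


# Mutually exclusive events: the probabilities add up (1/6 + 1/6 = 1/3).
roll_1, roll_6 = {1}, {6}
assert probability(roll_1 | roll_6) == probability(roll_1) + probability(roll_6)
print(probability(roll_1 | roll_6))  # 1/3

# Not mutually exclusive: simple addition counts the faces 2 and 3 twice.
more_than_1, less_than_4 = {2, 3, 4, 5, 6}, {1, 2, 3}
print(probability(more_than_1 | less_than_4))               # 1
print(probability(more_than_1) + probability(less_than_4))  # 4/3
```

The final sum comes out above 1, which Axiom 1 forbids, so adding probabilities only makes sense when the events cannot both happen.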
As far as mathematicians are concerned they need never worry about trying to
explain probability in terms of frequentist or subjective views. All
they do is define probability as any measure P of uncertainty that
satisfies axioms 1, 2 and 3.
So, although most people think
mathematics is hard, the mathematicians actually have it easy because
they can avoid all the discussions here and just look at the three axioms.
And we can all benefit from the
mathematical approach because what mathematicians do once they have a
set of axioms is they start to derive theorems that follow
logically from the axioms. One great example of such a theorem
tells us how to calculate the probability of the complement
of an event. For example, as we saw above, the complement of the
event "roll the number 1 on a die" is the event "roll a number not
equal to 1 on a die". In this case it is easy to see that the
probability of the complement is 5/6. That means the probability of the
complement is one minus the probability of the event. In fact, this is
true of any event:
|A probability theorem: The probability of the complement of an event is equal to one minus the probability of the event.
(mathematicians write this as: "P(not E) = 1 - P(E)")
This theorem can be proved from the axioms as follows:
1. Any event E and its complement (not E) are mutually exclusive.
2. From step 1 and axiom 3, the probability of the combined event "E
or not E" is equal to the probability of E plus the probability of not E.
3. But the event "E or not E" is certain, therefore its probability (from axiom 2) is equal to one.
4. From 2 and 3 we conclude that the probability of E plus the probability of not E is equal to one.
5. From 4 we conclude that the probability of not E is equal to one minus the probability of E.
Mathematicians would write the above proof more concisely as:
1. For all E, events E and not E are mutually exclusive
2. P(E or not E) = P(E) + P(not E) (from 1 and axiom 3)
3. P(E or not E) = 1 (axiom 2)
4. 1 = P(E) + P(not E) (from 2 and 3)
5. P(not E) = 1 - P(E) (from 4)
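The proof covers every probability measure, but it can be reassuring to see the theorem hold in a concrete case. The sketch below assumes a fair six-sided die with equally likely faces and checks the complement rule for every possible event; the names `probability` and `all_events` are hypothetical helpers written for this check.

```python
from fractions import Fraction
from itertools import combinations

FACES = frozenset(range(1, 7))  # the six faces of a normal die


def probability(event):
    """Chance of an event (a set of faces), assuming all faces are equally likely."""
    return Fraction(len(event), len(FACES))


def all_events():
    """Yield every subset of the faces, from the impossible event to the certain one."""
    for size in range(len(FACES) + 1):
        for faces in combinations(sorted(FACES), size):
            yield frozenset(faces)


checked = 0
for event in all_events():
    complement = FACES - event  # "not E": every face on which E does not happen
    assert probability(complement) == 1 - probability(event)
    checked += 1

print(checked)  # 64 events, one for each subset of the six faces
```

A check like this is only an illustration for one particular measure; it is the five-step argument above that establishes the result in general.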
Making Sense of Probability: Fallacies, Myths and Puzzles