Why the basics of probability are not so hard

OK, this is not a puzzle or a fallacy in the strict sense of the word, but there is something unusual in the fact that so many people find it so difficult to understand the basics of probability.

In fact, I am going to try to convince you now that it is all so straightforward that you probably already know the basics.

What we are interested in is measuring uncertainty about unknown events. We have already discussed the difference between the frequentist and subjective approaches to this, but were careful in that discussion not to mention the word probability. If you follow the lesson presented below you will find out exactly what probability means. What you will learn is that, informally, probability is a measure (between 0 and 1) of the uncertainty of an event, but that this informal definition does not satisfy mathematicians because it is too vague. Instead, we need to describe some properties (called axioms) that any reasonable measure of uncertainty should satisfy; then we define probability as any measure that satisfies those properties. The nice thing about the way the mathematicians define it is that not only does it avoid the problem of vagueness, but it also means that we can have more than one measure of probability. In particular, both the frequentist and subjective approaches satisfy the axioms and hence both are valid ways of defining probability.

First let us recap. We have looked at statements like the following 

In fact, it turns out that whether we are using the frequentist approach or the subjective approach, the following is true:


Point 1: Statements expressing our uncertainty about some event can always be expressed as a percentage chance that the event will happen. 


Sometimes this important point can be missed because bookies use what appears to be a very different way of measuring uncertainty, with statements like:

In fact, when bookies express their uncertainty in this way (odds of 4 to 1 against an event) what they mean is that it is 4 times more likely that the event will not happen than that it will happen. But that is the same as saying there is a "1 in 5 chance" it will happen, or equivalently a "20% chance". Note that the bookies would express the chance of tossing a Head on a fair coin as "evens", which means 1 to 1, i.e. the chance of it not happening is the same as the chance of it happening.
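The conversion from bookies' odds to a percentage chance can be sketched in a few lines of Python. This is only an illustration: the function name odds_against_to_probability is made up for this example, and it assumes the odds are quoted as "so many to so many against" the event.

```python
from fractions import Fraction

def odds_against_to_probability(against: int, in_favour: int) -> Fraction:
    """Convert odds of 'against to in_favour' into a chance of happening.

    Hypothetical helper for illustration only.
    """
    # Odds of 4 to 1 against mean 4 parts "does not happen" for every
    # 1 part "happens", so the event takes 1 part out of 4 + 1 = 5.
    return Fraction(in_favour, against + in_favour)

four_to_one = odds_against_to_probability(4, 1)
evens = odds_against_to_probability(1, 1)

print(four_to_one, four_to_one * 100)  # 1/5 20  (a 20% chance)
print(evens, evens * 100)              # 1/2 50  (a 50% chance)
```

Exact fractions are used rather than decimals so that "1 in 5" stays exactly 1/5.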

Now, thinking of the frequentist view of probability (the percentage of times the event happens) the following point should also be clear:

Point 2: The percentage that expresses our uncertainty about an event can never be more than 100.


But even for the subjective view this seems reasonable. Why? Because the most certain we can ever be about an event happening is 100%. Events that have a 100% chance of happening are called certain events. For example, the chance of rolling a number between 1 and 6 on a normal (6-sided) die is 100% because we are actually certain it will happen and we cannot be more certain than that. It might not be certain that it will rain in Manchester on at least one day next year, but it is surely extremely close to 100%; what it cannot be is any more than 100%.

Although people say things like "I am 110% sure of this", that is nothing more than a figure of speech. Indeed it is no more meaningful than a football manager who says he expects 110% effort from each of his players. This does not mean that all statements containing percentages over 100% are meaningless. It is correct to say that "our profits increased by 200% this year" if the profits this year were three times what they were the previous year (an increase of 100% would mean they had doubled). But statements like that are not quantifying uncertainty about an event - they are making a comparison between two figures.

In much the same way that the uncertainty about an event can never be more than 100%, we also have:


Point 3: The percentage that expresses our uncertainty about an event can never be less than 0.

Again this is clear for the frequentist view, but it is also reasonable for the subjective view. Why? Because the least certain we can ever be about an event is 0%; such events, like rolling the number 7 on a normal six-sided die, are certain not to happen, so the chance they happen is 0%. The chance of English-speaking Martians landing on earth next year might not be 0% but must be very close to 0%; what it cannot be is any less than 0%.

So, having discovered that the uncertainty about any event can be expressed as a percentage between 0 and 100 we can now make the following simple observation.

Point 4: If you divide the percentage uncertainty of an event by 100 you must end up with a number that lies between 0 and 1. The number you end up with is called the probability of the event. 

So, for example, the probability of tossing a Head on a fair coin is 0.5 and the probability of Spurs winning the cup next year (according to the bookies mentioned above) is 0.2. The probability of rolling a number between one and six on a normal die is 1, whereas the probability of rolling a seven is 0.
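Points 2, 3 and 4 together amount to a very small calculation, sketched here in Python. The function name percentage_to_probability is invented for this illustration; it simply refuses any percentage outside the range 0 to 100 and divides the rest by 100.

```python
def percentage_to_probability(percent: float) -> float:
    """Turn a percentage chance into a probability between 0 and 1.

    Hypothetical helper for illustration only.
    """
    # Points 2 and 3: a percentage expressing uncertainty must lie
    # between 0 and 100, so "110% sure" is rejected.
    if not 0 <= percent <= 100:
        raise ValueError(f"{percent}% is not a valid measure of uncertainty")
    # Point 4: dividing by 100 gives the probability.
    return percent / 100

print(percentage_to_probability(50))   # 0.5  (Head on a fair coin)
print(percentage_to_probability(20))   # 0.2  (a "1 in 5 chance")
print(percentage_to_probability(100))  # 1.0  (a certain event)
print(percentage_to_probability(0))    # 0.0  (rolling a seven)
```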


If you have followed the discussion so far, then you are already well on the way to understanding what mathematicians refer to as the axioms of probability. In fact the first axiom of probability is the following statement, which is essentially Point 4 above:

 
Probability Axiom 1: The probability of any event is a number between 0 and 1

(mathematicians write this as: "0 <= P(E) <= 1" where P(E) denotes the probability of an event E)

It is worth pointing out that when mathematicians can prove something they call it a theorem. If they cannot prove it, but need to assume it, they call it an 'axiom'. We know that there are different ways of defining probability (frequentist and subjective) and that the two approaches might give you different numbers for the same event. The nice thing about an axiom like axiom 1 is that what it really means is that no matter how you define the probability this statement should be true. It is 'true' for existing measures and needs to be true for other measures of probability that have not yet been invented (otherwise we will not call such measures probability measures).

The second axiom of probability is the following statement that follows directly from our discussion surrounding Point 2 above:
 
Probability Axiom 2: The probability of any certain event is 1

 (mathematicians write this as: "P(E) = 1" where E is a certain event)

The only core concept left to explain is what happens with probabilities of more than one event.

If you roll a normal die once then the following are two uncertain events that can happen:
  1. You roll the number 1
  2. You roll the number 6
There are many other uncertain events that can happen of course (such as "rolling the number 2", "rolling the number 3" etc, or "rolling a number that is bigger than 2" etc). What we can say about the two particular events listed above is that they cannot both happen. At most one of these events can happen. When two events cannot both happen we call them mutually exclusive. Compare the above two events with the following two:

  1. You roll a number more than 1
  2. You roll a number less than 4

These two events are not mutually exclusive because it is possible for both of them to happen as a result of one roll of the die - if you roll either a 2 or a 3 then both of the events happen.
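One way to make this concrete is to treat each event as the set of die outcomes for which it happens; two events are then mutually exclusive exactly when their sets have no outcome in common. The following Python sketch does this for the four events above (the variable and function names are chosen for this illustration only).

```python
# The six possible outcomes of one roll of a normal die.
DIE = {1, 2, 3, 4, 5, 6}

def mutually_exclusive(event_a: set, event_b: set) -> bool:
    """Two events cannot both happen if they share no outcome."""
    return event_a.isdisjoint(event_b)

roll_one = {1}
roll_six = {6}
more_than_one = {n for n in DIE if n > 1}   # {2, 3, 4, 5, 6}
less_than_four = {n for n in DIE if n < 4}  # {1, 2, 3}

print(mutually_exclusive(roll_one, roll_six))             # True
print(mutually_exclusive(more_than_one, less_than_four))  # False
# The outcomes for which both of the second pair of events happen:
print(sorted(more_than_one & less_than_four))             # [2, 3]
```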

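A special, but very important, case of two mutually exclusive events is where one event is the 'negation' (also referred to as the complement) of the other. For example, the event "roll the number 1 on a die" and the event "roll a number not equal to 1 on a die" are complements of each other.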

The nice thing about any two mutually exclusive events is that it is very easy to answer the following question:

What is the probability of either one of the events happening?

For example, what is the probability of rolling either a 1 or a 6 on a fair die?

Using the frequentist approach it is clear that the probability of rolling a one or a six is 1/3, because the one should occur with a frequency of 1 in 6 and the six should occur with a frequency of 1 in 6, so between them they should occur with a frequency of 2 in 6. So it turns out that the probability of the combined event is the sum of the probabilities of the individual events. This will always be the case for mutually exclusive events using the frequentist approach. It therefore seems reasonable to expect this of any measure of probability. Hence we have the third and final axiom of probability:


 
Probability Axiom 3: For any two mutually exclusive events the probability of either event happening is the sum of the probabilities of the individual events.

 (mathematicians write this as: "P(E1 or E2) = P(E1) + P(E2)" where E1 and E2 are mutually exclusive events)
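The frequentist reasoning behind this axiom can be tried out with a small simulation. The sketch below is an illustration, not a proof: it assumes that Python's pseudo-random number generator is a fair stand-in for a real die, and the number of rolls (60,000) and the seed are arbitrary choices.

```python
import random

def count_occurrences(event: set, rolls: list) -> int:
    """Count how many of the rolls land on an outcome in the event."""
    return sum(1 for roll in rolls if roll in event)

rng = random.Random(0)  # fixed seed so the run is repeatable
rolls = [rng.randint(1, 6) for _ in range(60_000)]

count_one = count_occurrences({1}, rolls)
count_six = count_occurrences({6}, rolls)
count_either = count_occurrences({1, 6}, rolls)

# Because a single roll cannot be both a 1 and a 6, the counts add exactly.
print(count_either == count_one + count_six)
# The relative frequency of "1 or 6" should be close to 1/3.
print(count_either / len(rolls))
```

The counts always add up exactly, whatever the rolls turn out to be, which is why the frequentist measure satisfies the axiom; the relative frequency itself only approaches 1/3 as the number of rolls grows.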

As far as mathematicians are concerned they need never worry about trying to explain probability in terms of frequentist or subjective views. All they do is define probability as any measure P of uncertainty that satisfies axioms 1, 2 and 3.


So, although most people think mathematics is hard, the mathematicians actually have it easy because they can avoid all the discussions here and just look at the three axioms.

And we can all benefit from the mathematical approach because what mathematicians do once they have a set of axioms is start to derive theorems that follow logically from the axioms. One great example of such a theorem tells us how to calculate the probability of the complement of an event. For example, the complement of the event "roll the number 1 on a die" is the event "roll a number not equal to 1 on a die". In this case it is easy to see that the probability of the event is 1/6 and the probability of the complement is 5/6. That means the probability of the complement is one minus the probability of the event. In fact, this is true of any event:

 
A probability theorem: The probability of the complement of an event is equal to one minus the probability of the event.

 (mathematicians write this as: "P(not E) = 1 - P(E)")


This theorem can be proved from the axioms as follows:

  1. Any event E and its complement (not E) are mutually exclusive.
  2. From point 1 and axiom 3 the probability of the combined event "E or not E" is equal to the probability of E plus the probability of not E
  3. But the event "E or not E" is certain, therefore its probability (from axiom 2) is equal to one.
  4. From 2 and 3 we conclude that the probability of E plus the probability of not E is equal to one.
  5. From 4 we conclude that the probability of not E is equal to one minus the probability of E.

Mathematicians would write the above proof more concisely as:

  1. For all E, events E and not E are mutually exclusive
  2. P(E or not E) = P(E) + P(not E)    (from 1 and axiom 3)
  3. P(E or not E) = 1                  (axiom 2)
  4. 1 = P(E) + P(not E)                (from 2 and 3)
  5. P(not E) = 1 - P(E)                (from 4)
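The axioms and the theorem can also be checked by brute force for one particular measure. The Python sketch below assumes a fair six-sided die and defines P(E) as the fraction of the six outcomes that belong to E; the names prob, all_events and OUTCOMES are made up for this illustration. Checking one measure is of course not a proof, but it shows the three axioms and the complement theorem holding together.

```python
from fractions import Fraction
from itertools import combinations

# Assumption for this sketch: a fair six-sided die.
OUTCOMES = frozenset(range(1, 7))

def prob(event) -> Fraction:
    """P(E) = (number of outcomes in E) / 6, as an exact fraction."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def all_events():
    """Yield every possible event, i.e. every subset of the outcomes."""
    items = sorted(OUTCOMES)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

events = list(all_events())  # 64 events in total

axiom_1 = all(0 <= prob(e) <= 1 for e in events)
axiom_2 = prob(OUTCOMES) == 1  # "roll a number from 1 to 6" is certain
axiom_3 = all(
    prob(a | b) == prob(a) + prob(b)
    for a in events for b in events
    if a.isdisjoint(b)  # only mutually exclusive pairs
)
# The theorem: P(not E) = 1 - P(E) for every event E.
complement_theorem = all(prob(OUTCOMES - e) == 1 - prob(e) for e in events)

print(axiom_1, axiom_2, axiom_3, complement_theorem)  # True True True True
print(prob(OUTCOMES - {1}))  # 5/6, the chance of not rolling a 1
```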



Norman Fenton

