The probability of an outcome

Consider the random experiment "tossing a coin". If we perform the experiment, the outcome $H$ ("head") will occur with a certain likelihood. We would like to quantify this likelihood with a number between $0$ and $1$ , where bigger means more likely. We will call this number the probability of outcome H, denoted by

p(H)

This works as follows. Repeat the experiment under the exact same conditions $N$ times. Count how often the outcome $H$ occurs, and let's denote this number by $n$ . The relative frequency of occurrences of $H$ is defined as

\frac{n}{N}

In essence, this will be our probability $p(H)$. Indeed, this number is between $0$ and $1$, and the larger the relative frequency, the more often the outcome $H$ has occurred. For example, if $H$ occurs every time, the relative frequency is $n/N=N/N=1$. If outcome $H$ never occurs, the relative frequency is $0/N=0$.

But note that we cannot simply set $p(H)=n/N$ , because this is not well defined. Why? Well, every time we attempt to determine the value $n/N$ by performing the experiment $N$ times, the value will vary! This is demonstrated in the exercise below.

Exercise 1

We consider the random experiment "flipping a coin once". Repeat the experiment $N=20$ times, and determine the relative frequency of $H$ .

Then determine the relative frequency of $H$ again, using the same procedure. Observe that the two relative frequencies usually differ.
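If no coin is at hand, Exercise 1 can be imitated on a computer. The following is a minimal sketch, assuming a fair coin (heads with probability $0.5$, which the text above does not state) and using a hypothetical helper `relative_frequency_of_heads` introduced only for this illustration.

```python
import random


def relative_frequency_of_heads(num_flips, rng):
    """Flip a simulated coin num_flips times and return n/N for heads."""
    # Assumption: the coin is fair, so heads comes up with probability 0.5.
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips


# No seed is set, so every run of the script gives different flips.
rng = random.Random()
first = relative_frequency_of_heads(20, rng)
second = relative_frequency_of_heads(20, rng)
print(f"first series of 20 flips:  n/N = {first:.2f}")
print(f"second series of 20 flips: n/N = {second:.2f}")
```

Running the script a few times shows the two values changing from run to run. Occasionally they coincide by chance, but in most runs they differ.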

How can we avoid these fluctuations of the relative frequency $n/N$ ? The next exercise offers a solution.

Exercise 2

We consider the random experiment "flipping a coin once". Repeat the experiment $N=10$ times and count the number of times that head $H$ occurs. Determine the relative frequency of head, $n/N$ .

Repeat the experiment another $10$ times, so in total we have $N=20$ repetitions. Determine the total number of occurrences of head, and again calculate the relative frequency $n/N$ .

Now continue this procedure by always flipping another $10$ times, and fill out the table shown below:

\begin{array}{|c|l|l|l|l|l|l|l|l|l|l|}\hline N & 10 & 20 & 30 & 40 & 50 & 60 & 70 & 80 & 90 & 100 \\\hline n & & & & & & & & & & \\\hline \frac{n}{N} & & & & & & & & & & \\\hline \end{array}

Also, indicate the calculated relative frequencies $n/N$ as a function of $N$ in a coordinate system ( $N$ along the $x$ -axis, $n/N$ along the $y$ -axis).
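The table of Exercise 2 can also be filled in by simulation. The sketch below assumes a fair coin (`p_head=0.5`); the function name `running_relative_frequencies` and its parameters are hypothetical and chosen only for this example. It records the running count $n$ and the relative frequency $n/N$ after every $10$ flips.

```python
import random


def running_relative_frequencies(total_flips=100, step=10, p_head=0.5, seed=None):
    """Return rows (N, n, n/N), recorded after every `step` simulated flips."""
    rng = random.Random(seed)  # seed=None gives different flips on each run
    rows = []
    heads = 0
    for flip in range(1, total_flips + 1):
        # Assumption: heads occurs with probability p_head on every flip.
        if rng.random() < p_head:
            heads += 1
        if flip % step == 0:
            rows.append((flip, heads, heads / flip))
    return rows


for N, n, freq in running_relative_frequencies():
    print(f"N = {N:3d}   n = {n:3d}   n/N = {freq:.3f}")
```

Calling `running_relative_frequencies(total_flips=10_000, step=1_000)` extends the same table to many more repetitions, which makes the behaviour of $n/N$ for large $N$ easier to see.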

What you should observe is that, as the number of repetitions $N$ increases, the relative frequency $n/N$ stabilises and approaches a specific value. We define this value as the probability of outcome $o$:

Definition 1

Consider an experiment, and denote one outcome by $o$ . The probability of outcome o is defined as the long-run relative frequency of $o$ :

p(o)=\frac{n}{N}\quad (N \text{ large})

where $N$ is the number of repetitions of the experiment, and $n$ is the number of experiments in which $o$ occurred.

The term "long-run" refers to the fact that $N$ has to be very, very large (we choose $N$ so big that the fluctuations in $n/N$ become negligible).

Note that the relative frequency of $o$ can also be expressed as a percentage (of the repetitions $N$ ). For this reason, we can also express the probability as a percentage. For example, $p(o)=0.2$ can also be expressed as $p(o)=20\%$ . We will use both notations.

Thus we can also rephrase the definition of the probability using percentages.

Note 1

If we repeat the experiment $N$ times, where $N$ is a large number, then $p(o)$ is the percentage of repetitions in which outcome $o$ occurred.

Exercise 3

For a die, we have $p(6)=1/6$. You roll the die $12\,000$ times. How many $6$'s can you expect to observe? Is this number exact?

Solution

As $p(6)=\frac{1}{6}\approx\frac{n}{12\,000}$, it follows that $n\approx\frac{12\,000}{6}=2000$. This is only an estimate, and the actual count will fluctuate. But as $N$ is quite large, the fluctuations of $n$ will be quite small compared with $2000$.
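The size of these fluctuations can be explored with a short simulation. This is a sketch under the assumption that each face of the die comes up with probability $1/6$; the helper `count_sixes` is a hypothetical name used only here.

```python
import random


def count_sixes(num_rolls, rng):
    """Roll a simulated six-sided die num_rolls times and count the 6's."""
    # Assumption: rng.randint(1, 6) models a die with six equally likely faces.
    return sum(1 for _ in range(num_rolls) if rng.randint(1, 6) == 6)


rng = random.Random()  # unseeded, so the counts change from run to run
for trial in range(1, 6):
    n = count_sixes(12_000, rng)
    print(f"trial {trial}: n = {n}, n/N = {n / 12_000:.4f}")
```

Each trial prints a slightly different count, typically close to $2000$ but rarely equal to it.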

Here is our first theorem about probabilities. The proof is given as an exercise.

Theorem 1

The sum of all outcome probabilities of a random experiment equals $1$ . That is, if $o_1,o_2,...,o_m$ are the possible outcomes of the experiment, then

\sum_{i=1}^m p(o_i) = p(o_1)+p(o_2)+...+p(o_m)=1

Exercise 4

Give a proof of the statement above.

Solution

Repeat the experiment $N$ times, where $N$ is a very large number. By definition of the probability, $p(o_i)$ is the percentage of times that outcome $o_i$ occurs. Adding the percentages for every outcome $o_i$ , we must get $100\%$ . This is so because every repetition of the random experiment results in exactly one of the outcomes $o_1,o_2,...,o_m$ .
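The same argument can be written with counts instead of percentages. As notation assumed for this sketch only, let $n_i$ denote the number of repetitions in which outcome $o_i$ occurs. Since every repetition results in exactly one outcome, the counts add up to $N$:

n_1+n_2+...+n_m=N

Dividing by $N$ and using the definition of the probability (with $N$ large) gives

p(o_1)+p(o_2)+...+p(o_m)=\frac{n_1}{N}+\frac{n_2}{N}+...+\frac{n_m}{N}=\frac{n_1+n_2+...+n_m}{N}=\frac{N}{N}=1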