The probability of an outcome
Consider the random experiment "tossing a coin". If we perform the experiment, the outcome $H$ ("head") will occur with a certain likelihood. We would like to quantify this likelihood with a number between $0$ and $1$, where bigger means more likely. We will call this number the probability of outcome $H$, denoted by $P(H)$.
This works as follows. Repeat the experiment under the exact same conditions $n$ times. Count how often the outcome $H$ occurs, and let's denote this number by $n_H$. The relative frequency of occurrences of $H$ is defined as
$$\frac{n_H}{n}.$$
In essence, this will be our probability $P(H)$. Indeed, this number is between $0$ and $1$, and the bigger the relative frequency is, the more often the outcome has occurred. For example, if $H$ were to occur every time, the relative frequency is $\frac{n}{n} = 1$. If outcome $H$ never occurs, the relative frequency is $\frac{0}{n} = 0$.
But note that we cannot simply set $P(H) = \frac{n_H}{n}$, because this is not well defined. Why? Well, every time we attempt to determine the value $\frac{n_H}{n}$ by performing the experiment $n$ times, the value will vary! This is demonstrated in the exercise below.
We consider the random experiment "flipping a coin once". Repeat the experiment times, and determine the relative frequency of $H$.
Then determine the relative frequency of $H$ again, using the same procedure. Observe that the two relative frequencies are usually different.
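The fluctuation can also be observed in a quick simulation. The following Python sketch assumes a fair coin simulated with the standard `random` module. The function name `relative_frequency_of_heads` and the choice of 10 flips per run are illustrative assumptions, not part of the exercise.

```python
import random

def relative_frequency_of_heads(n, rng):
    """Flip a simulated fair coin n times and return n_H / n."""
    # Assumption: a fair coin, i.e. head comes up with chance 0.5 on each flip.
    heads = sum(1 for _ in range(n) if rng.random() < 0.5)
    return heads / n

rng = random.Random()   # unseeded, so each run of the script gives new values
n = 10                  # hypothetical number of flips per run

first = relative_frequency_of_heads(n, rng)
second = relative_frequency_of_heads(n, rng)
print("first run: ", first)
print("second run:", second)
```

With so few flips, the two printed values will often differ, although they can coincide by chance.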
How can we avoid these fluctuations of the relative frequency $\frac{n_H}{n}$? The next exercise offers a solution.
We consider the random experiment "flipping a coin once". Repeat the experiment times and count the number of times that head occurs. Determine the relative frequency of head, .
Repeat the experiment another times, so in total we have repetitions. Determine the total number of occurrences of head, and again calculate the relative frequency .
Now continue this procedure by always flipping another times, and fill out the table shown below:
Also, plot the calculated relative frequencies as a function of $n$ in a coordinate system ($n$ along the $x$-axis, $\frac{n_H}{n}$ along the $y$-axis).
What you should observe is that, with a higher number of repetitions $n$, the relative frequency $\frac{n_H}{n}$ stabilises and approaches a specific value. We define this value as the probability of outcome $H$:
Consider an experiment, and denote one outcome by $o$. The probability $P(o)$ of outcome $o$ is defined as the long-run relative frequency of $o$:
$$P(o) = \frac{n_o}{n},$$
where $n$ is the number of repetitions of the experiment, and $n_o$ is the number of experiments in which $o$ occurred.
The term "long-run" refers to the fact that $n$ has to be very, very large (we choose $n$ so big that the fluctuations in $\frac{n_o}{n}$ become negligible).
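The stabilising behaviour of the relative frequency can be illustrated with a simulation. The sketch below assumes a fair simulated coin and records the running count of heads and the relative frequency after every batch of flips. The function name `running_relative_frequencies`, the batch size, and the total number of flips are hypothetical choices made only for illustration.

```python
import random

def running_relative_frequencies(total_flips, batch_size, seed=None):
    """Return a list of rows (n, n_H, n_H / n), one row per completed batch."""
    rng = random.Random(seed)
    heads = 0
    rows = []
    for n in range(1, total_flips + 1):
        # Assumption: a fair coin, head with chance 0.5 on each flip.
        if rng.random() < 0.5:
            heads += 1
        # Record a table row each time a full batch has been flipped.
        if n % batch_size == 0:
            rows.append((n, heads, heads / n))
    return rows

for n, n_h, freq in running_relative_frequencies(100_000, 10_000, seed=0):
    print(f"n = {n:>6}   n_H = {n_h:>6}   n_H/n = {freq:.4f}")
```

For large $n$ the last column should change less and less from row to row, which is the behaviour the definition relies on.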
Note that the relative frequency of $o$ can also be expressed as a percentage (of the $n$ repetitions). For this reason, we can also express the probability $P(o)$ as a percentage. For example, $P(o) = 0.5$ can also be expressed as $P(o) = 50\%$. We will use both notations.
Thus we can also rephrase the definition of the probability using percentages.
If we repeat the experiment $n$ times, where $n$ is a big number, then $P(o)$ is the percentage of times that outcome $o$ occurred.
For a die it is . You roll the die times. What is the number of 's you can expect to observe? Is this number accurate?
Solution
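Since the probability is the long-run relative frequency, the expected number of occurrences is approximately the number of rolls multiplied by the probability. This is just an estimate and will fluctuate. But as the number of rolls is quite large, the fluctuations of the relative frequency will be quite small.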
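The estimate "number of rolls times probability" can be compared with a simulated count. The following sketch uses hypothetical values throughout: a fair die, the face 6, and 600 rolls are assumptions chosen for illustration and are not taken from the exercise. The helper name `count_face` is likewise made up.

```python
import random
from fractions import Fraction

def count_face(face, rolls, rng):
    """Roll a simulated fair die `rolls` times and count how often `face` shows."""
    return sum(1 for _ in range(rolls) if rng.randint(1, 6) == face)

rolls = 600                      # hypothetical number of rolls
face = 6                         # hypothetical face of interest
probability = Fraction(1, 6)     # assumption: a fair die

expected = rolls * probability   # estimate: n * P
observed = count_face(face, rolls, random.Random())

print("expected count:", expected)
print("observed count:", observed)
```

The observed count will typically land near the estimate without matching it exactly.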
Here is our first theorem about probabilities. The proof is given as an exercise.
The sum of all outcome probabilities of a random experiment equals $1$. That is, if $o_1, o_2, \dots, o_k$ are the possible outcomes of the experiment, then
$$P(o_1) + P(o_2) + \dots + P(o_k) = 1.$$
Give a proof of the statement above.
Solution
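Repeat the experiment $n$ times, where $n$ is a very large number. By definition of the probability, $P(o_i)$ is the percentage of times that outcome $o_i$ occurs. Adding the percentages for every outcome $o_i$, we must get $100\% = 1$. This is so because every repetition of the random experiment results in exactly one of the outcomes $o_1, o_2, \dots, o_k$, so the counts add up to $n_{o_1} + n_{o_2} + \dots + n_{o_k} = n$, and dividing by $n$ gives the claim.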
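The counting argument in this proof can be checked numerically. The sketch below is a minimal illustration under assumptions: it simulates a fair die, and the helper name `relative_frequencies` is hypothetical. Exact fractions are used so that the sum is not affected by floating-point rounding.

```python
import random
from collections import Counter
from fractions import Fraction

def relative_frequencies(outcomes):
    """Map each observed outcome o to its relative frequency n_o / n."""
    counts = Counter(outcomes)
    n = len(outcomes)
    return {o: Fraction(c, n) for o, c in counts.items()}

# Assumption: a fair die rolled 1000 times (any experiment would do).
rolls = [random.randint(1, 6) for _ in range(1000)]
freqs = relative_frequencies(rolls)

# Every roll lands in exactly one outcome, so the counts add up to n
# and the relative frequencies add up to exactly 1.
print(sum(freqs.values()))   # prints 1
```

The sum is $1$ for any list of outcomes, not only for large $n$. The long-run limit is needed only to identify each relative frequency with the corresponding probability.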