Mean and standard deviation of RVs

A random variable takes a value that is determined by the outcome of each execution of a random experiment. For example, if we flip a coin three times and $N$ is the random variable representing the number of heads, then after each execution of the experiment (three flips), $N$ will take on one of the values $0, 1, 2$ or $3$. If we repeat the experiment very many times, we get a data series, e.g.

$$0, 0, 1, 3, 0, 2, 2, 2, 1, 3, 0, 3, 3, 2, \ldots$$

What is the mean of this data series, and what is the standard deviation? These measures are interesting because they help to further characterize the random variable $N$. For example, do the mean and standard deviation change if the coin is biased, and if so, how exactly?

We will now show how to calculate the mean and standard deviation of a random variable using a (different, simpler) example. As you will see, we will require the probability function of the random variable.

Example 1

Consider a box filled with melons of different sizes, each costing $1$ Fr, $2$ Fr or $5$ Fr. The random experiment is to randomly select a melon, and we define the random variable $C$ = "cost of melon". Assume that the probability function of $C$ is as follows:

$$\begin{array}{lll}
p(C=1)&=&0.4\\
p(C=2)&=&0.35\\
p(C=5)&=&0.25
\end{array}$$

So, every time we perform the experiment, $C$ takes one of the values $C=1$, $C=2$ or $C=5$. If we repeat this experiment a huge number of times, how much do we have to pay on average for a melon?

Well, the percentage of times we select a $1$ Fr melon is given by the probability $p(C=1)$. Similarly, $p(C=2)$ is the percentage of times we select a $2$ Fr melon, and $p(C=5)$ is the percentage of times we select a $5$ Fr melon. So we have (see previous chapter)

$$\begin{array}{lll}
\mu&=&p(C=1)\cdot 1 \text{ Fr} + p(C=2)\cdot 2 \text{ Fr} + p(C=5)\cdot 5 \text{ Fr}\\
&=& 0.4\cdot 1 \text{ Fr} + 0.35\cdot 2 \text{ Fr} + 0.25\cdot 5 \text{ Fr}\\
&=& 2.35 \text{ Fr}
\end{array}$$

Note that we denote the mean of a random variable by $\mu$ ("mu"), which is the usual symbol for the average of a random variable. We can also calculate the standard deviation of all these costs (again, see previous section).

$$\begin{array}{lll}
\sigma &=&\sqrt{p(C=1)\cdot (1-2.35)^2 + p(C=2)\cdot (2-2.35)^2 + p(C=5)\cdot (5-2.35)^2} \text{ Fr}\\
&=& \sqrt{0.4\cdot (1-2.35)^2 + 0.35\cdot (2-2.35)^2 + 0.25\cdot (5-2.35)^2} \text{ Fr}\\
&\approx& 1.59 \text{ Fr}
\end{array}$$

As you have seen, we use $\sigma$ ("sigma") for the standard deviation of a random variable. So on average, the cost is $2.35$ Fr per selected melon. The typical deviation per selection from this value is $1.59$ Fr. These values are only exact if the experiment is performed infinitely often; for a finite number of repetitions, the observed percentages only approximate the probabilities.
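The remark about performing the experiment "infinitely often" can be illustrated with a short simulation. The following is a minimal Python sketch, not part of the original example: it draws a large number of melons according to the probability function of $C$ and compares the mean and standard deviation of the resulting data series with the exact values. The sample size, the seed and all variable names are arbitrary assumptions.

```python
import math
import random

# Probability function of C from Example 1 (cost in Fr -> probability)
pmf = {1: 0.4, 2: 0.35, 5: 0.25}

# Exact values computed from the probability function
mu = sum(p * c for c, p in pmf.items())
sigma = math.sqrt(sum(p * (c - mu) ** 2 for c, p in pmf.items()))

# Simulate many selections; seed and sample size are arbitrary choices
rng = random.Random(0)
n = 200_000
costs = rng.choices(list(pmf.keys()), weights=list(pmf.values()), k=n)

# Mean and standard deviation of the simulated data series
sample_mean = sum(costs) / n
sample_sd = math.sqrt(sum((c - sample_mean) ** 2 for c in costs) / n)

print(f"exact:     mu = {mu:.2f} Fr, sigma = {sigma:.2f} Fr")
print(f"simulated: mean = {sample_mean:.2f} Fr, sd = {sample_sd:.2f} Fr")
```

With a smaller `n` the simulated values typically scatter more widely around the exact ones, which is the sense in which the exact values belong to the long run.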

Let's summarise:

Summary 1

Consider a random variable $X$ of a random experiment, with probability function

$$p(X=x_1), \ldots, p(X=x_r)$$

Every time we perform the experiment, $X$ will take on one of the values $x_1, \ldots, x_r$. If we perform the experiment many times (infinitely often), we can calculate the mean and the standard deviation of these values:

$$\mu_X = p(X=x_1)\cdot x_1 + \ldots + p(X=x_r)\cdot x_r$$

and

$$\sigma_X = \sqrt{p(X=x_1)\cdot (x_1-\mu_X)^2 + \ldots + p(X=x_r)\cdot (x_r-\mu_X)^2}$$

The mean is also called the expected value of $X$ and is denoted by $\mu_X$ or $E[X]$. The standard deviation of $X$ is denoted by $\sigma_X$. The variance of $X$ is the square of the standard deviation and is denoted by $\sigma_X^2$ or $\text{Var}[X]$.

Interpretation of $\mu_X$ and $\sigma_X$: $\mu_X$ is the (long-run) average value of $X$ per experiment, and the typical deviation per experiment from this average is $\sigma_X$.
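The two formulas of the summary translate almost literally into code. The sketch below is one possible way to write them in Python and to apply them to the coin example from the introduction (number of heads in three flips). The function names, the dictionary representation of the probability function and the bias of $0.7$ are illustrative assumptions, and the flips are assumed to be independent.

```python
import math
from itertools import product

def mean_and_sd(pmf):
    """Return (mu, sigma) for a probability function given as {value: probability}."""
    if not math.isclose(sum(pmf.values()), 1.0):
        raise ValueError("probabilities must sum to 1")
    mu = sum(p * x for x, p in pmf.items())
    sigma = math.sqrt(sum(p * (x - mu) ** 2 for x, p in pmf.items()))
    return mu, sigma

def heads_pmf(p_head, flips=3):
    """Probability function of the number of heads, built by listing all outcomes."""
    pmf = {}
    for outcome in product("HT", repeat=flips):
        heads = outcome.count("H")
        # Independent flips: multiply the probabilities of the single flips
        prob = p_head ** heads * (1 - p_head) ** (flips - heads)
        pmf[heads] = pmf.get(heads, 0.0) + prob
    return pmf

fair_mu, fair_sd = mean_and_sd(heads_pmf(0.5))
biased_mu, biased_sd = mean_and_sd(heads_pmf(0.7))  # 0.7 is a made-up bias

print(f"fair coin:   mu = {fair_mu:.3f}, sigma = {fair_sd:.3f}")
print(f"biased coin: mu = {biased_mu:.3f}, sigma = {biased_sd:.3f}")
```

Under these assumptions the fair coin gives $\mu_N = 1.5$ and $\sigma_N \approx 0.866$, while the biased coin gives $\mu_N = 2.1$ and $\sigma_N \approx 0.794$, so both measures do change with the bias.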

Exercise 1

A fair coin is flipped twice. For two heads you get $2$ Fr, for one head you get $1$ Fr and for zero heads you get nothing. What is your average win per game? And what is the typical deviation per game from this average?

Solution

We introduce the random variable $W$ = "win in Fr". The probability function of $W$ is

$$\begin{array}{lll}
p(W=2)&=&p(\{HH\})=\frac{1}{4}\\
p(W=1)&=&p(\{HT, TH\})=\frac{2}{4}\\
p(W=0)&=&p(\{TT\})=\frac{1}{4}
\end{array}$$

The average win is

$$\mu_W = \frac{1}{4}\cdot 2 \text{ Fr} + \frac{2}{4}\cdot 1 \text{ Fr} + \frac{1}{4}\cdot 0 \text{ Fr} = \underline{1 \text{ Fr}}$$

and the standard deviation is

$$\begin{array}{lll}
\sigma_W &=& \sqrt{\frac{1}{4}\cdot (2-1)^2 + \frac{2}{4}\cdot (1-1)^2 + \frac{1}{4}\cdot (0-1)^2} \text{ Fr}\\
&=& \sqrt{0.5} \text{ Fr}\\
&\approx& \underline{0.707 \text{ Fr}}
\end{array}$$

In the next exercise, it is also possible to lose money in the long run, which is indicated by a negative gain.

Exercise 2

A fair coin is flipped twice. For two heads you get $2$ Fr, for one head you get $1$ Fr and for zero heads you have to pay $5$ Fr. What is your average win per game?

Solution

We introduce the random variable $W$ = "win in Fr". The probability function of $W$ is

$$\begin{array}{lll}
p(W=2)&=&p(\{HH\})=\frac{1}{4}\\
p(W=1)&=&p(\{HT, TH\})=\frac{2}{4}\\
p(W=-5)&=&p(\{TT\})=\frac{1}{4}
\end{array}$$

The average win is

$$\mu_W = \frac{1}{4}\cdot 2 \text{ Fr} + \frac{2}{4}\cdot 1 \text{ Fr} + \frac{1}{4}\cdot (-5) \text{ Fr} = \underline{-0.25 \text{ Fr}}$$

In other words, you lose on average $0.25$ Fr.
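The two coin games can be checked with exact fractions. The following sketch is only an illustration: the helper name `mean_and_variance` and the dictionary layout are assumptions. It also evaluates the standard deviation for the second game, which the exercise itself does not ask for.

```python
import math
from fractions import Fraction

def mean_and_variance(pmf):
    """Return (mu, variance) for a probability function {win in Fr: probability}."""
    mu = sum(p * w for w, p in pmf.items())
    var = sum(p * (w - mu) ** 2 for w, p in pmf.items())
    return mu, var

quarter = Fraction(1, 4)
exercise_1 = {2: quarter, 1: 2 * quarter, 0: quarter}   # zero heads: win nothing
exercise_2 = {2: quarter, 1: 2 * quarter, -5: quarter}  # zero heads: pay 5 Fr

mu_1, var_1 = mean_and_variance(exercise_1)
mu_2, var_2 = mean_and_variance(exercise_2)

# The standard deviation is the square root of the variance
print(f"Exercise 1: mu = {mu_1} Fr, sigma = {math.sqrt(var_1):.3f} Fr")
print(f"Exercise 2: mu = {mu_2} Fr, sigma = {math.sqrt(var_2):.3f} Fr")
```

For the second game this gives a standard deviation of roughly $2.77$ Fr. The single large loss spreads the wins much further around the average than in the first game.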

In a game of chance we typically have an entry fee and, at the end, some payout. Your win is the payout minus the entry fee. We say that a game is fair if the average win per game is zero. Here is an example.

Exercise 3

In a game of chance a fair die is rolled twice. The entry fee is $3$ Fr. The payout is as follows: you get $13$ Fr for a double six, $9$ Fr for exactly one six and nothing otherwise. Is this a fair game?

Solution

We introduce the random variable $W$ = "your win in Fr". The possible values of $W$ are $13-3=10$ for a double $6$, $9-3=6$ for exactly one $6$, and $0-3=-3$ otherwise. The probability function of $W$ is

$$\begin{array}{lll}
p(W=10) &=& p(\{66\})=\frac{1}{36}\\
p(W=6) &=& p(\{61,62,63,64,65,16,26,36,46,56\})=\frac{10}{36}\\
p(W=-3) &=& \frac{25}{36}
\end{array}$$

Thus, we have

$$\mu_W = \frac{1}{36}\cdot 10 \text{ Fr} + \frac{10}{36}\cdot 6 \text{ Fr} + \frac{25}{36}\cdot (-3) \text{ Fr} = -\frac{5}{36} \text{ Fr} \approx -0.14 \text{ Fr}$$

Thus, this is not a fair game. On average, you will lose about $0.14$ Fr per game.
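The probability function above can also be obtained by listing all $36$ equally likely outcomes of the two rolls. Here is a minimal Python sketch of that enumeration. The names `ENTRY_FEE` and `PAYOUT` are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

ENTRY_FEE = 3                    # entry fee in Fr
PAYOUT = {2: 13, 1: 9, 0: 0}     # payout in Fr, keyed by the number of sixes

# Build the probability function of W by enumerating all 36 outcomes
pmf = {}
for roll in product(range(1, 7), repeat=2):
    win = PAYOUT[roll.count(6)] - ENTRY_FEE
    pmf[win] = pmf.get(win, Fraction(0)) + Fraction(1, 36)

mu = sum(p * w for w, p in pmf.items())
expected_payout = mu + ENTRY_FEE  # average payout before subtracting the fee

print("probability function:", pmf)
print("average win:", mu, "Fr")
print("average payout:", expected_payout, "Fr")
```

The average payout comes out as $\frac{103}{36} \approx 2.86$ Fr, so with these payouts the game would be fair for an entry fee of that size rather than $3$ Fr.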

Here is an example of an average which is not money related.

Exercise 4

The random experiment is to roll a fair die twice and form the sum. What is the average value you get per experiment?

Solution

We define $S$ = "sum of the two numbers". We already know from an earlier exercise what the probability function of $S$ is, and therefore get

$$\begin{array}{lll}
\mu_S &=& \frac{1}{36}\cdot 2 + \frac{2}{36}\cdot 3 + \frac{3}{36}\cdot 4 +\frac{4}{36}\cdot 5 +\frac{5}{36}\cdot 6 +\frac{6}{36}\cdot 7 \\
& & {}+\frac{5}{36}\cdot 8 +\frac{4}{36}\cdot 9 +\frac{3}{36}\cdot 10 +\frac{2}{36}\cdot 11 + \frac{1}{36}\cdot 12\\
&=& \frac{252}{36}=\underline{7}
\end{array}$$
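As a cross-check, the probability function of $S$ and its mean can be recomputed by counting outcomes. The sketch below is one way to do this in Python. It additionally evaluates the standard deviation, which the exercise does not ask for. The variable names are arbitrary.

```python
import math
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely outcomes produce each sum
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {s: Fraction(c, 36) for s, c in counts.items()}

mu = sum(p * s for s, p in pmf.items())
var = sum(p * (s - mu) ** 2 for s, p in pmf.items())

print("mu_S =", mu)
print("sigma_S =", round(math.sqrt(var), 3))
```

The mean agrees with the value $7$ found above. The variance evaluates to $\frac{35}{6}$, which corresponds to a typical deviation of about $2.42$ per experiment.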