Probability

A. Random experiments, probability of outcomes and events, Laplace experiments

Consider a random experiment "a die is rolled once"

  1. Explain, what is a random experiment?
  2. Explain, what is the sample space of an experiment?
  3. Explain, what is an outcome $o$ and what is an event $E$ of an experiment?
  4. What does it mean that an event "occurs"? Explain with the help of the outcomes in this event.
  5. How do you determine experimentally the probability for an outcome $o$ to occur?
  6. How do we determine experimentally the probability for an event $E$ to occur?
  7. Express the probability of an event with the outcome probabilities of the experiment.
  8. Determine $p(S)$ and $p(\{\})$ and give an intuitive explanation.
  9. What number do you get if you add all outcome probabilities of a random experiment?
  10. What is a Laplace experiment?
  11. Why does the probability of an outcome in a Laplace experiment only depend on the number of possible outcomes? And what is this probability?
  12. For a Laplace experiment we can determine the probability of an event with the formula $p(E)=\frac{|E|}{|S|}=\frac{\text{winning outcomes}}{\text{all outcomes}}$. Explain why this formula is correct.
  13. A fair die is rolled twice. Determine the probability for each outcome, and for the event "the sum of the two numbers is even".
Solutions A
  1. A random experiment has several possible outcomes. Each time you perform the experiment, exactly one of these outcomes occurs, but it is not possible to predict with 100% certainty which one it will be.
  2. The sample space is the set of all possible outcomes, $S=\{1,2,3,4,5,6\}$.
  3. An outcome $o$ is any element $o\in S$ of this set, e.g. $o=1$ ("die shows a 1"). An event is any subset $E\subset S$ of the sample space, e.g. $E=\{2,4,6\}$ ("die shows an even number"). The sample space $S$ is also an event.
  4. An event $E$ occurs if one of the outcomes in $E$ occurs. For example, the event $E=\{2,4,6\}$ = "die shows even number" occurs if one of the outcomes $2$, $4$, or $6$ occurs.
  5. The probability of an outcome is the percentage of times the outcome occurs in the long run (that is, after many, many repetitions of the experiment). For example, if you roll a die $N=100\,000$ times, and the "1" occurs $n=10\,000$ times, then the probability for "1" is approximately $p("1")\approx\frac{n}{N}=0.1$ or $\approx 10\%$. The more often you repeat the experiment (the bigger $N$), the more accurate the estimate of the probability.
  6. Similar to an outcome, the probability of an event is the percentage of times the event occurs in the long run (that is, after many, many repetitions of the experiment). For example, if we repeat the experiment "rolling a die once" $N=100\,000$ times, and the event $E$ = "even number" occurs $n=60\,000$ times, then the probability of this event is approximately $p(E)\approx \frac{n}{N}=0.6$ or $\approx 60\%$. The more often we repeat the experiment, the better the approximation to the probability.
  7. As $E$ occurs if either $2$, $4$, or $6$ occurs, we simply have to add the percentages of times the $2$, the $4$, or the $6$ occurs, that is, we have $p(E)=p("2")+p("4")+p("6")$. More generally, for an event $E=\{o_1, o_2, o_3,o_4\}$, we have $p(E)=p(o_1)+p(o_2)+p(o_3)+p(o_4)$.
  8. $p(S)$ is the probability of the event "any outcome of the experiment occurs". Because some outcome always occurs, $p(S)=1$. $S$ is also called the sure event. $p(\{\})$ is the probability of the event "no outcome occurs". As this never happens, $p(\{\})=0$. The event $\{\}$ is also called the impossible event.
  9. We have seen that $p(S)=1$ and $p(S)=p(o_1)+...+p(o_n)$ (the probability of any event is the sum of its outcome probabilities), thus we get $p(o_1)+...+p(o_n)=1$.
  10. In a Laplace experiment, all $n$ outcomes $o_1,...,o_n$ must have the same probability $p$: $p=p(o_1)=... =p(o_n)$. In other words, the experiment is "fair".
  11. Because all outcomes have the same probability, and the sum of all probabilities has to be $1$, we get $p(o_1)+...+p(o_n)=p+...+p=n\cdot p =1 \rightarrow p=1/n$, and we see that the outcome probability is inversely proportional to the number of outcomes of the experiment.
  12. Assume there are $n$ possible outcomes, that is, $|S|=n$, and that $E$ contains $m$ of these outcomes, $E=\{o_1, o_2, ..., o_m\}$, and thus $|E|=m$. As it is a Laplace experiment, all outcome probabilities are the same, $p=1/n$. It follows $p(E)=p(o_1)+...+p(o_m)=\underbrace{\frac{1}{n}+...+\frac{1}{n}}_{m \text{ of those}}=\frac{m}{n}=\frac{|E|}{|S|}$.
  13. The table shows all the possible sums of the two numbers:
      $$\begin{array}{c|cccccc} + & 1 & 2 & 3 & 4 & 5 & 6\\\hline 1 & 2 & 3 & 4 & 5 & 6 & 7\\ 2 & 3 & 4 & 5 & 6 & 7 & 8\\ 3 & 4 & 5 & 6 & 7 & 8 & 9\\ 4 & 5 & 6 & 7 & 8 & 9 & 10\\ 5 & 6 & 7 & 8 & 9 & 10 & 11\\ 6 & 7 & 8 & 9 & 10 & 11 & 12 \end{array}$$
      Each of the $36$ outcomes, such as "43" (meaning roll 1 is a "4" and roll 2 is a "3", so the sum is 7), has the same probability $\frac{1}{36}$. The event $E$ = "sum is even" is $E=\{11,13,15,22,24,26,31,33,35,42,44,46,51,53,55,62,64,66\}$ and therefore has probability $p(E)=\frac{|E|}{|S|}=\frac{18}{36}=\underline{0.5}$.
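The counting argument in solution 13 can be checked by brute-force enumeration. The following is a minimal sketch, not part of the original solutions; the variable names (`sample_space`, `even_sum`) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# Sample space of rolling a fair die twice: 36 ordered pairs (roll 1, roll 2).
sample_space = list(product(range(1, 7), repeat=2))

# Event "the sum of the two numbers is even".
even_sum = [(a, b) for (a, b) in sample_space if (a + b) % 2 == 0]

# Laplace experiment: every outcome has probability 1/|S|,
# and p(E) = |E| / |S|.
p_outcome = Fraction(1, len(sample_space))
p_even_sum = Fraction(len(even_sum), len(sample_space))

print(len(sample_space), len(even_sum))  # 36 18
print(p_outcome, p_even_sum)             # 1/36 1/2
```

Using `Fraction` keeps the result exact, so it can be compared directly with $\frac{18}{36}$.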
B. Conditional probability, independence, Venn-Diagrams, probability trees
  1. For two events $A$ and $B$, what is the conditional probability $p(A|B)$? Explain.
  2. Consider two events $A$ and $B$ of a random experiment. Represent the events in a Venn-Diagram and show the following:
    1. $p(A^c)=1-p(A)$
    2. $p(A\cap A^c)=0$
    3. $p(A\cup A^c)=1$
    4. $p(A\cup B)=p(A)+p(B)-p(A\cap B)$
    5. $p(A\cap B^c)=p(A)-p(A\cap B)$
    6. $p(A^c|B)=1-p(A|B)$
  3. Consider two events $A$ and $B$ with $p(A)=0.25$, $p(B|A)=0.7$, and $p(B|A^c)=0.4$. Draw the probability tree (starting with $A$) and indicate each branch probability. Use these branch probabilities to calculate $p(A\cap B)$, $p(A^c\cap B)$, $p(A\cap B^c)$, $p(A^c\cap B^c)$, $p(B)$, and $p(A | B)$.
  4. Is it always true that $p(A\cap B)=p(A)\cdot p(B)$? If not, when is it true? Also, what is the general formula for $p(A\cap B)$?
  5. The two events $A$ and $B$ are mutually exclusive. What does this mean? And are the two events independent?
  6. $R$ is the event "it rains" today, and $A$ is the event "there are accidents" today. Assume now that $p(R)=0.2$, $p(A)=0.3$, and $p(R\cap A)=0.1$. Draw the probability tree and indicate the probabilities. With the help of the tree determine the probabilities for the events
    1. it does not rain.
    2. it rains or there are accidents.
    3. it does not rain and there are accidents.
    4. it does not rain and there are no accidents.
    5. it rains given that there are accidents.
    6. there are accidents given that it rains.
  7. A study determines that on an island, $40\%$ of the inhabitants are male, and of these, $5\%$ are color blind. If you randomly select a person from the island, what is the probability that the person is a color blind male? Draw the corresponding probability tree.
  8. A study determines that on an island, $40\%$ of the inhabitants are male, and $60\%$ are female. $5\%$ of the male inhabitants are color blind, for the females it is just $1\%$. If you randomly select from the color blind inhabitants, what is the probability that the person is male?
  9. A box contains four balls labelled from $0$ to $3$. Two balls are selected at random and with replacement. Are the events "sum is even" and "product is $>0$" independent? ($0$ is an even number.)
Solutions B
  1. $p(A|B)$ is called the conditional probability of $A$ given $B$. It has several interpretations, which are illustrated with the following example. Given a forest, and the events $A$ = "a branch of the tree snaps" and $B$ = "a person sits on the tree". Here are three interpretations of $p(A|B)$:
    1. $p(A|B)$ is the probability that a branch snaps knowing that a person sits on the tree. If, for example, $p(A|B)=0.8$ then $p(A)$ will normally be different, e.g. $p(A)=0.1$. (Only if $A$ and $B$ are independent is $p(A|B)=p(A)$.)
    2. The sample space $S$ contains all the trees in the forest, and $p(A)$ is the percentage of all these trees with a snapped branch. $p(A|B)$ is the percentage of trees with a snapped branch relative to the reduced sample space $B$, that is, $p(A|B)$ is the percentage of trees with a snapped branch relative to all the trees with a person sitting on them.
    3. Percentage of a percentage. Say $20\%$ of the trees in the forest are occupied by people (event $B$), and of these, $80\%$ have a snapped branch. So we have $p(B)=0.2$, and $p(A|B)=0.8$. We see that the percentage of trees in the forest with a snapped branch and a person sitting on them is $80\%$ of $20\%$, which is $16\%$ ($0.8\cdot 0.2$). In other words, we have the formula $p(A\cap B)=p(A|B)\cdot p(B)$ or $p(A|B)=\frac{p(A\cap B)}{p(B)}$.
  2. No, $p(A\cap B)=p(A)\cdot p(B)$ is only correct if $A$ and $B$ are independent. Actually, this equation is one definition of independence. Another one is that $p(A|B)=p(A)$ (which means that the occurrence of event $B$ has no influence on the probability of event $A$, which actually makes sense if $A$ and $B$ are independent). Yet another one is that $p(B|A)=p(B)$. In general, we have $p(A\cap B)=p(A|B)\cdot p(B)$. Also, intuitively it should be clear that if $A$ and $B$ are independent, so are the two events $A$ and $B^c$, $A^c$ and $B$, and $A^c$ and $B^c$.
  3. $A$ and $B$ are mutually exclusive if they cannot occur together in the same experiment. This is only possible if they have no outcome in common, so that $A\cap B=\{\}$, and thus $p(A\cap B)=0$. It also follows that $A$ and $B$ are not independent (they are dependent), because once I know that $B$ has occurred, I know that $A$ cannot occur, so clearly $\underbrace{p(A|B)}_{=0}\neq p(A)$ (with the exception of $p(A)=0$).
  4. You can use the Venn-Diagram or the tree. I recommend the tree ...
  5. $M$ = "male", $F=M^c$ = "female", $CB$ = "color blind".
  6. $E$ = "sum is even", $M$ = "product $>0$". See figure below.
     Method 1: check if $p(E\cap M)=p(E)\cdot p(M)$.
     $$\begin{array}{lll} p(E\cap M) & =\frac{|E \cap M|}{|S|} & =\frac{5}{16}\\ p(E) & =\frac{|E|}{|S|} & =\frac{8}{16}\\ p(M) & =\frac{|M|}{|S|} & =\frac{9}{16} \end{array}$$
     We see that $E$ and $M$ are dependent, because $\frac{8}{16} \cdot \frac{9}{16}=\frac{72}{256}\neq \frac{5}{16}=\frac{80}{256}$.
     Method 2: check if $p(E|M)=p(E)$ (or $p(M|E)=p(M)$).
     $$\begin{array}{lll} p(E|M) & =\text{"percentage of $M$ that are $E$"}=\frac{|E\cap M|}{|M|} & =\frac{5}{9}\\ p(E) & =\frac{|E|}{|S|} & =\frac{8}{16} \end{array}$$
     Again we see that the events are dependent, as $\frac{5}{9}\neq \frac{8}{16}$.
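Both independence tests from the last solution can be reproduced by listing the 16 outcomes. This is a hedged sketch for checking the counts; the helper `p` and the set names `E` and `M` are assumptions chosen to mirror the notation above.

```python
from fractions import Fraction
from itertools import product

# Two draws with replacement from balls labelled 0..3:
# 16 equally likely ordered pairs (Laplace experiment).
S = list(product(range(4), repeat=2))

E = {(a, b) for (a, b) in S if (a + b) % 2 == 0}  # "sum is even"
M = {(a, b) for (a, b) in S if a * b > 0}         # "product > 0"

def p(event):
    """Laplace probability |event| / |S| as an exact fraction."""
    return Fraction(len(event), len(S))

p_E, p_M, p_EM = p(E), p(M), p(E & M)

# Method 1: independence would require p(E and M) == p(E) * p(M).
independent = (p_EM == p_E * p_M)

# Method 2: compare p(E|M) = |E and M| / |M| with p(E).
p_E_given_M = Fraction(len(E & M), len(M))

print(p_E, p_M, p_EM, p_E_given_M, independent)
```

Note that `Fraction` reduces $\frac{8}{16}$ to $\frac{1}{2}$ when printing.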
C. Combinatorics, independent and dependent repetitions, binomial experiments
  1. When do you need
    1. $n!$ ($n$ factorial)
    2. the MISSISSIPPI-problem
    3. the binomial coefficient
  2. Calculate and also determine intuitively
    1. $\binom{n}{0}$
    2. $\binom{n}{n}$
    3. $\binom{n}{1}$
  3. A box contains $10$ balls. Three balls have a weight of $1\,\mathrm{g}$, five balls have a weight of $2\,\mathrm{g}$ and the rest of the balls have a weight of $5\,\mathrm{g}$.
    1. Two balls are drawn one after the other, with replacement. Represent the experiment with a probability tree and indicate the probabilities. Also draw the probability tree for the case in which the drawn ball is not replaced.
    2. For both cases (with or without replacement), determine the probability that the two balls drawn will have a total weight of $6\,\mathrm{g}$.
    3. SKIP! Determine the average total weight per Experiment (drawing with replacement).
    4. For which method (with or without replacement) are the events $A_1$ = "1g-ball drawn in first draw" and $A_2$ = "1g-ball drawn in second draw" independent?
    5. If three balls are drawn without replacement, what is then the probability to draw a total weight of $12\,\mathrm{g}$?
  4. A coin with $p(H)=0.2$ and $p(T)=0.8$ is flipped $4$ times.
    1. Draw the probability tree that represents this experiment. Indicate the branch probabilities.
    2. Determine $p(HTHT)$, and justify why your calculation is correct.
    3. Determine the probabilities $p(HHTT)$, $p(TTHH)$, $p(THTH)$, $p(THHT)$, and $p(HTTH)$.
    4. Based on the result above, what is the probability to observe exactly $2$ heads and $2$ tails?
    5. What type of experiment is this? Explain.
  5. A box contains red, green and black balls. You draw with replacement. In $30\%$ of the cases you draw red, in $20\%$ of the cases you draw green, and in $50\%$ of the cases you draw black.
    1. If you draw twice, what is the probability to draw one red and one green ball?
    2. If you draw six times, what is the probability to draw $4$ red balls?
  6. A fair die is rolled ten times. Determine the probability to observe
    1. five "6" followed by five other numbers.
    2. exactly three "6".
    3. more than three "6".
    4. a number bigger than $4$ more than three times but fewer than seven times.
    5. How often do you have to roll the die so that the probability for observing at least one "6" is $0.999999$ (or very close to it)?
Solutions C
  1. We have the following definitions:
    1. $n$-factorial $n!=n\cdot (n-1)\cdot (n-2)\cdot ...\cdot 1$ calculates the number of permutations or arrangements of a word with $n$ different letters. For example, the word REAL can be arranged in $4!=4\cdot 3\cdot 2\cdot 1 = 24$ different ways.
    2. If some of the characters are the same, we have the MISSISSIPPI-problem. The number of unique arrangements of the word MISSISSIPPI (the letters I and S each occur $4$ times, the letter P occurs twice) is $\frac{11!}{4! \cdot 4!\cdot 2!}=34\,650$.
    3. The binomial coefficient $\binom{n}{k}$ ("n choose k") denotes the number of arrangements of a word with $n$ letters that contains $k$ H and $n-k$ T. For example, the number of arrangements of the word HHHHTTTHTTHH is denoted by $\binom{12}{7}$. Using the MISSISSIPPI-formula, we get $\binom{n}{k}=\frac{n!}{k!\cdot (n-k)!}$.
  2. We have
    1. $\binom{n}{0}=1$, because there is only one arrangement of the word TTTTTT....T (no H).
    2. $\binom{n}{n}=1$, because there is only one arrangement of the word HHHHHH....H ($n$ times H).
    3. $\binom{n}{1}=n$, because there are $n$ arrangements of the word HTTTTT....T (you can shift the H to any of the $n$ positions in the word).
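The three counting formulas can be tried out numerically. The sketch below uses only the Python standard library; the helper name `distinct_arrangements` is an assumption introduced for illustration.

```python
from collections import Counter
from math import comb, factorial

def distinct_arrangements(word):
    """Number of distinct arrangements of the letters of `word`:
    n! divided by k! for every letter that occurs k times
    (the MISSISSIPPI formula)."""
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)
    return total

print(distinct_arrangements("REAL"))          # 24
print(distinct_arrangements("MISSISSIPPI"))   # 34650
print(distinct_arrangements("HHHHTTTHTTHH"))  # 792
print(comb(12, 7))                            # 792
```

For a word made only of H and T, the MISSISSIPPI formula and the binomial coefficient `comb(n, k)` give the same number, as claimed above.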
  3. We have
    1. All paths leading to 6g are indicated in the figure above (top row, yellow). We have to add them:
      1. with replacement: $p(6g)=\frac{3}{10}\cdot \frac{2}{10}+\frac{2}{10}\cdot \frac{3}{10}=\frac{3}{25}$
      2. without replacement: $p(6g)=\frac{3}{10}\cdot \frac{2}{9}+\frac{2}{10}\cdot \frac{3}{9}=\frac{2}{15}$
    2. Let $X$ be the total weight. Its distribution is
       $$\begin{array}{lllll} p(X=2)&=&\frac{3}{10}\cdot \frac{3}{10}&=&\frac{9}{100}\\[0.3em] p(X=3)&=&\frac{3}{10}\cdot\frac{5}{10}+\frac{5}{10}\cdot \frac{3}{10}&=&\frac{30}{100}\\[0.3em] p(X=4)&=&\frac{5}{10}\cdot \frac{5}{10}&=&\frac{25}{100}\\[0.3em] p(X=6)&=&\frac{3}{10}\cdot\frac{2}{10}+\frac{2}{10}\cdot \frac{3}{10}&=&\frac{12}{100}\\[0.3em] p(X=7)&=&\frac{5}{10}\cdot\frac{2}{10}+\frac{2}{10}\cdot \frac{5}{10}&=&\frac{20}{100}\\[0.3em] p(X=10)&=&\frac{2}{10}\cdot\frac{2}{10}&=&\frac{4}{100} \end{array}$$
       and the average total weight is
       $$\begin{array}{lll} \mu &=& \frac{9}{100}\cdot 2 + \frac{30}{100}\cdot 3+ \frac{25}{100}\cdot 4 + \frac{12}{100}\cdot 6 + \frac{20}{100}\cdot 7 + \frac{4}{100}\cdot 10\\ &=& 4.6 \end{array}$$
    3. With replacement, as should be intuitively clear. To show it formally, we have to show, for example, that $p(A_2|A_1)=p(A_2)$. With replacement, we get $p(A_2|A_1)=\frac{3}{10}$ (see tree above) and $p(A_2)=\frac{3}{10}\cdot\frac{3}{10}+\frac{5}{10}\cdot\frac{3}{10}+\frac{2}{10}\cdot\frac{3}{10}=\frac{3}{10}$ as well. So $A_1$ and $A_2$ are independent. We could have shown as well that $p(A_1|A_2)=p(A_1)$ or that $p(A_1\cap A_2)=p(A_1)\cdot p(A_2)$. They are all tests for independence. Now let's look at the case where we do not replace the balls. We get $p(A_2|A_1)=\frac{2}{9}$ (see tree above) and $p(A_2)=\frac{3}{10}\cdot\frac{2}{9}+\frac{5}{10}\cdot\frac{3}{9}+\frac{2}{10}\cdot\frac{3}{9}=\frac{3}{10}\neq\frac{2}{9}$. Thus, $A_1$ and $A_2$ are dependent.
    4. See tree at the bottom, the relevant paths are in yellow: $p(12g)=\frac{5}{10}\cdot \frac{2}{9}\cdot \frac{1}{8}+\frac{2}{10}\cdot \frac{5}{9}\cdot \frac{1}{8}+\frac{2}{10}\cdot \frac{1}{9}\cdot \frac{5}{8}=\frac{1}{24}$
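The tree calculations for the box with ten balls can be cross-checked by enumerating all ordered draws. This is a sketch under the assumption that every ball is equally likely to be drawn; the names `balls`, `prob`, `with_repl` and `without_repl` are hypothetical helpers, not part of the original solution.

```python
from fractions import Fraction
from itertools import permutations, product

# Weights in grams of the 10 balls: three 1g, five 2g, two 5g.
balls = [1] * 3 + [2] * 5 + [5] * 2

# With replacement: all 100 ordered pairs of balls are equally likely.
with_repl = list(product(balls, repeat=2))
# Without replacement: the 90 ordered pairs of two different balls.
without_repl = list(permutations(balls, 2))

def prob(outcomes, predicate):
    """Share of equally likely outcomes that satisfy the predicate."""
    hits = sum(1 for o in outcomes if predicate(o))
    return Fraction(hits, len(outcomes))

p6_with = prob(with_repl, lambda o: sum(o) == 6)
p6_without = prob(without_repl, lambda o: sum(o) == 6)

# Average total weight per experiment (drawing with replacement).
mean_with = Fraction(sum(sum(o) for o in with_repl), len(with_repl))

# Independence test for A1 = "1g in first draw", A2 = "1g in second draw".
def independent(outcomes):
    p_a1 = prob(outcomes, lambda o: o[0] == 1)
    p_a2 = prob(outcomes, lambda o: o[1] == 1)
    p_both = prob(outcomes, lambda o: o[0] == 1 and o[1] == 1)
    return p_both == p_a1 * p_a2

# Three draws without replacement, total weight 12g.
p12_three = prob(list(permutations(balls, 3)), lambda o: sum(o) == 12)

print(p6_with, p6_without, mean_with, p12_three)
print(independent(with_repl), independent(without_repl))
```

`itertools.permutations` treats the ten balls as distinct objects even when their weights coincide, which is what makes the outcomes equally likely.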
  4. We have:
    1. The tree is shown below.
    2. With $H_1$ = "head in first flip", $T_2$ = "tail in second flip", $H_3$ = "head in third flip", $T_4$ = "tail in fourth flip", and because the flips are independent, we have
       $$\begin{array}{lll} p(HTHT)&=&p(H_1 \cap T_2 \cap H_3 \cap T_4)\\ &=&p(H_1)\cdot p(T_2)\cdot p(H_3)\cdot p(T_4)\\ &=&0.2^2\cdot 0.8^2\\ &=& 0.0256 \end{array}$$
    3. With the same argument as above we get $p(HHTT)=p(TTHH)=p(THTH)=p(THHT)=p(HTTH)=0.2^2\cdot 0.8^2=0.0256$.
    4. Let $N$ = "number of heads". We have (see tree) $p(N=2)=\underbrace{0.2^2\cdot 0.8^2+...+0.2^2\cdot 0.8^2}_{6}=6\cdot 0.2^2\cdot 0.8^2=0.1536$.
    5. This is a binomial experiment, where the number of repetitions is $n=4$, and the Bernoulli experiment is "flipping a coin", with success probability $p=0.2$ (head). Thus, we have $p(N=2)=\binom{4}{2}\cdot 0.2^2\cdot 0.8^{4-2}=6\cdot 0.2^2\cdot 0.8^2$.
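To see why the six paths with two heads add up to the binomial formula, one can list all $2^4$ flip sequences. A minimal sketch; `sequence_probability` and the other names are illustrative assumptions.

```python
from itertools import product
from math import comb, isclose

p_of = {"H": 0.2, "T": 0.8}  # biased coin from the exercise

def sequence_probability(seq):
    """Multiply the branch probabilities along one path of the tree
    (valid because the flips are independent)."""
    prob = 1.0
    for flip in seq:
        prob *= p_of[flip]
    return prob

# All 16 paths of the tree for 4 flips.
sequences = ["".join(s) for s in product("HT", repeat=4)]

# Paths with exactly two heads, and their total probability.
two_heads = [s for s in sequences if s.count("H") == 2]
p_two_heads = sum(sequence_probability(s) for s in two_heads)

# Binomial formula for n = 4, k = 2, p = 0.2.
binomial_formula = comb(4, 2) * 0.2**2 * 0.8**2

print(two_heads)
print(round(p_two_heads, 4), round(binomial_formula, 4))
```

Every path with two heads has the same probability $0.2^2\cdot 0.8^2$, so only the number of such paths matters.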
  5. We have
    1. $p=0.3\cdot 0.2+0.2\cdot 0.3=0.12$ (see tree below).
    2. Binomial experiment, where the Bernoulli experiment with success probability $p=0.3$ (red) is repeated $n=6$ times. Thus, $p(4\text{ red})=\binom{6}{4}\cdot 0.3^4\cdot 0.7^{6-4}=15\cdot 0.3^4\cdot 0.7^2\approx 0.0595$. Or use $\texttt{binompdf(6,0.3,4)}\approx 0.0595$.
  6. Binomial experiment with $n=10$ repetitions of the Bernoulli experiment with success probability $p=\frac{1}{6}$ (a six). Denote by $N$ the number of observed "6".
    1. $p("66666xxxxx")=\left(\frac{1}{6}\right)^5\cdot \left(\frac{5}{6}\right)^5\approx 5.17\cdot 10^{-5}$
    2. $p(N=3)=\texttt{binompdf(10,1/6,3)}=0.155$
    3. $p(N>3)=1-p(N\leq 3)=1-\texttt{binomcdf(10,1/6,3)}=0.0697$
    4. The success probability is now $p=2/6=1/3$ (probability to observe a number $>4$), and $N$ counts these successes. Thus
       $$\begin{array}{lll} p(3<N<7)&=&p(N\leq 6)-p(N\leq 3)\\ &=&\texttt{binomcdf(10,1/3,6)}-\texttt{binomcdf(10,1/3,3)}\\ &=&0.980-0.559\\ &=&0.421 \end{array}$$
    5. Find $n$ with $p(N\geq 1)= 0.999999$. Because of $p(N\geq 1)=1-p(N<1)=1-p(N=0)$ and $p(N=0)=\binom{n}{0} \cdot \left(\frac{1}{6}\right)^0 \cdot \left(\frac{5}{6}\right)^n= \left(\frac{5}{6}\right)^n$ we have to solve the equation
       $$\begin{array}{lll} 1-\left(\frac{5}{6}\right)^n & = 0.999999\\ \left(\frac{5}{6}\right)^n & = 0.000001\\ \ln\left(\left(\frac{5}{6}\right)^n\right) & = \ln(0.000001)\\ n\cdot \ln(5/6) & = \ln(0.000001)\\ n & =\frac{\ln(0.000001)}{\ln(5/6)}=75.77 \end{array}$$
       So $n=76$.
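The calculator functions `binompdf` and `binomcdf` used above can be imitated in a few lines. The following is a sketch, assuming the usual calculator argument order (n, p, k); the Python names `binom_pdf` and `binom_cdf` are stand-ins, not a real library API.

```python
from math import ceil, comb, log

def binom_pdf(n, p, k):
    """p(N = k) for a binomial experiment with n repetitions."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(n, p, k):
    """p(N <= k): sum of the pdf from 0 up to k."""
    return sum(binom_pdf(n, p, i) for i in range(k + 1))

# Exercise 6, fair die rolled ten times.
p_five_sixes_first = (1 / 6)**5 * (5 / 6)**5        # one fixed sequence
p_exactly_three = binom_pdf(10, 1 / 6, 3)
p_more_than_three = 1 - binom_cdf(10, 1 / 6, 3)
p_between = binom_cdf(10, 1 / 3, 6) - binom_cdf(10, 1 / 3, 3)

# Smallest n with 1 - (5/6)^n >= 0.999999.
n_rolls = ceil(log(0.000001) / log(5 / 6))

print(round(p_exactly_three, 3))    # 0.155
print(round(p_more_than_three, 4))  # 0.0697
print(round(p_between, 3))          # 0.421
print(n_rolls)                      # 76
```

Rounding up with `ceil` reflects that the number of rolls must be a whole number and that $75$ rolls would not quite reach the target probability.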
D. Random variables and distributions, normal distribution
  1. What is a random variable? What types are there?

  2. You play a game which you can win with probability $0.7$. If you win you get $3$ CHF, otherwise you lose $5$ CHF.

    1. $W$ = "amount won". Is this a discrete or a continuous random variable? What is the probability distribution of $W$?
    2. Determine the mean of $W$. Is this a fair game, in the sense that on average you will neither win nor lose money?
    3. If you play the game, by how much does the typical amount of money you win per game deviate from this mean?
  3. You play a game with a biased coin ($p(H)=0.4$, $p(T)=0.6$). The number of times you play is $20$.

    1. Let $N$ denote the number of heads. What is the probability distribution of $N$?
    2. If head occurs, you win $1$ CHF, otherwise you lose $0.5$ CHF. How much will you win on average after these twenty games?
  4. What is a continuous probability distribution function? Explain.

  5. Consider a random variable $X$ with the continuous probability distribution function

    $$f(x)=\begin{cases} \frac{a}{x} & x\in [1,2]\\0 & x\not\in [1,2] \end{cases}$$

    where $a$ is a constant.

    1. Find $a$.
    2. Determine the probability $p(1.6\leq X\leq 1.9)$.
    3. Determine the average and the standard deviation of $X$.
  6. The table below shows the heights (in cm) of pupils.

    $$\begin{array}{c|c} \text{class (cm)} & \text{frequency}\\\hline 150-160 & 20 \\ 160-170 & 30 \\ 170-180 & 50 \\ 180-190 & 20 \\ \end{array}$$
    1. Determine and sketch the histogram of the heights.
    2. Sketch the corresponding normal distribution in the same coordinate system. The mean of the heights is $175\,\mathrm{cm}$, the standard deviation is $10\,\mathrm{cm}$.
    3. Also, based on the normal distribution, determine the probability that a randomly selected pupil has a height between $160\,\mathrm{cm}$ and $180\,\mathrm{cm}$, and also the probability that the height is smaller than $160\,\mathrm{cm}$.
Solutions D
  1. A random variable $X$ is a function which maps every outcome of a sample space to a numerical value. If the numerical values are discrete (e.g. $1, 2, 3, ...$) we call the random variable discrete. If the numerical values can be any value in a given interval (e.g. measurement values), we call the random variable continuous.
  2. Random variable $W$ is the amount of money you win.
    1. $W$ is discrete, as it can only take on the values $-5$ and $3$. The distribution of this random variable is $p(W=3)=0.7$, $p(W=-5)=0.3$.
    2. The average amount you win is $\mu = p(W=3)\cdot 3 + p(W=-5)\cdot (-5)=0.7\cdot 3-0.3\cdot 5=0.6$. So it is not a fair game, on average you win $0.6$ CHF per game.
    3. The typical deviation from this mean is
       $$\begin{array}{lll} \sigma &=&\sqrt{p(W=3)(3-0.6)^2+p(W=-5)(-5-0.6)^2}\\ &=&\sqrt{0.7\cdot 2.4^2+0.3\cdot (-5.6)^2}\\ &=&3.67 \end{array}$$
  3. Let $N$ be the number of heads (this is a discrete random variable). It is a binomial experiment, where the Bernoulli experiment (with success probability $p=p(H)=0.4$) is repeated $n=20$ times.
    1. The probability distribution is $p(N=k)=\binom{20}{k} \cdot 0.4^k \cdot 0.6^{20-k}$ where $k=0,1,2,...,20$.
    2. The average number of heads is $\mu = np=20\cdot 0.4=8$. So on average you win $8$ times, and lose $12$ times. Thus, the average win is $8\cdot 1+ 12\cdot (-0.5) = 2$ CHF.
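The mean and standard deviation of a discrete random variable follow directly from its distribution table. A short sketch, with the distribution stored as a dictionary from value to probability; the helper names `mean` and `std` are assumptions for illustration.

```python
from math import sqrt

def mean(dist):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(p * x for x, p in dist.items())

def std(dist):
    """Standard deviation: root of the probability-weighted squared
    deviations from the mean."""
    mu = mean(dist)
    return sqrt(sum(p * (x - mu)**2 for x, p in dist.items()))

# Exercise 2: win 3 CHF with probability 0.7, lose 5 CHF otherwise.
W = {3: 0.7, -5: 0.3}
mu_W = mean(W)
sigma_W = std(W)

# Exercise 3: 20 flips with p(H) = 0.4, +1 CHF per head, -0.5 CHF per tail.
n, p = 20, 0.4
expected_heads = n * p
expected_win = expected_heads * 1 + (n - expected_heads) * (-0.5)

print(round(mu_W, 2), round(sigma_W, 2))               # 0.6 3.67
print(round(expected_heads, 2), round(expected_win, 2))  # 8.0 2.0
```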
  4. Consider a continuous random variable $X$. The probability distribution function of $X$ is the function $f_X$ that emerges if we make the class width of the histogram of $X$ (or of continuous data) smaller and smaller. In particular, the area under the graph of $f_X$ is the probability: $p(a\leq X\leq b)= \int_a^b f_X(x)\, dx$
  5. The antiderivative of $f_X(x)=\frac{a}{x}$ is $F(x)=a\cdot \ln(x)$.
    1. The area under the curve has to be $1$ ($p(1\leq X\leq 2)=1$, because no other values are possible for $X$). So we have $1=\int_1^2 \frac{a}{x}\, dx=F(2)-F(1)=a\ln(2)-a\ln(1)=a\ln(2)\rightarrow a=\frac{1}{\ln(2)}\approx 1.443$
    2. $p(1.6\leq X\leq 1.9)=\int_{1.6}^{1.9} \frac{1.443}{x}\, dx = F(1.9)-F(1.6)\approx 0.248$
    3. $\mu=\int_{1}^{2} x\cdot \frac{1.443}{x}\, dx =\int_{1}^{2} 1.443\, dx = 1.443$
    4. $\sigma^2=\int_{1}^{2} (x-1.443)^2 \cdot \frac{1.443}{x}\, dx = ... \approx 0.0827$, thus $\sigma=\sqrt{0.0827}\approx 0.288$
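The integrals for the density $\frac{a}{x}$ can be approximated numerically as a sanity check. The sketch below uses a simple midpoint rule; the function names `density` and `integrate` and the step count are assumptions, and the printed values are approximations.

```python
from math import log, sqrt

a = 1 / log(2)  # normalisation constant, about 1.443

def density(x):
    """f(x) = a/x on [1, 2], zero elsewhere."""
    return a / x if 1 <= x <= 2 else 0.0

def integrate(g, lo, hi, steps=100_000):
    """Midpoint rule: sum of rectangle areas of width h."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

total_area = integrate(density, 1, 2)
p_interval = integrate(density, 1.6, 1.9)
mu = integrate(lambda x: x * density(x), 1, 2)
variance = integrate(lambda x: (x - mu)**2 * density(x), 1, 2)
sigma = sqrt(variance)

print(round(total_area, 3), round(p_interval, 3))
print(round(mu, 3), round(variance, 4), round(sigma, 3))
```

The midpoint rule mirrors the histogram idea from solution 4: the area under the curve is approximated by many narrow rectangles.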
  6. Let $X$ denote the height of the pupils (continuous random variable).
    1. The class width is $\Delta x=10$, the number of pupils is $120$. Thus, the relative frequency $y_i$ and the density $d_i=y_i/\Delta x$ are

      $$\begin{array}{c|c|c|c} \text{class } i & \text{freq} & y_i & d_i\\\hline 150-160 & 20 & 0.167 & 0.0167\\ 160-170 & 30 & 0.25 & 0.025\\ 170-180 & 50 & 0.417 & 0.0417\\ 180-190 & 20 & 0.167 & 0.0167\\ \end{array}$$

      The histogram is shown below.

    2. The normal distribution has the parameters $\mu=175$ and $\sigma =10$ (that is, $f_{175,10}$). The highest point is at $y\approx\frac{0.4}{\sigma}=\frac{0.4}{10}=0.04$. The sketch of the graph is shown below.

    3. $p(160\leq X\leq 180)=\int_{160}^{180} f_{175,10}(x)\, dx = 0.6247$ (use the calculator function normcdf to evaluate the integral).

      $p(X\leq 160)=\int_{-\infty}^{160} f_{175,10}(x)\, dx=0.0668$ (again, use normcdf; instead of $-\infty$ use $-1000$ or so).
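As an alternative to the calculator's normcdf, the same areas can be obtained from Python's standard library. A sketch using `statistics.NormalDist` (available from Python 3.8); the variable names are illustrative assumptions.

```python
from statistics import NormalDist

# Normal distribution with mean 175 cm and standard deviation 10 cm.
heights = NormalDist(mu=175, sigma=10)

# Area between 160 and 180 is the difference of two cumulative values.
p_160_to_180 = heights.cdf(180) - heights.cdf(160)

# Area to the left of 160 (no need to replace minus infinity by -1000).
p_below_160 = heights.cdf(160)

# Height of the density at its peak, close to 0.4 / sigma.
peak = heights.pdf(175)

print(round(p_160_to_180, 4))  # 0.6247
print(round(p_below_160, 4))   # 0.0668
print(round(peak, 4))          # 0.0399
```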

E. Pokémon TCG

In the mobile game Pokémon TCG Pocket, you can open digital booster packs of 5 Pokémon cards. The rarest card in the "Mythical Island" theme pack is the Mew Ex Gold (only called Mew from here on).

Each time you open a pack, it is randomly assigned to be either a regular pack (99.95%) or a rare pack (0.05%). The five cards in the pack are then generated. Usually the first cards are the most common cards and the rare cards appear as one of the last cards in a pack.

In a regular pack, the fourth card has a 0.04% chance of being Mew; the fifth card has a 0.16% chance of being Mew. The other cards will never be Mew. In a rare pack, every card has a probability of $\frac{1}{18}$ to be Mew.

Each card is independent of the others. In particular, it is possible that two cards are the same.

Determine the probability...

  1. ...that there is at least one Mew in a regular pack.

  2. ...that there is exactly one Mew in a regular pack.

  3. ...that there is at least one Mew in a rare pack.

  4. ...that there is exactly one Mew in a rare pack.

  5. ...that there is at least one Mew in a random pack you open.

  6. ...that there is exactly one Mew in a random pack you open.

  7. How many packs do you have to open, so that the probability to get at least one Mew is at least 90%?

Solutions E
  1. $P(\text{at least one Mew in a regular})=1- 99.96\%\cdot 99.84\%= 0.199936\%$
  2. $P(\text{exactly one Mew in a regular})=1- 99.96\%\cdot 99.84\% - 0.04\%\cdot 0.16\%= 0.199872\%$ or $P(\text{exactly one Mew in a regular})=0.04\%\cdot 99.84\% + 99.96\%\cdot 0.16\%= 0.199872\%$
  3. $P(\text{at least one Mew in a rare})=1- \left(\frac{17}{18}\right)^5 \approx 24.86\%$
  4. $P(\text{exactly one Mew in a rare})=\operatorname{binompdf}\left(5, \frac{1}{18}, 1\right) = \binom{5}{1}\cdot\left(\frac{1}{18}\right)\cdot\left(\frac{17}{18}\right)^4\approx 22.10\%$
  5. $P(\text{at least one Mew})\approx 0.05\%\cdot 24.86\% + 99.95\%\cdot 0.199936\%\approx 0.012\% + 0.199836\%\approx 0.212\%$
  6. $P(\text{exactly one Mew})\approx 0.05\%\cdot 22.10\% + 99.95\%\cdot 0.199872\%\approx 0.011\% + 0.19977\%\approx 0.2108\%$
  7. You need to open at least 1084 booster packs. Using the unrounded probability $0.212266\%$ from 5., so that $1-0.212266\%\approx 0.997877$:
$$\begin{align*} 90\% & \leq 1- (1-0.212266\%)^n \\ (0.997877)^n & \leq 10 \% \\ n & \geq \log_{0.997877}(0.1) \approx 1083.6 \end{align*}$$
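The seven results can be reproduced step by step. The following is a minimal sketch based only on the probabilities stated in the exercise text; all variable names are assumptions chosen for readability.

```python
from math import ceil, comb, log

p_rare_pack = 0.0005        # 0.05% of packs are rare packs
p4, p5 = 0.0004, 0.0016     # Mew chance of card 4 and card 5 (regular pack)
p_card_rare = 1 / 18        # Mew chance of every card in a rare pack

# Regular pack: only cards 4 and 5 can be Mew, and they are independent.
reg_at_least_one = 1 - (1 - p4) * (1 - p5)
reg_exactly_one = p4 * (1 - p5) + (1 - p4) * p5

# Rare pack: binomial experiment with n = 5 and p = 1/18.
rare_at_least_one = 1 - (1 - p_card_rare)**5
rare_exactly_one = comb(5, 1) * p_card_rare * (1 - p_card_rare)**4

# Random pack: weight both pack types by their probabilities (tree).
any_at_least_one = (p_rare_pack * rare_at_least_one
                    + (1 - p_rare_pack) * reg_at_least_one)
any_exactly_one = (p_rare_pack * rare_exactly_one
                   + (1 - p_rare_pack) * reg_exactly_one)

# Smallest n with 1 - (1 - p)^n >= 0.9, using the unrounded p.
packs_needed = ceil(log(0.1) / log(1 - any_at_least_one))

print(f"{any_at_least_one:.4%}", f"{any_exactly_one:.4%}", packs_needed)
```

Working with the unrounded probability matters here: rounding it to $0.212\%$ before taking the logarithm would give 1085 packs instead of 1084.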