The normal distribution
We now discuss the most important probability density function. It is used for data that cluster around a single value. Here is the definition:
A continuous random variable $X$ is called normally distributed with mean $\mu$ and standard deviation $\sigma$, if the probability density function of $X$ is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Note that the mean $\mu$ and the standard deviation $\sigma$ appear in the definition of the density function. Depending on these values (called the parameters of the function), the graph of the function will look different.
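To see how the two parameters enter the density, it can help to evaluate it directly. The following is a minimal sketch in Python, assuming the usual formula $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}$; the function name `normal_pdf` and the sample parameter values are our own choices for illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean, measured in standard deviations
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Illustrative parameter pairs (not from the text): print the height at the mean.
for mu, sigma in [(0.0, 1.0), (2.0, 1.0), (0.0, 2.0)]:
    print(mu, sigma, normal_pdf(mu, mu, sigma))
```

Changing `mu` moves the curve along the $x$-axis without changing its height, while doubling `sigma` halves the height at the mean.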
The simplest case
Let us focus for the moment on the simplest case where $\mu = 0$ and $\sigma = 1$ ($\sigma = 0$ does not make sense, or at least is not interesting). We then get
$$f(x) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{x^2}{2}}$$
which looks like a bell (see below), and for this reason is also called the bell curve.
Let's discuss some key features of this graph.
a)
At $x = 0$ we get
$$f(0) = \frac{1}{\sqrt{2\pi}} \approx 0.4$$
which is easily seen on the figure above as well.
b)
To see that $f$ has a maximum at $x = 0$, we can invoke differential calculus. We have to show that $f'(0) = 0$ and $f''(0) < 0$. See exercise 1 below.
c)
And if we write
$$f(x) = \frac{1}{\sqrt{2\pi}} \cdot \frac{1}{e^{x^2/2}}$$
we see that as $x$ approaches $\pm\infty$, the graph approaches the $x$-axis, but $f(x)$ is always $> 0$.
d)
We define the width of the normal distribution as the distance between the inflection points of the graph (the two points marked on the graph above). Recall that an inflection point is defined as a point on the graph where the slope of the graph changes from increasing to decreasing or vice versa.
Using differential calculus we can show that the inflection points lie at $x = -1$ and $x = 1$. Thus, the width of the graph is $2$ (see exercise 1 below).
e)
Recall that the probability density function follows the height of the bars in histograms (the density), and therefore is proportional to the relative frequency of the data points in each bin. So we see that the graph of $f$ describes data which is grouped around $0$ and thins out in both directions.
f)
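For $f$ we have
$$\int_{-\infty}^{\infty} x \, f(x) \, dx = 0$$
(thus, the mean is indeed $0$) and
$$\int_{-\infty}^{\infty} x^2 \, f(x) \, dx = 1$$
(the variance is $1$, so the standard deviation is indeed $1$).
Again, see exercise 1 for the proof.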
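The two integrals for the mean and the standard deviation can also be checked numerically. The sketch below assumes the standard normal density $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ and uses a simple midpoint rule; the interval $[-10, 10]$, the number of steps and the helper name `midpoint_integral` are our own illustrative choices, not part of the text.

```python
import math

def f(x):
    """Standard normal density (mu = 0, sigma = 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def midpoint_integral(g, a, b, n=20000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Assumption: the tails beyond |x| = 10 are negligible,
# so [-10, 10] stands in for the whole real line.
area = midpoint_integral(f, -10.0, 10.0)
mean = midpoint_integral(lambda x: x * f(x), -10.0, 10.0)
variance = midpoint_integral(lambda x: x * x * f(x), -10.0, 10.0)

print("area:", area)
print("mean:", mean)
print("standard deviation:", math.sqrt(variance))
```

This is a plausibility check rather than a proof: the printed values should be very close to $1$, $0$ and $1$.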
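Show that for $f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$ the following is correct:
- it has a local maximum at $\left(0, \frac{1}{\sqrt{2\pi}}\right) \approx (0, 0.4)$
- the inflection points have the coordinates $(-1, 0.242)$ and $(1, 0.242)$
- $\int_{-\infty}^{\infty} x \, f(x) \, dx = 0$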
Solution
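Applying the chain rule, we get for the derivative of $f(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$ the following:
$$f'(x) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{x^2}{2}} \cdot (-x) = -x \, f(x)$$
and for the second derivative we get
$$f''(x) = -f(x) - x \, f'(x) = (x^2 - 1) \, f(x)$$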
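- To find the maximum, we have to find an $x$ with
$$f'(x) = 0,$$
that is
$$-x \cdot \frac{1}{\sqrt{2\pi}} \, e^{-\frac{x^2}{2}} = 0,$$
and this is only possible for $x = 0$. And because $f''(0) = -\frac{1}{\sqrt{2\pi}} < 0$, we are indeed talking about a maximum. The $y$-coordinate of the peak is
$$f(0) = \frac{1}{\sqrt{2\pi}} \approx 0.4.$$
Thus we have the maximum at $(0, 0.4)$.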
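- To find the inflection points we have to find $x$ with
$$f''(x) = 0.$$
Thus find $x$ with
$$(x^2 - 1) \cdot \frac{1}{\sqrt{2\pi}} \, e^{-\frac{x^2}{2}} = 0;$$
we see that this is possible only for $x^2 - 1 = 0$, that is, if $x = -1$ or $x = 1$. Using the calculator, we get $f(-1) = f(1) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}} \approx 0.242$. We thus get the points $(-1, 0.242)$ and $(1, 0.242)$.
To be sure that these are inflection points, we should also calculate the third derivative and check that $f'''(-1) \neq 0$ and $f'''(1) \neq 0$. This is left to the reader.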
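- We have
$$\int_{-\infty}^{\infty} x \, f(x) \, dx = \lim_{a \to \infty} \Big[ F(x) \Big]_{-a}^{a} = \lim_{a \to \infty} \big( F(a) - F(-a) \big) = 0$$
where $F(x) = -\frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$ is the anti-derivative of $x \, f(x)$. The difference vanishes because $F(a) = F(-a)$ for every $a$.
So the mean is indeed $0$.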
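The calculus results of this solution can be cross-checked numerically. The following is a minimal sketch, assuming the standard normal density $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$; the helper names, the finite-difference step sizes and the offset of $0.1$ around the candidate points are our own illustrative choices.

```python
import math

def f(x):
    """Standard normal density (mu = 0, sigma = 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def first_derivative(g, x, h=1e-5):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

def second_derivative(g, x, h=1e-4):
    """Central-difference approximation of g''(x)."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)

# Maximum: the slope vanishes at x = 0 and the curvature there is negative.
d1 = first_derivative(f, 0.0)
d2 = second_derivative(f, 0.0)
print("f'(0)  ~", d1)
print("f''(0) ~", d2)

# Inflection points: f'' has opposite signs slightly left and right of x = -1 and x = 1.
for x0 in (-1.0, 1.0):
    left = second_derivative(f, x0 - 0.1)
    right = second_derivative(f, x0 + 0.1)
    print(x0, "sign change:", left * right < 0, "height:", f(x0))
```

A sign change of the second derivative is an alternative to the third-derivative test for confirming an inflection point.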
The general case
For the general case the properties are similar. Here they are (without proof):
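Consider a normally distributed random variable $X$ with mean $\mu$ and standard deviation $\sigma$. We have the following properties of its density function $f$ (see figure below):
- the peak is at $x = \mu$, and its height is $\frac{1}{\sigma\sqrt{2\pi}} \approx \frac{0.4}{\sigma}$
- the inflection points are one $\sigma$ away from the mean, at $x = \mu - \sigma$ and $x = \mu + \sigma$, and have height $\frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}} \approx \frac{0.242}{\sigma}$
- the width is $2\sigma$
- the points on the graph that are $k\sigma$ away from $\mu$ have height $\frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{k^2}{2}}$ (for $k = 2$ this is about $\frac{0.054}{\sigma}$)
- $\int_{-\infty}^{\infty} x \, f(x) \, dx = \mu$ (the mean of $X$ is $\mu$)
- $\int_{-\infty}^{\infty} (x - \mu)^2 \, f(x) \, dx = \sigma^2$ (the standard deviation of $X$ is $\sigma$)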
So note that $\mu$ tells you where the peak of the graph of $f$ is along the $x$-axis, and $\sigma$ tells you how wide the graph is. The larger $\sigma$ is, the flatter the graph will be. This makes sense, as the total area under the graph must be $1$: a wider curve has to be lower to enclose the same area.
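The properties of the general case can be turned into a small helper that lists the key points needed for a sketch. This is only an illustration under the assumption that the density is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}$; the function names `normal_pdf` and `key_points`, as well as the parameter values $\mu = 10$ and $\sigma = 2$, are hypothetical and chosen only for the example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def key_points(mu, sigma):
    """Return the peak, the two inflection points and the width of the density graph."""
    peak = (mu, normal_pdf(mu, mu, sigma))
    inflections = [
        (mu - sigma, normal_pdf(mu - sigma, mu, sigma)),
        (mu + sigma, normal_pdf(mu + sigma, mu, sigma)),
    ]
    width = 2.0 * sigma  # distance between the two inflection points
    return peak, inflections, width

# Hypothetical parameter values, chosen only for illustration.
peak, inflections, width = key_points(mu=10.0, sigma=2.0)
print("peak:", peak)
print("inflection points:", inflections)
print("width:", width)
```

Calling `key_points` with a larger `sigma` shows the trade-off described above: the width grows while the peak height shrinks in proportion.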
A random variable $X$ with mean $\mu$ and standard deviation $\sigma$ is normally distributed. Sketch the graph of the density function $f$ by first indicating the maximum point, the two inflection points, and the height of the graph a fixed multiple of $\sigma$ (for example $2\sigma$) away from $\mu$.
To verify the sketch, plot the density function with the calculator (or Geogebra).
Solution
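Substituting the given values of $\mu$ and $\sigma$ into the density, we have to draw the graph of the function $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$.
The coordinates of the points are $\left(\mu, \frac{1}{\sigma\sqrt{2\pi}}\right)$ for the maximum and $\left(\mu \pm \sigma, \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}}\right)$ for the inflection points.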
