
Let $f$ be a non-negative function on the real number line and suppose

$$\int_{-\infty}^\infty f(x)dx ~ = ~ 1$$

Then $f$ is called a probability density function or just density for short.

In the next section we will discuss the reason behind the name. For now, imagine the graph of $f$ as a kind of continuous probability histogram. We will soon make that precise, but notice that by definition the total area under a density curve has to be 1.

As an example, the function $f$ defined by

$$f(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 6x(1-x) & \text{if } 0 < x < 1 \\ 0 & \text{if } x \ge 1 \end{cases}$$

is a density. It is easy to check by calculus that it integrates to 1.
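
For reference, the check can be written out as follows. The integrand is 0 outside the unit interval, so only the integral over $(0, 1)$ contributes:

$$\int_{-\infty}^\infty f(x)dx ~ = ~ \int_0^1 (6x - 6x^2)dx ~ = ~ \Big[ 3x^2 - 2x^3 \Big]_0^1 ~ = ~ 3 - 2 ~ = ~ 1$$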

Note: The calculus used in this text is very straightforward. You should be able to do it easily by hand. Later in this chapter we will give you some Python tools for calculus. We will also show how understanding probability can help us do calculus quickly.

Here is a graph of the function $f$. The density puts all the probability on the unit interval.

<Figure size 432x288 with 1 Axes>

15.1.1 Density is Not the Same as Probability

In the example above, $f(0.5) = 6/4 = 1.5 > 1$. Indeed, there are many values of $x$ for which $f(x) > 1$. So the values of $f$ are clearly not probabilities.
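
To make "many values" concrete, here is a minimal Python sketch. The function name `f` and the variables `lo` and `hi` are illustrative choices, not names from the text. It uses the fact that $6x(1-x) > 1$ exactly when $6x^2 - 6x + 1 < 0$, that is, when $x$ lies strictly between the two roots of that quadratic.

```python
import math


def f(x):
    """Example density: 6x(1-x) on (0, 1), and 0 elsewhere."""
    return 6 * x * (1 - x) if 0 < x < 1 else 0.0


# Roots of 6x^2 - 6x + 1 = 0; the density exceeds 1 strictly between them.
lo = (3 - math.sqrt(3)) / 6
hi = (3 + math.sqrt(3)) / 6

print("f(0.5) =", f(0.5))
print("f(x) > 1 for x between", round(lo, 4), "and", round(hi, 4))
```

The interval where the density exceeds 1 is centered at 0.5, where the curve peaks.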

Then what are they? We’ll study that in the next section. In this section we will see that we can work with densities just as we did with the normal curve.

First, a labor-saving device: If $f$ is positive only on a subinterval of the line, then usually we will just write its definition on the interval where it is positive. It will be assumed to be 0 elsewhere.

$$f(x) ~ = ~ 6x(1-x), ~~~ 0 < x < 1$$

And we will draw the graph of $f$ only over the region where it is positive:

<Figure size 432x288 with 1 Axes>

15.1.2 Areas are Probabilities

A random variable $X$ is said to have density $f$ if for every pair $a < b$,

$$P(a < X \le b) ~ = ~ \int_a^b f(x)dx$$

This integral is the area between $a$ and $b$ under the density curve. The graph below shows the area corresponding to $P(0.6 < X \le 0.8)$ for a random variable $X$ that has the density in our example.

<Figure size 432x288 with 1 Axes>

The area is

$$P(0.6 < X \le 0.8) ~ = ~ \int_{0.6}^{0.8} 6x(1-x)dx ~ = ~ 0.248$$
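
The value 0.248 can also be checked numerically. The sketch below is not the method used in the text: it approximates the area with composite Simpson's rule, written in plain Python. The names `f` and `simpson` are hypothetical helpers introduced here for illustration. Simpson's rule is exact for quadratic integrands, so the result should match the hand calculation up to floating-point rounding.

```python
def f(x):
    """Example density: 6x(1-x) on (0, 1), and 0 elsewhere."""
    return 6 * x * (1 - x) if 0 < x < 1 else 0.0


def simpson(g, a, b, n=100):
    """Approximate the integral of g over [a, b] by composite Simpson's rule.

    n is the number of subintervals and must be even.
    """
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


prob = simpson(f, 0.6, 0.8)   # area under the density between 0.6 and 0.8
total_area = simpson(f, 0, 1)  # total area under the density

print(round(prob, 6), round(total_area, 6))
```

The second value is a check that the total area under the density is 1.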

15.1.3 Cumulative Distribution Function (CDF)

The cdf of $X$ is the function $F$ defined by

$$F(x) ~ = ~ P(X \le x) ~ = ~ \int_{-\infty}^x f(s)ds$$

You are already familiar with the definition $F(x) = P(X \le x)$. What’s new is that we can compute the probability by integrating the density function.

In our example, the only possible values of the random variable $X$ are between 0 and 1, so $F(x) = 0$ for $x \le 0$ and $F(x) = 1$ for $x \ge 1$. For $x$ between 0 and 1,

$$F(x) ~ = ~ \int_0^x 6s(1-s)ds ~ = ~ 3x^2 - 2x^3$$
<Figure size 432x288 with 1 Axes>

In terms of the graph of the density, $F(x)$ is all the area to the left of $x$ under the density curve. The graph below shows the area corresponding to $F(0.8)$.

<Figure size 432x288 with 1 Axes>
$$P(X \le 0.8) ~ = ~ F(0.8) ~ = ~ 3\cdot0.8^2 - 2\cdot0.8^3 ~ = ~ 0.896$$

As before, the cdf can be used to find probabilities of intervals. For every pair $a < b$,

$$P(a < X \le b) ~ = ~ F(b) - F(a)$$
<Figure size 432x288 with 1 Axes>
\begin{align*} F(0.6) ~ &= ~ 3\cdot0.6^2 - 2\cdot0.6^3 ~ = ~ 0.648 \\ F(0.8) - F(0.6) ~ &= ~ 0.896 - 0.648 ~ = ~ 0.248 \end{align*}

That’s the same as the answer we got earlier in the section by integrating the density between 0.6 and 0.8.
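
The same arithmetic can be expressed as a short Python sketch. The function names `F` and `interval_prob` are illustrative and do not come from the text. The code implements the piecewise cdf of the example and the difference $F(b) - F(a)$.

```python
def F(x):
    """CDF of the example: 0 for x <= 0, 3x^2 - 2x^3 on (0, 1), 1 for x >= 1."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    return 3 * x**2 - 2 * x**3


def interval_prob(a, b):
    """P(a < X <= b) = F(b) - F(a), assuming a < b."""
    return F(b) - F(a)


print(round(F(0.8), 3), round(F(0.6), 3), round(interval_prob(0.6, 0.8), 3))
```

Up to floating-point rounding, the three printed values should agree with the hand calculations of $F(0.8)$, $F(0.6)$, and their difference.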

By the Fundamental Theorem of Calculus, the density and cdf can be derived from each other:

$$F(x) = \int_{-\infty}^x f(s)ds \qquad\qquad f(x) = \frac{d}{dx}F(x)$$

You can use whichever of the two functions is more convenient in a particular application.
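
As a numerical spot check of the derivative relationship for the example, the sketch below compares a central-difference approximation of $F'(x)$ with $f(x)$ at a few points inside the unit interval. The helper `central_diff` and the step size `h` are assumptions made for this illustration. The comparison is approximate, not a proof.

```python
def f(x):
    """Example density: 6x(1-x) on (0, 1), and 0 elsewhere."""
    return 6 * x * (1 - x) if 0 < x < 1 else 0.0


def F(x):
    """CDF of the example: 0 for x <= 0, 3x^2 - 2x^3 on (0, 1), 1 for x >= 1."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    return 3 * x**2 - 2 * x**3


def central_diff(g, x, h=1e-5):
    """Approximate g'(x) by the symmetric difference quotient."""
    return (g(x + h) - g(x - h)) / (2 * h)


for x in (0.2, 0.5, 0.8):
    # The two numbers on each line should be close to each other.
    print(x, f(x), central_diff(F, x))
```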

Also keep in mind that every cdf $F$ satisfies some basic properties:

  • $F(x) \to 0$ as $x \to -\infty$

  • $F$ is non-decreasing

  • $F(x) \to 1$ as $x \to \infty$
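
For the example cdf, these properties can be checked on a grid of points. The sketch below is only an illustration: the grid from -2 to 3 and the variable names are arbitrary choices. In this example the limits 0 and 1 are actually attained outside the unit interval, which need not happen for other distributions.

```python
def F(x):
    """CDF of the example: 0 for x <= 0, 3x^2 - 2x^3 on (0, 1), 1 for x >= 1."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    return 3 * x**2 - 2 * x**3


# Evenly spaced points from -2 to 3 in steps of 0.01.
grid = [i / 100 for i in range(-200, 301)]
values = [F(x) for x in grid]

# Each value should be at least as large as the one before it.
non_decreasing = all(u <= v for u, v in zip(values, values[1:]))

print(non_decreasing, values[0], values[-1])
```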