Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

🎥 See More
Loading...

A function ff on the plane is called a joint density if:

  • f(x,y)0f(x, y) \ge 0 for all xx, yy

  • f(x,y)dydx=1\int_{-\infty}^\infty \int_{-\infty}^\infty f(x, y)dydx = 1

If you think of ff as a surface, then the first condition says that the surface is on or above the plane. The second condition says that the total volume under the surface is 1.

Think of probabilities as volumes under the surface, and define ff to be the joint density of random variables XX and YY if

P((X,Y)A) = Af(x,y)dydx     for all AP((X, Y) \in A) ~ = ~ \mathop{\int \int}_A f(x,y)dydx ~~~~~ \text{for all } A

That is, the chance that the random point (X,Y)(X, Y) falls in the region AA is the volume under the joint density surface over the region AA.

This is a two-dimensional analog of the fact that probabilities involving a single continuous random variable can be thought of as areas under the density curve.

🎥 See More
Loading...

Infinitesimals

Also analogous is the interpretation of the joint density as part of the calculation of the probability of an infinitesimal region.

joint density matchstick

The infinitesimal region is a tiny rectangle in the plane just around the point (x,y)(x, y). Its width is dxdx and its length is dydy. The corresponding volume is that of a rectangular box whose base is the tiny rectangle and whose height is f(x,y)f(x, y).

Thus for all xx and yy,

P(Xdx,Ydy)  f(x,y)dxdyP(X \in dx, Y \in dy) ~ \sim ~ f(x, y)dxdy

and the joint density measures probability per unit area:

f(x,y)  P(Xdx,Ydy)dxdyf(x, y) ~ \sim ~ \frac{P(X \in dx, Y \in dy)}{dxdy}

An example will help us visualize all this. Let ff be defined as follows:

f(x,y) = {120x(yx)(1y),   0<x<y<10        otherwisef(x, y) ~ = ~ \begin{cases} 120x(y-x)(1-y), ~~~ 0 < x < y < 1 \\ 0 ~~~~~~~~ \text{otherwise} \end{cases}

For now, just assume that this is a joint density, that is, it integrates to 1. Let’s first take a look at what the surface looks like.

Plotting the Surface

To do this, we will use a 3-dimensional plotting routine. First, we define the joint density function. For use in our plotting routine, this function must take xx and yy as its inputs and return the value f(x,y)f(x, y) as defined above.

def joint(x,y):
    if y < x:
        return 0
    else:
        return 120 * x * (y-x) * (1-y)

Now we will call Plot_3d to plot the surface. The arguments are the limits on the xx and yy axes, the name of the function to be plotted, and two optional arguments rstride and cstride that determine how many grid lines to use. Larger numbers correspond to less frequent grid lines.

Plot_3d(x_limits=(0,1), y_limits=(0,1), f=joint, cstride=4, rstride=4)
<matplotlib.figure.Figure at 0x1a111f5080>

The surface has level 0 in the lower right hand triangle because it is not possible for the point (X,Y)(X, Y) to be in that region.

The possible values of (X,Y)(X, Y) are the blue region shown below. For calculations by hand, we will frequently draw just the possible values and not the surface.

<matplotlib.figure.Figure at 0x1a144854e0>
🎥 See More
Loading...

The Total Volume Under the Surface

The function ff looks like a bit of a mess but it is easy to see that it is non-negative. It’s a good idea to check that the total probability under the surface is equal to 1.

To set up the double integral over the entire region of possible values, notice that for each fixed value of yy, the value of xx goes from 0 to yy. The values of yy go from 0 to 1.

We will first fix yy and integrate with respect to xx. Then we will integrate with respect to yy. Thus the total integral is

  010y120x(yx)(1y)dxdy= 12001(1y)(0y(xyx2)dx)dy= 2001(1y)y3dy = 20(1415) = 1\begin{align*} & ~~ \int_0^1 \int_0^y 120x(y-x)(1-y)dxdy \\ &= ~ 120 \int_0^1 (1-y) \Big(\int_0^y (xy - x^2)dx\Big)dy \\ &= ~ 20 \int_0^1 (1-y)y^3 dy ~ = ~ 20 \big( \frac{1}{4} - \frac{1}{5}\big) ~ = ~ 1 \end{align*}

By SymPy

To use SymPy we must create two symbolic variables. As our variables are positive, we can specify that as well. Then we can assign the expression that defines the function to the name f. This specification doesn’t say that x<yx < y but we will enforce that condition when we integrate.

x = Symbol('x', positive=True)
y = Symbol('y', positive=True)

f = 120*x*(y-x)*(1-y)
f
Loading...

Displaying the double integral requires a call to Integral that specifies the inner integral first and then the outer. The call says:

  • The function being integrated is ff.

  • The inner integral is over the variable xx which goes from 0 to y.

  • The outer integral is over the variable yy which goes from 0 to 1.

Integral(f, (x, 0, y), (y, 0, 1))
Loading...

To evaluate the integral, use doit():

Integral(f, (x, 0, y), (y, 0, 1)).doit()
Loading...
🎥 See More
Loading...

Probabilities as Volumes

Probabilities are volumes under the joint density surface. In other words, they are double integrals of the function ff. For each probability, we have to first identify the region of integration, which we will do by geometry and by inspecting the event. Once we have set up the integral, we have to calculate its value, which we will do by calculus or SymPy.

Example 1

Suppose you want to find P(Y>4X)P(Y > 4X). The event is the blue region in the graph below.

<matplotlib.figure.Figure at 0x1a1454aef0>

To find the region of integration, notice that for fixed yy, the value of xx goes from 0 to y/4y/4. So

P(Y>4X) = 010y/4120x(yx)(1y)dxdy= 12001(1y)(0y/4(xyx2)dx)dy= 120(1321192)01(1y)y3dy= 120(1321192)120 = 0.15625\begin{align*} P(Y > 4X) ~ &= ~ \int_0^1 \int_0^{y/4} 120x(y-x)(1-y)dxdy \\ &= ~ 120 \int_0^1 (1-y) \Big( \int_0^{y/4} (xy - x^2)dx \Big)dy \\ &= ~ 120\Big( \frac{1}{32} - \frac{1}{192} \Big) \int_0^1 (1-y) y^3dy \\ &= ~ 120\Big( \frac{1}{32} - \frac{1}{192}\Big) \cdot \frac{1}{20} ~ = ~ 0.15625 \end{align*}

By SymPy

Integral(f, (x, 0, y/4), (y, 0, 1))
Loading...
Integral(f, (x, 0, y/4), (y, 0, 1)).doit()
Loading...
5/32
Loading...

Example 2

Suppose you want to find P(X>0.25,Y>0.5)P(X > 0.25, Y > 0.5). The event is the colored region below.

<matplotlib.figure.Figure at 0x1a1444d7f0>

Now P(X>0.25,Y>0.5)P(X > 0.25, Y > 0.5) is the integral of the joint density function over this region. Notice that for each fixed value of y>0.5y > 0.5, the value of xx in this event goes from 0.25 to yy. So let’s integrate xx first and then yy, as we did in the previous calculations.

P(X>0.25,Y>0.5) = 0.510.25y120x(yx)(1y)dxdyP(X > 0.25, Y > 0.5) ~ = ~ \int_{0.5}^1 \int_{0.25}^y 120x(y-x)(1-y)dxdy

You can crank that out by hand or use SymPy.

Integral(f, (x, 0.25, y), (y, 0.5, 1))
Loading...
Integral(f, (x, 0.25, y), (y, 0.5, 1)).doit()
Loading...
🎥 See More
Loading...

Expectation

Let gg be a function on the plane. Then

E(g(X,Y)) = yxg(x,y)f(x,y)dxdyE(g(X, Y)) ~ = ~ \int_y \int_x g(x, y)f(x, y)dxdy

provided the integral exists, in which case it can be carried out in either order: xx first and then yy, or the other way around.

This is an application of the non-linear function rule for expectation, applied to two random variables with a joint density.

As an example, let’s find E(YX)E(\frac{Y}{X}) for XX and YY with the joint density ff used in the examples above.

Here g(x,y)=yxg(x, y) = \frac{y}{x}, and

E(YX) = yxg(x,y)f(x,y)dxdy= 010yyx120x(yx)(1y)dxdy= 12001y(1y)(0y(yx)dx)dy= 6001y3(1y)dy = 60120 = 3\begin{align*} E\big(\frac{Y}{X}\big) ~ &= ~ \int_y \int_x g(x, y)f(x, y)dxdy \\ \\ &= ~ \int_0^1 \int_0^y \frac{y}{x} 120x(y-x)(1-y)dx dy \\ \\ &= ~ 120 \int_0^1 y(1-y) \Big(\int_0^y (y-x)dx \Big)dy \\ &= ~ 60 \int_0^1 y^3(1-y) dy ~ = ~ 60 \cdot \frac{1}{20} ~ = ~ 3 \end{align*}

By SymPy

Remember that x and y have already been defined as symbolic variables that are positive.

ev_y_over_x = Integral((y/x)*f, (x, 0, y), (y, 0, 1))
ev_y_over_x
Loading...
ev_y_over_x.doit()
Loading...