
Conditional distributions help us formalize our intuitive ideas about whether two random variables are independent of each other. Let $X$ and $Y$ be two random variables, and suppose we are given the value of $X$. Does that change our opinion about $Y$? If the answer is yes, then we will say that $X$ and $Y$ are dependent. If the answer is no regardless of the given value of $X$, then we will say that $X$ and $Y$ are independent.

Let’s start with some examples and then move to precise definitions and results.

4.5.1 Dependence

Here is the joint distribution of two random variables $X$ and $Y$. From this, what can we say about whether $X$ and $Y$ are dependent or independent?

dist1

You can see at once that if $X = 3$ then $Y$ can only be 0, whereas if $X = 2$ then $Y$ can be either 0 or 1. Knowing the value of $X$ changes the distribution of $Y$. That’s dependence.

Here is an example in which you can’t quickly determine dependence or independence by just looking at the possible values.

dist2

But you can tell by looking at the conditional distributions of $X$ given $Y$. Two of them are the same, but the third is different. Knowing the value of $Y$ affects the chances for $X$.

dist2.conditional_dist('X', 'Y')

It follows (and you should try to prove this) that at least some of the conditional distributions of $Y$ given the different values of $X$ will also be different from each other and from the marginal of $Y$.

In this example, all three conditional distributions of $Y$ given the three different values of $X$ are different from each other.

dist2.conditional_dist('Y', 'X')
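If you want to see the arithmetic behind calls like `conditional_dist`, here is a minimal plain-Python sketch (not the prob140 library itself, and with made-up numbers rather than the actual entries of `dist2`): each conditional distribution is just a row or column of the joint table divided by its total.

```python
import numpy as np

# Hypothetical joint distribution P(X = x, Y = y): rows are values of Y, columns are values of X.
x_vals = [1, 2, 3]
y_vals = [0, 1]
joint = np.array([
    [0.10, 0.20, 0.10],   # P(X = x, Y = 0) for x = 1, 2, 3
    [0.15, 0.25, 0.20],   # P(X = x, Y = 1) for x = 1, 2, 3
])
assert np.isclose(joint.sum(), 1.0)

# Conditional distribution of X given Y = y: divide each row by its total, P(Y = y).
cond_X_given_Y = joint / joint.sum(axis=1, keepdims=True)

# Conditional distribution of Y given X = x: divide each column by its total, P(X = x).
cond_Y_given_X = joint / joint.sum(axis=0, keepdims=True)

for y, row in zip(y_vals, cond_X_given_Y):
    print(f"Dist. of X given Y = {y}:", np.round(row, 3))
for x, col in zip(x_vals, cond_Y_given_X.T):
    print(f"Dist. of Y given X = {x}:", np.round(col, 3))
```

Rows (or columns) that differ from one another signal dependence, just as in the tables above.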

4.5.2 Independence

Here is a joint distribution table in which you can’t immediately tell whether there is dependence.

dist3

But look what happens when you condition $X$ on $Y$.

dist3.conditional_dist('X', 'Y')

All the rows are the same. That is, all the conditional distributions of $X$ given different values of $Y$ are the same, and hence are the same as the marginal of $X$ too.

Given the value of $Y$, the probabilities for $X$ don’t change at all. That’s independence.

You could have drawn the same conclusion by conditioning $Y$ on $X$:

dist3.conditional_dist('Y', 'X')
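To see why all the rows came out equal, here is a hedged sketch with hypothetical numbers (not the actual `dist3` table): when a joint table is built as the product of two marginals, every conditional distribution of $X$ given $Y$ collapses to the marginal of $X$.

```python
import numpy as np

# A joint table built as an outer product of two marginals, so X and Y are independent by construction.
p_X = np.array([0.2, 0.5, 0.3])           # hypothetical marginal of X
p_Y = np.array([0.25, 0.75])              # hypothetical marginal of Y
joint = np.outer(p_Y, p_X)                # rows indexed by values of Y, columns by values of X

# Conditional distribution of X given each value of Y: divide each row by its total.
cond_X_given_Y = joint / joint.sum(axis=1, keepdims=True)

# Every conditional row equals the marginal of X; that's independence.
print(cond_X_given_Y)
print(np.allclose(cond_X_given_Y, p_X))   # True
```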

4.5.3 Independence of Two Events

The concept of independence seems intuitive, but it is possible to run into trouble by not being careful about its definition. So let’s define it formally.

There are two equivalent definitions of the independence of two events. The first encapsulates the main idea of independence, and the second is useful for calculation.

Two events $A$ and $B$ are independent if $P(B \mid A) = P(B)$. Equivalently, $A$ and $B$ are independent if $P(AB) = P(A)P(B)$.
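Here is a quick numerical check of the definition on a small example of our own (it does not come from the tables above): roll a fair die once, let $A$ be the event that the result is even, and let $B$ be the event that the result is at most 4.

```python
from fractions import Fraction

# One roll of a fair die: six equally likely outcomes.
A = {2, 4, 6}        # event: the result is even
B = {1, 2, 3, 4}     # event: the result is at most 4

def prob(event):
    return Fraction(len(event), 6)

p_A, p_B = prob(A), prob(B)
p_AB = prob(A & B)                 # both events happen: the result is in {2, 4}

print(p_AB == p_A * p_B)           # True: A and B are independent
print(p_AB / p_A == p_B)           # equivalently, P(B | A) = P(B)
```

Both printed values are True: knowing that the roll was even doesn’t change the chance that it was at most 4.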

4.5.4 Independence of Two Random Variables

What we have observed in the examples of this section can be turned into a formal definition of independence.

Two random variables $X$ and $Y$ are independent if for every value $x$ of $X$ and $y$ of $Y$,

$$
P(Y = y \mid X = x) = P(Y = y)
$$

That is, no matter what the given $x$ is, the conditional distribution of $Y$ given $X=x$ is the same as if we didn’t know that $X=x$.

Equivalently (this needs a proof, which consists of a routine application of definitions), for every $y$ the conditional distribution of $X$ given $Y=y$ is the same as if we didn’t know that $Y=y$.

An equivalent definition in terms of the independence of events is that for any values of $x$ and $y$, the events $\{X=x\}$ and $\{Y=y\}$ are independent.

That is, $X$ and $Y$ are independent if for any values $x$ of $X$ and $y$ of $Y$,

$$
P(X = x, Y = y) ~ = ~ P(X=x)P(Y=y)
$$

Independence simplifies the conditional probabilities in the multiplication rule.
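If you have a joint distribution table in hand, this factorization criterion is easy to test numerically. The sketch below uses plain Python with hypothetical tables, not the prob140 calls shown earlier.

```python
import numpy as np

def are_independent(joint, tol=1e-12):
    """Check whether every cell of a joint table is the product of its row and column marginals."""
    p_Y = joint.sum(axis=1, keepdims=True)    # marginal of Y (row totals)
    p_X = joint.sum(axis=0, keepdims=True)    # marginal of X (column totals)
    return np.allclose(joint, p_Y * p_X, atol=tol)

# Hypothetical tables: the first factorizes into its marginals, the second does not.
independent_table = np.outer([0.4, 0.6], [0.1, 0.3, 0.6])
dependent_table = np.array([[0.3, 0.1],
                            [0.1, 0.5]])

print(are_independent(independent_table))    # True
print(are_independent(dependent_table))      # False
```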

It is a fact that if $X$ and $Y$ are independent random variables, then any event determined by $X$ is independent of any event determined by $Y$. For example, if $X$ and $Y$ are independent and $x$ is a number, then $\{X=x\}$ is independent of $\{Y>x\}$. Also, any function of $X$ is independent of any function of $Y$.

You can prove these facts by partitioning and then using the definition of independence. The proofs are routine but somewhat labor intensive. You are welcome to just accept the facts if you don’t want to prove them.

4.5.5 Mutual Independence

Events $A_1, A_2, \ldots , A_n$ are mutually independent (or independent for short) if given that any subset of the events has occurred, the conditional chances of all other subsets remain unchanged.

That’s quite a mouthful. In practical terms it means that it doesn’t matter which of the events you know have happened; chances involving the remaining events are unchanged.

In terms of random variables, $X_1, X_2, \ldots , X_n$ are independent if given the values of any subset, chances of events determined by the remaining variables are unchanged.

In practice, this just formalizes statements such as “results of different tosses of a coin are independent” or “draws made at random with replacement are independent”.
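As a small illustration of our own, here is how mutual independence plays out for three tosses of a fair coin: chances of any complete specification multiply, and conditioning on one toss leaves the chances for the others unchanged.

```python
from fractions import Fraction
from itertools import product

# Three tosses of a fair coin: eight equally likely outcomes.
space = list(product('HT', repeat=3))

def prob(event):
    return Fraction(sum(event(w) for w in space), len(space))

# The chance of any complete specification of the three tosses is the product of the individual chances.
print(prob(lambda w: w == ('H', 'H', 'T')) == Fraction(1, 2) ** 3)       # True

# Conditioning on the first toss leaves the chances for the second toss unchanged.
p_first_H = prob(lambda w: w[0] == 'H')
p_second_H = prob(lambda w: w[1] == 'H')
p_both_H = prob(lambda w: w[0] == 'H' and w[1] == 'H')
print(p_both_H / p_first_H == p_second_H)                                # True
```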

Try not to become inhibited by the formalism. Notice how the theory not only supports intuition but also develops it. You can expect your probabilistic intuition to be much sharper at the end of this course than it is now!

4.5.6 IID Random Variables

If random variables are mutually independent and identically distributed, they are called “i.i.d.” That’s one of the most famous acronyms in probability theory. You can think of i.i.d. random variables as draws with replacement from a population, or as the results of independent replications of the same experiment.

Calculations involving i.i.d. random variables are often straightforward. For example, suppose the distribution of $X$ is given by

$$
P(X = i) = p_i, ~~~ i = 1, 2, \ldots, n
$$

where $\sum_{i=1}^n p_i = 1$. Now let $X$ and $Y$ be i.i.d. What is $P(X = Y)$? We’ll answer this question by using the fundamental method, now in random variable notation.

$$
\begin{align*}
P(X = Y) ~ &= ~ \sum_{i=1}^n P(X = i, Y = i) ~~~ \text{(partitioning)} \\
&= ~ \sum_{i=1}^n P(X = i)P(Y = i) ~~~ \text{(independence)} \\
&= ~ \sum_{i=1}^n p_i \cdot p_i ~~~ \text{(identical distributions)} \\
&= ~ \sum_{i=1}^n p_i^2
\end{align*}
$$

The last expression is easy to calculate if you know the numerical values of all the $p_i$.
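For instance, with a hypothetical distribution over $n = 4$ values, the formula agrees with a brute-force sum over the whole joint distribution:

```python
from fractions import Fraction
from itertools import product

# Hypothetical distribution shared by X and Y (identically distributed).
p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
assert sum(p) == 1

# P(X = Y) by the formula derived above.
by_formula = sum(p_i ** 2 for p_i in p)

# The same probability by summing P(X = i, Y = j) = p_i * p_j over all cells with i = j.
by_brute_force = sum(p[i] * p[j] for i, j in product(range(len(p)), repeat=2) if i == j)

print(by_formula, by_formula == by_brute_force)   # 11/32 True
```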