
This section is a workout in finding expectation and variance by conditioning. As before, if you are trying to find a probability, expectation, or variance, and you think, “If only I knew the value of this other random variable, I’d have the answer,” then that’s a sign that you should consider conditioning on that other random variable.


22.4.1 Mixture of Two Distributions

Let $X$ have mean $\mu_X$ and SD $\sigma_X$. Let $Y$ have mean $\mu_Y$ and SD $\sigma_Y$. Now let $p$ be a number between 0 and 1, and define the random variable $M$ as follows.

$$
M = \begin{cases} X ~~ \text{with probability } p \\ Y ~~ \text{with probability } q = 1 - p \end{cases}
$$

The distribution of $M$ is called a mixture of the distributions of $X$ and $Y$.

One way to express the definition of $M$ compactly is to let $I_H$ be the indicator of heads in one toss of a $p$-coin; then

$$
M = XI_H + Y(1 - I_H)
$$

To find the expectation of $M$ we can use the expression above, but here we will condition on $I_H$ because we can continue with that method to find $Var(M)$.

The distribution table of the random variable $E(M \mid I_H)$ is

| Value | $\mu_X$ | $\mu_Y$ |
|---|---|---|
| Probability | $p$ | $q$ |

The distribution table of the random variable $Var(M \mid I_H)$ is

| Value | $\sigma_X^2$ | $\sigma_Y^2$ |
|---|---|---|
| Probability | $p$ | $q$ |

So

$$
E(M) ~ = ~ E(E(M \mid I_H)) ~ = ~ \mu_X p + \mu_Y q
$$

and

$$
\begin{align*}
Var(M) ~ &= ~ E(Var(M \mid I_H)) + Var(E(M \mid I_H)) \\
&= ~ \big( \sigma_X^2 p + \sigma_Y^2 q \big) + \big( \mu_X^2 p + \mu_Y^2 q - (E(M))^2 \big)
\end{align*}
$$

This is true no matter what the distributions of $X$ and $Y$ are.
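
As a quick numerical check, here is a minimal simulation sketch using NumPy. The specific choices below (a normal $X$, an exponential $Y$, and $p = 0.3$) are arbitrary assumptions made just for illustration; any $X$ and $Y$ with the stated means and SDs would do.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
p, q = 0.3, 0.7

# Arbitrary illustrative choices for X and Y
mu_X, sigma_X = 2.0, 1.5          # X normal with mean 2 and SD 1.5
mu_Y, sigma_Y = 5.0, 5.0          # Y exponential with mean 5 (so SD is also 5)

X = rng.normal(mu_X, sigma_X, size=n)
Y = rng.exponential(mu_Y, size=n)
I_H = rng.random(n) < p           # indicator of heads in one toss of a p-coin

M = np.where(I_H, X, Y)           # M = X*I_H + Y*(1 - I_H)

E_M = mu_X*p + mu_Y*q
Var_M = (sigma_X**2*p + sigma_Y**2*q) + (mu_X**2*p + mu_Y**2*q - E_M**2)

print(M.mean(), E_M)              # empirical vs. theoretical mean
print(M.var(), Var_M)             # empirical vs. theoretical variance
```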

22.4.1.1 Variance of the Geometric Distribution

We have managed to come quite far into the course without deriving the variance of the geometric distribution. Let’s find it now by using the results about mixtures derived above.

Toss a coin that lands heads with probability $p$ and stop when you see a head. The number of tosses $X$ has the geometric $(p)$ distribution on $\{ 1, 2, \ldots \}$. Let $E(X) = \mu$ and $Var(X) = \sigma^2$. We will use conditioning to confirm that $E(X) = 1/p$ and also to find $Var(X)$.

Now

$$
X = \begin{cases} 1 ~~~ \text{with probability } p \\ 1 + X^* ~~~ \text{with probability } q = 1-p \end{cases}
$$

where $X^*$ is an independent copy of $X$. By the previous example,

$$
\mu ~ = ~ E(X) ~ = ~ 1p + (1+\mu)q
$$

So $\mu = 1/p$ as we have known for some time.

By the variance formula of the previous example,

$$
\sigma^2 = Var(X) = \big( 0^2 p + \sigma^2 q \big) + \Big(1^2 p + \big(1+\frac{1}{p}\big)^2 q - \frac{1}{p^2}\Big)
$$

Collecting the terms involving $\sigma^2$ on the left side, we get

$$
\sigma^2 p ~ = ~ \frac{p^3 + (p+1)^2 q - 1}{p^2} ~ = ~ \frac{p^3 + (1+p)(1-p^2) - 1}{p^2} ~ = ~ \frac{p(1-p)}{p^2}
$$

and so $Var(X) = \sigma^2 = q/p^2$.
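
For a sanity check of this result, here is a short simulation sketch (NumPy, with $p = 0.25$ chosen arbitrarily). NumPy's `geometric` counts the number of trials up to and including the first success, which matches the geometric $(p)$ distribution on $\{1, 2, \ldots\}$ used here.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.25
q = 1 - p

# Number of tosses up to and including the first head, repeated many times
X = rng.geometric(p, size=1_000_000)

print(X.mean(), 1/p)          # E(X) = 1/p
print(X.var(), q/p**2)        # Var(X) = q/p^2
```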

22.4.2 Normal with a Normal Mean

Let $M$ be normal $(\mu, \sigma_M^2)$, and given $M = m$, let $X$ be normal $(m, \sigma_X^2)$.

Then

$$
E(X \mid M) ~ = ~ M, ~~~~~~ Var(X \mid M) ~ = ~ \sigma_X^2
$$

Notice that the conditional variance is a constant: it is the same no matter what the value of $M$ turns out to be.

So $E(X) = E(E(X \mid M)) = E(M) = \mu$ and

$$
Var(X) ~ = ~ E(Var(X \mid M)) + Var(E(X \mid M)) ~ = ~ \sigma_X^2 + Var(M) ~ = ~ \sigma_X^2 + \sigma_M^2
$$
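
The two-stage description translates directly into a simulation. Below is a minimal sketch (NumPy, with $\mu$, $\sigma_M$, and $\sigma_X$ picked arbitrarily): first draw $M$, then draw $X$ centered at the drawn value of $M$. The sample variance of $X$ should be close to $\sigma_X^2 + \sigma_M^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
mu, sigma_M, sigma_X = 10.0, 2.0, 3.0     # arbitrary parameter choices

M = rng.normal(mu, sigma_M, size=n)       # M is normal (mu, sigma_M^2)
X = rng.normal(M, sigma_X)                # given M = m, X is normal (m, sigma_X^2)

print(X.mean(), mu)                       # E(X) = mu
print(X.var(), sigma_X**2 + sigma_M**2)   # Var(X) = sigma_X^2 + sigma_M^2
```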

22.4.3 Random Sum

Let $N$ be a random variable with values $0, 1, 2, \ldots$, mean $\mu_N$, and SD $\sigma_N$. Let $X_1, X_2, \ldots$ be i.i.d. with mean $\mu_X$ and SD $\sigma_X$, independent of $N$.

Define the random sum $S_N$ as

$$
S_N = \begin{cases} 0 ~~ \text{if } N = 0 \\ X_1 + X_2 + \cdots + X_n ~~ \text{if } N = n > 0 \end{cases}
$$

Then as we have seen before, $E(S_N \mid N = n) = n\mu_X$ for all $n$ (including $n = 0$). So

$$
E(S_N \mid N) ~ = ~ N\mu_X
$$

and hence

$$
E(S_N) ~ = ~ E(N\mu_X) ~ = ~ \mu_X E(N) ~ = ~ \mu_N\mu_X
$$

This is consistent with intuition: you expect to be adding $\mu_N$ i.i.d. random variables, each with mean $\mu_X$. For the variance, intuition needs some guidance, which is provided by our variance decomposition formula.

First note that because we are adding i.i.d. random variables, $Var(S_N \mid N = n) = n\sigma_X^2$ for all $n$ (including $n = 0$). That is,

$$
Var(S_N \mid N) ~ = ~ N\sigma_X^2
$$

By the variance decomposition formula,

$$
Var(S_N) ~ = ~ E(N\sigma_X^2) + Var(N\mu_X) ~ = ~ \mu_N\sigma_X^2 + \mu_X^2\sigma_N^2
$$
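
To see the decomposition numerically, here is a minimal simulation sketch (NumPy). Taking $N$ to be Poisson and the $X_i$ to be exponential is an arbitrary assumption for illustration; the formula holds for any $N$ and i.i.d. $X_i$ satisfying the conditions above.

```python
import numpy as np

rng = np.random.default_rng(0)
reps = 200_000

lam = 4.0                  # N Poisson(4): mu_N = 4, sigma_N^2 = 4
mu_X = 2.0                 # X_i exponential with mean 2, so sigma_X = 2
sigma_X = mu_X
mu_N, var_N = lam, lam

N = rng.poisson(lam, size=reps)
# S_N = X_1 + ... + X_N, with S_N = 0 when N = 0
S = np.array([rng.exponential(mu_X, size=n).sum() for n in N])

print(S.mean(), mu_N * mu_X)                          # E(S_N) = mu_N * mu_X
print(S.var(), mu_N * sigma_X**2 + mu_X**2 * var_N)   # Var(S_N)
```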