Conditioning helps us find expectations of waiting times. All the examples below involve conditioning on early moves of a random process.

9.3.1 Waiting till H

A coin lands heads with chance $p$. Let’s call it a $p$-coin for short. Let $W_H$ be the number of tosses of a $p$-coin till the first head appears. The use of $W$ in the notation is because the random variable is often called the waiting time till the first head.

We know that $W_H$ has the geometric $(p)$ distribution on $1, 2, 3, \ldots$. We derived its expectation earlier by using the Tail Sum Formula. Here is a quick way to derive $E(W_H)$ without using the formula for the probabilities.

The method is based on representing $W_H$ in terms of a mixture of random variables. A mixture is a description of the random variable by conditioning.

  • With probability 1, at least one toss has to be made. So $W_H = 1 + R$ where $R$ is the random number of tosses required after the first one.

  • With probability $p$ the first toss is a head, so $R = 0$.

  • With the remaining probability $q = 1 - p$ the first toss is a tail, and then the process starts over independently of what has happened before. That is, with probability $q$, $R = W^*$ where $W^*$ is an identically distributed copy of $W_H$.

Let $x = E(W_H)$. By additivity and averaging conditional expectations,

$$x ~ = ~ 1 + E(R) ~ = ~ 1 + pE(0) + qE(W^*) ~ = ~ 1 + qx$$

Solve for $x$:

$$x = \frac{1}{p}$$

This calculation confirms that in i.i.d. Bernoulli $(p)$ trials, the expected waiting time till the first success is $1/p$.
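The fixed-point argument can be checked empirically. Here is a minimal simulation sketch; the helper name and the choice $p = 0.25$ are just for illustration:

```python
import random

def waiting_till_head(p, rng):
    """Number of tosses of a p-coin till the first head appears."""
    tosses = 1
    while rng.random() >= p:   # tail: keep tossing
        tosses += 1
    return tosses

# Empirical mean over many repetitions should be close to 1/p.
rng = random.Random(0)
p = 0.25
n = 100_000
mean = sum(waiting_till_head(p, rng) for _ in range(n)) / n
print(mean)   # close to 1/p = 4
```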

9.3.2 Infinite Monkey Theorem

“The number of trials till the first success” provides the framework for a rich array of examples, because both “trial” and “success” can be defined to be much more complex than just tossing a coin and getting heads. A classic example is about a professor (or a monkey) drawing independently at random from the 26 letters of the alphabet to see if they ever get the sequence datascience. They will, with probability 1, as you can see by overestimating the number of draws they have to make.

  • Define a “trial” to be 11 letters picked at random.

  • Define a trial to be a “success” if those 11 letters are the sequence datascience.

Then the number of trials till datascience appears has the geometric distribution with parameter $p = 1/26^{11}$, and therefore has expectation $26^{11}$. That’s $26^{11}$ lots of 11 draws, which is an overestimate because you will be watching the draws sequentially and not in blocks of 11. For example, if the first block of 11 ends in data and the next block starts with science, you will have seen the sequence datascience and stopped watching, even though both of those blocks would be called failures and the trials would continue.
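Simulating the 26-letter, 11-character case is impractical because the expectation is astronomically large, but the sequential-watching idea can be sanity-checked on a tiny example. The sketch below (a hypothetical helper; 3-letter alphabet and target ab chosen only for illustration) watches the draws one at a time. For a target with no self-overlap, the expected number of draws is $3^2 = 9$, well below the block-based overestimate of $2 \cdot 3^2 = 18$ draws:

```python
import random

def draws_till_pattern(pattern, alphabet, rng):
    """Watch i.i.d. uniform draws one at a time; return the number
    of draws until `pattern` first appears as a contiguous run."""
    window = ""
    draws = 0
    while window != pattern:
        window = (window + rng.choice(alphabet))[-len(pattern):]
        draws += 1
    return draws

rng = random.Random(0)
n = 20_000
mean = sum(draws_till_pattern("ab", "abc", rng) for _ in range(n)) / n
print(mean)   # close to 3**2 = 9, not the block-based 18
```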

There is nothing special about the sequence datascience. You can replace it with any finite string of letters, no matter how long. For example, the string could be the complete works of Shakespeare. You will just have to replace 11 by the length of the string. This is popularly known as the Infinite Monkey Theorem.

9.3.3 Waiting Till Both Faces Have Appeared

Suppose we toss the $p$-coin until both faces have appeared. Let $N$ be the number of tosses.

Question: What is $E(N)$?

Answer: We can find $E(N)$ by conditioning on the first toss as we did in the previous example.

  • With probability 1, $N = 1 + M$ where $M$ is the additional number of tosses needed after the first one.

  • With probability $p$ the first toss is a head, so $M = W_T$ where $W_T$ has the geometric $(q)$ distribution.

  • With probability $q$ the first toss is a tail, so $M = W_H$ where $W_H$ has the geometric $(p)$ distribution.

So

$$E(N) = 1 + p\big(\frac{1}{q}\big) + q\big(\frac{1}{p}\big) = 1 + \frac{p^2 + q^2}{pq} = \frac{1 - pq}{pq}$$
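A simulation agrees with this formula. A minimal sketch, with $p = 0.3$ as an arbitrary choice:

```python
import random

def tosses_till_both_faces(p, rng):
    """Toss a p-coin till both heads and tails have appeared."""
    seen = set()
    tosses = 0
    while len(seen) < 2:
        seen.add(rng.random() < p)   # True for head, False for tail
        tosses += 1
    return tosses

rng = random.Random(0)
p, q = 0.3, 0.7
n = 100_000
mean = sum(tosses_till_both_faces(p, rng) for _ in range(n)) / n
print(mean)   # close to (1 - p*q)/(p*q) ≈ 3.76
```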

9.3.4 Waiting till HH

In tosses of a $p$-coin, let $W_{HH}$ be the number of tosses till you see two heads in a row.

Question: What is $E(W_{HH})$?

Answer 1: We can find this in several ways. One way is by conditioning on the first two tosses.

  • With probability $q$, the first toss is a tail, so $W_{HH} = 1 + W^*$ where $W^*$ is an identically distributed copy of $W_{HH}$.

  • With probability $pq$ the first two tosses are HT, and $W_{HH} = 2 + W^{**}$ where $W^{**}$ is an identically distributed copy of $W_{HH}$.

  • With probability $p^2$, the first two tosses are heads, and $W_{HH} = 2$.

So if $x = E(W_{HH})$ then

$$x = q(1+x) + pq(2+x) + p^2 \cdot 2$$

So

$$x = \frac{q + 2pq + 2p^2}{1 - q - pq} = \frac{1+p}{p^2}$$

by repeatedly using $p + q = 1$.
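As a check on the algebra, simulating $W_{HH}$ for a fair coin (an arbitrary choice) gives a mean near $(1+p)/p^2 = 6$:

```python
import random

def tosses_till_HH(p, rng):
    """Toss a p-coin till two heads appear in a row."""
    tosses = 0
    run = 0                       # current run of consecutive heads
    while run < 2:
        tosses += 1
        run = run + 1 if rng.random() < p else 0
    return tosses

rng = random.Random(0)
p = 0.5
n = 100_000
mean = sum(tosses_till_HH(p, rng) for _ in range(n)) / n
print(mean)   # close to (1 + p)/p**2 = 6
```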

Answer 2: Another way is by conditioning on the toss after $W_H$ where, as before, $W_H$ is the number of tosses till the first head. We know that $E(W_H) = 1/p$.

Now $W_{HH} = W_H + V$ where $V$ is the additional number of tosses needed after $W_H$.

  • With probability $p$, the toss after $W_H$ is a head, so $V = 1$.

  • With probability $q$, the toss after $W_H$ is a tail, so $V = 1 + W^*$ where $W^*$ is an identically distributed copy of $W_{HH}$.

So if $x = E(W_{HH})$ then

$$x ~ = ~ E(W_H) + E(V) ~ = ~ \frac{1}{p} + p + q(1 + x)$$

So

$$px = \frac{1}{p} + 1 ~~~~ \text{and hence} ~~~~ x = \frac{1+p}{p^2}$$

as before. Notice that the answer can also be written as

$$E(W_{HH}) ~ = ~ \frac{1}{p^2} + \frac{1}{p}$$

In exercises you will generalize this to get a formula for the expected waiting time till you see $n$ heads in a row.

9.3.5 Gambler’s Ruin: Duration of the Game

Let’s return to the setting of the gambler’s ruin problem with a fair coin and positive integers $a < b$. The gambler starts with $a$ dollars and bets on tosses of the coin till either his net gain reaches $b$ dollars or he loses all his money. Let $T$ be the duration of the game.

Question. What the expected duration of the game?

Answer. Let Ek(T)E_k(T) denote the expected duration of the game given that the gambler starts with a net gain of kk dollars. We want E0(T)E_0(T).

By conditioning on the first step, we see that for a+1kb1-a+1 \le k \le b-1,

Ek(T)=1+12Ek1(T)+12Ek+1(T)E_k(T) = 1 + \frac{1}{2}E_{k-1}(T) + \frac{1}{2} E_{k+1}(T)

where the edge cases are

Ea(T)=0=Ea+b(T)E_{-a}(T) = 0 = E_{a+b}(T)

You can check that the function $f(k) = (b-k)(k+a)$ satisfies this recursion, and hence that $E_0(T) = ab$.
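A simulation confirms $E_0(T) = ab$. This sketch tracks the net gain directly; the values $a = 3$, $b = 5$ are arbitrary choices:

```python
import random

def duration(a, b, rng):
    """Fair-coin gambler's ruin: number of steps till the net gain
    reaches b (win) or -a (ruin), starting from net gain 0."""
    gain = 0
    steps = 0
    while -a < gain < b:
        gain += 1 if rng.random() < 0.5 else -1
        steps += 1
    return steps

rng = random.Random(0)
a, b = 3, 5
n = 50_000
mean = sum(duration(a, b, rng) for _ in range(n)) / n
print(mean)   # close to a*b = 15
```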