Suppose $H$ is the number of heads in 100 tosses of a coin, and $T$ the number of tails. Then $H$ and $T$ are far from independent. They are linear functions of each other because $T = 100 - H$.
The same is true of any fixed number of tosses: if you know the number of heads, then you also know the number of tails.
In any fixed number of Bernoulli trials, the number of successes and the number of failures are as dependent as it gets. If you know one, you know the other.
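As a quick numerical illustration of this dependence, the sketch below simulates repeated sets of 100 tosses and computes the correlation between the number of heads and the number of tails. The fair coin, the seed, and the number of repetitions are assumptions made only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily, for reproducibility
n_tosses = 100                   # fixed number of tosses, as in the text
reps = 10_000                    # number of repetitions (illustrative choice)

# Number of heads in each repetition, assuming a fair coin
heads = rng.binomial(n_tosses, 0.5, size=reps)
# With a fixed number of tosses, the number of tails is determined by the heads
tails = n_tosses - heads

# Correlation is -1 (up to floating-point rounding): a perfect linear relation
corr_fixed = np.corrcoef(heads, tails)[0, 1]
print(corr_fixed)
```

A correlation of $-1$ is the numerical signature of one count being a decreasing linear function of the other.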
However, something remarkable happens when the number of trials is itself random and has a Poisson distribution. After we see what happens, we will be able to understand why it matters.
7.2.1 Randomizing the Number of Bernoulli Trials
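Let $N$ have the Poisson $(\mu)$ distribution, and let $S$ be the number of successes in $N$ i.i.d. Bernoulli $(p)$ trials. More formally:
Given $N = 0$, define $S$ to be 0 with probability 1. Given that there are no trials, there are also no successes.
For $n \ge 1$, let the conditional distribution of $S$ given $N = n$ be binomial $(n, p)$.
Then the joint distribution of $N$ and $S$ is given by:
$$
P(N = n, S = s) ~ = ~ e^{-\mu} \frac{\mu^n}{n!} \cdot \frac{n!}{s!(n-s)!} p^s (1-p)^{n-s}, ~~~~ 0 \le s \le n
$$
You should check that the formula is correct when $n = 0$.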
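One way to carry out this check, and to confirm that the joint probabilities add up to 1, is a short numerical sketch. It assumes the joint probability is the Poisson chance of `n` trials multiplied by the binomial chance of `s` successes in those trials. The function name `joint_pmf`, the values `mu = 12` and `p = 1/3`, and the truncation at 100 trials are choices made for illustration.

```python
import math

def joint_pmf(n, s, mu, p):
    """P(N = n, S = s) when N is Poisson (mu) and S given N = n is binomial (n, p)."""
    if s < 0 or s > n:
        return 0.0
    poisson_part = math.exp(-mu) * mu**n / math.factorial(n)
    binomial_part = math.comb(n, s) * p**s * (1 - p)**(n - s)
    return poisson_part * binomial_part

mu, p = 12, 1/3   # illustrative parameter values

# With no trials there are no successes, so this should equal P(N = 0) = e^(-mu)
p_zero = joint_pmf(0, 0, mu, p)

# Sum over all pairs (n, s) with 0 <= s <= n; the tail beyond n = 99 is negligible here
total = sum(joint_pmf(n, s, mu, p) for n in range(100) for s in range(n + 1))

print(p_zero, math.exp(-mu))
print(total)
```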
We can sum the terms in this joint distribution appropriately to get the marginal distribution of $S$.
7.2.2 A Poisson Number of Successes
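The possible values of $S$ are $0, 1, 2, \ldots$ with no upper limit because there is no upper limit on the possible values of $N$. For $s \ge 0$,
$$
\begin{align*}
P(S = s) &= \sum_{n=s}^\infty P(N = n, S = s) \\
&= \sum_{n=s}^\infty e^{-\mu} \frac{\mu^n}{n!} \cdot \frac{n!}{s!(n-s)!} p^s (1-p)^{n-s} \\
&= e^{-\mu} \frac{(\mu p)^s}{s!} \sum_{n=s}^\infty \frac{(\mu(1-p))^{n-s}}{(n-s)!} \\
&= e^{-\mu} \frac{(\mu p)^s}{s!} e^{\mu(1-p)} \\
&= e^{-\mu p} \frac{(\mu p)^s}{s!}
\end{align*}
$$
Thus the distribution of $S$ is Poisson with parameter $\mu p$.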
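The result can also be checked numerically. The sketch below sums the product of the Poisson chance of `n` trials and the binomial chance of `s` successes over `n`, and compares the total with the Poisson probability whose parameter is the Poisson mean times the success probability. The helper names, the values `mu = 12` and `p = 1/3`, and the truncation point `n_max` are assumptions made for illustration.

```python
import math

def poisson_pmf(k, rate):
    """Poisson probability of the value k when the parameter is rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)

def binomial_pmf(s, n, p):
    """Binomial (n, p) probability of s successes."""
    return math.comb(n, s) * p**s * (1 - p)**(n - s)

mu, p = 12, 1/3   # illustrative parameter values
n_max = 150       # truncation of the infinite sum; the remaining tail is negligible here

max_gap = 0.0
for s in range(20):
    # Marginal chance of s successes: sum the joint probabilities over n >= s
    marginal = sum(poisson_pmf(n, mu) * binomial_pmf(s, n, p) for n in range(s, n_max))
    # Largest discrepancy from the Poisson probability with parameter mu * p
    max_gap = max(max_gap, abs(marginal - poisson_pmf(s, mu * p)))

print(max_gap)   # tiny: only floating-point and truncation error
```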
Notice what we have just proved.
If the number of trials is fixed at $n$, you know that the distribution of the number of successes is binomial $(n, p)$.
But if the number of trials is random with a Poisson $(\mu)$ distribution, then the distribution of the number of successes is Poisson $(\mu p)$.
This is a major step in Poissonizing the binomial.
The best is yet to come, but let’s take a moment to look at the result numerically. Suppose you run a Poisson $(12)$ number of i.i.d. Bernoulli $(1/3)$ trials. Then the number of trials is most likely to be somewhere around 12, but you can’t say exactly what it will be because it’s random. What we have shown is that the number of successes is Poisson with parameter $12 \times \frac{1}{3} = 4$.
The parameter 4 is not hard to understand intuitively. You’re most likely to see around 12 trials, and about 1/3 of them are going to be successes, so you’re most likely to see around 4 successes.
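A simulation makes the same point. The sketch below assumes a Poisson number of trials with mean 12 and success probability 1/3, and checks that the simulated number of successes has mean and variance both near 4, as a Poisson count with parameter 4 should. The seed and the number of repetitions are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily, for reproducibility
reps = 100_000                   # number of simulated experiments (illustrative)

# Random number of trials in each experiment: Poisson with mean 12
num_trials = rng.poisson(12, size=reps)
# Number of successes among those trials, each succeeding with chance 1/3
num_successes = rng.binomial(num_trials, 1/3)

mean_s = num_successes.mean()
var_s = num_successes.var()
print(mean_s, var_s)   # both should be close to 4
```

Equal mean and variance is a characteristic of the Poisson distribution; a binomial count with mean 4 would have variance strictly less than 4.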
7.2.3 Successes and Failures are Independent
Yes, you read that right. If you run a Poisson number of i.i.d. Bernoulli trials, then the number of successes and the number of failures are independent.
Randomizing parameters (in this case the number of trials) can have a dramatic effect on the relations between random variables.
Let’s prove our result, and then we will see a way in which it is used.
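Suppose as before that we are running $N$ i.i.d. Bernoulli $(p)$ trials, where $N$ has the Poisson $(\mu)$ distribution independent of the results of the trials. Also as before, let $S$ be the number of successes.
Now let $F$ be the number of failures. Then the distribution of $F$ is Poisson $(\mu q)$ where $q = 1 - p$. This follows by redefining “success” as “failure” in our previous argument.
The joint distribution of $S$ and $F$ is, for $s \ge 0$ and $f \ge 0$,
$$
\begin{align*}
P(S = s, F = f) &= P(N = s + f, S = s) \\
&= e^{-\mu} \frac{\mu^{s+f}}{(s+f)!} \cdot \frac{(s+f)!}{s! f!} p^s q^f \\
&= e^{-\mu p} \frac{(\mu p)^s}{s!} \cdot e^{-\mu q} \frac{(\mu q)^f}{f!} \\
&= P(S = s) P(F = f)
\end{align*}
$$
The third line uses $\mu = \mu p + \mu q$. This shows that $S$ and $F$ are independent.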
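The factorization can be confirmed numerically. The sketch below computes the chance of `s` successes and `f` failures as the Poisson chance of `s + f` trials times the binomial chance of `s` successes among them, and compares it with the product of two Poisson probabilities. The helper name `poisson_pmf`, the parameter values, and the ranges of `s` and `f` are assumptions made for illustration.

```python
import math

def poisson_pmf(k, rate):
    """Poisson probability of the value k when the parameter is rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)

mu, p = 12, 1/3   # illustrative parameter values
q = 1 - p

worst = 0.0
for s in range(15):
    for f in range(25):
        n = s + f   # s successes and f failures means s + f trials
        joint = poisson_pmf(n, mu) * math.comb(n, s) * p**s * q**f
        product = poisson_pmf(s, mu * p) * poisson_pmf(f, mu * q)
        worst = max(worst, abs(joint - product))

print(worst)   # tiny: the joint probability equals the product of the marginals
```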
7.2.4 Summary: Poissonization of the Binomial
Suppose you run $N$ i.i.d. Bernoulli $(p)$ trials, where $N$ has the Poisson $(\mu)$ distribution independent of the results of the trials. Let $S$ be the number of successes and $F$ the number of failures, and let $q = 1 - p$. Then:
$S$ has the Poisson $(\mu p)$ distribution
$F$ has the Poisson $(\mu q)$ distribution
$S$ and $F$ are independent
For example, suppose 90% of the individuals in a population are of Class A and 10% are of Class B. Suppose you draw $N$ times at random with replacement from the population, where $N$ has the Poisson $(20)$ distribution independent of the results of your draws. Then in your sample,
the number of people of Class A has the Poisson $(18)$ distribution,
the number in Class B has the Poisson $(2)$ distribution,
and the counts in the two classes are independent.
Thus for example the chance that each class appears at least five times in your sample is
$$
\Big( 1 - \sum_{k=0}^{4} e^{-18} \frac{18^k}{k!} \Big) \Big( 1 - \sum_{k=0}^{4} e^{-2} \frac{2^k}{k!} \Big)
$$
This is just over 5%.
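from scipy import stats

# Chance that a Poisson (18) count and an independent Poisson (2) count are both at least 5
(1 - stats.poisson.cdf(4, 18))*(1 - stats.poisson.cdf(4, 2))
0.052648585218160585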
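A simulation offers an independent check of this answer. The sketch below assumes the sample size is Poisson with mean 20, which is what the parameters 18 and 2 correspond to when 90% of the population is in Class A and 10% is in Class B. The seed and the number of repetitions are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily, for reproducibility
reps = 200_000                   # number of simulated samples (illustrative)

# Random sample size for each simulated sample: Poisson with mean 20
sample_sizes = rng.poisson(20, size=reps)
# Each draw is Class A with chance 0.9; the remaining draws are Class B
class_a = rng.binomial(sample_sizes, 0.9)
class_b = sample_sizes - class_a

# Proportion of samples in which both classes appear at least five times
both_at_least_five = np.mean((class_a >= 5) & (class_b >= 5))
# Correlation between the two counts; independence implies it is near 0
corr_ab = np.corrcoef(class_a, class_b)[0, 1]

print(both_at_least_five, corr_ab)
```

The simulated proportion should land near the exact value of about 0.0526, and the correlation should be near 0, in contrast with the correlation of $-1$ that a fixed sample size would produce.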