Bernoulli trials come out in one of two ways. But many trials come out in multiple different ways, all of which we might want to track. A die can land six different ways. A jury member can have one of several different educational levels. In general, an individual might belong to one of several classes.

The multinomial distribution is a joint distribution that extends the binomial to the case where each repeated trial has more than two possible outcomes. Let’s look at it first in an example, and then we will define it in general.

A box contains 2 blue tickets, 5 green tickets, and 3 red tickets. Fifteen draws are made at random with replacement. To find the chance that there are 4 blue, 9 green, and 2 red tickets drawn, we could start by writing all possible sequences of 4 B’s, 9 G’s, and 2 R’s.

Each such sequence has chance $0.2^4 0.5^9 0.3^2$, so all we need to complete the probability calculation is the number of sequences we could write.

  • There are $\binom{15}{4}$ ways of choosing places to write the B’s.

  • For each of these ways, there are $\binom{11}{9}$ ways of choosing 9 of the remaining 11 places to write the G’s.

  • The remaining 2 places get filled with R’s.

So

$$
\begin{align*}
P(\text{4 blue, 9 green, 2 red}) &= \binom{15}{4} \cdot \binom{11}{9} 0.2^4 0.5^9 0.3^2 \\ \\
&= \frac{15!}{4!11!} \cdot \frac{11!}{9!2!} 0.2^4 0.5^9 0.3^2 \\ \\
&= \frac{15!}{4!9!2!} 0.2^4 0.5^9 0.3^2
\end{align*}
$$

Notice how this simply extends the binomial probability formula by including a third category.
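As a quick numerical check of the calculation above, here is a minimal Python sketch. The variable names are illustrative and not from the text. It counts the sequences in the two ways shown in the derivation and multiplies by the chance of a single sequence.

```python
from math import comb, factorial

# Box example: 15 draws with replacement;
# P(blue) = 0.2, P(green) = 0.5, P(red) = 0.3.

# Chance of any one particular sequence of 4 B's, 9 G's, and 2 R's.
seq_chance = 0.2**4 * 0.5**9 * 0.3**2

# Count the sequences in two steps: places for the B's, then for the G's.
count_two_step = comb(15, 4) * comb(11, 9)

# Count the same sequences with the simplified factorial expression.
count_factorials = factorial(15) // (factorial(4) * factorial(9) * factorial(2))

prob = count_factorials * seq_chance
print(count_two_step, count_factorials)  # both counts are 75075
print(prob)                              # about 0.0211
```

Both counts agree, which is the cancellation of $11!$ in the second line of the derivation.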

Analogously, or formally by induction, you can extend the formula to any finite number of categories or classes.

6.3.1 Multinomial Distribution

Fix a positive integer $n$. Suppose we are running $n$ i.i.d. trials where each trial can result in one of $k$ classes. For each $i = 1, 2, \ldots, k$, let the chance of getting Class $i$ on a single trial be $p_i$, so that $\sum_{i=1}^k p_i = 1$.

For each $i = 1, 2, \ldots , k$, let $N_i$ be the number of trials that result in Class $i$, so that $\sum_{i=1}^k N_i = n$.

Then the joint distribution of $N_1, N_2, \ldots , N_k$ is given by

$$
P(N_1 = n_1, N_2 = n_2, \ldots , N_k = n_k) ~ = ~ \frac{n!}{n_1!n_2! \cdots n_k!}p_1^{n_1}p_2^{n_2} \cdots p_k^{n_k}
$$

where $n_i \ge 0$ for $1 \le i \le k$ and $\sum_{i=1}^k n_i = n$.

This is called the multinomial distribution with parameters $n$ and $p_1, p_2, \ldots, p_k$.

When there are just two classes, $k = 2$ and the formula reduces to the familiar binomial formula written as the joint distribution of the number of successes and the number of failures:

$$
P(N_1 = n_1, N_2 = n_2) = \frac{n!}{n_1!n_2!} p_1^{n_1}p_2^{n_2} ~~ \text{where } p_1+p_2=1 \text{ and } n_1+n_2=n
$$
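The joint formula translates directly into code. The sketch below is one possible implementation, not a library routine: the function name `multinomial_pmf` and its argument names are assumptions made for illustration. It takes the number of trials to be the sum of the counts.

```python
from math import comb, factorial, prod

def multinomial_pmf(counts, probs):
    """Hypothetical helper: P(N_1 = n_1, ..., N_k = n_k), with n = sum(counts)."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    if any(c < 0 for c in counts):
        return 0.0
    n = sum(counts)
    # Multinomial coefficient n! / (n_1! n_2! ... n_k!), kept as an exact integer.
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)
    # Product p_1^{n_1} p_2^{n_2} ... p_k^{n_k}.
    return coef * prod(p**c for p, c in zip(probs, counts))

# Three classes: the box example with 4 blue, 9 green, 2 red in 15 draws.
box_chance = multinomial_pmf([4, 9, 2], [0.2, 0.5, 0.3])

# Two classes: the result should agree with the binomial formula.
two_class = multinomial_pmf([3, 7], [0.3, 0.7])
binomial = comb(10, 3) * 0.3**3 * 0.7**7

print(box_chance)
print(two_class, binomial)
```

The two-class value and the binomial value should match up to floating-point rounding.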

6.3.2 Binomial Marginals

No matter how many classes there are, the marginal distribution of each $N_i$ is binomial $(n, p_i)$.

You don’t have to sum terms in the joint distribution to work this out.

  • $N_i$ is the number of Class $i$ individuals in the sample.

  • Each sampled individual is in Class $i$ with probability $p_i$.

  • There are $n$ independent draws.

That’s the binomial setting.
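Summing the joint distribution is not necessary, but it makes a reasonable sanity check. The sketch below uses the box example from earlier in the section (15 draws with chances 0.2, 0.5, 0.3) and compares the summed joint probabilities with the binomial $(15, 0.5)$ probability for the number of green tickets. The helper `multinomial_pmf` is a hypothetical function written for this illustration.

```python
from math import comb, factorial, prod

def multinomial_pmf(counts, probs):
    """Hypothetical helper: multinomial joint probability, with n = sum(counts)."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    return coef * prod(p**c for p, c in zip(probs, counts))

n = 15
probs = [0.2, 0.5, 0.3]  # blue, green, red
g = 9                    # number of green tickets of interest

# Marginal of N_green: sum the joint over every split of the other
# n - g draws between blue (b) and red (n - g - b).
marginal_by_sum = sum(
    multinomial_pmf([b, g, n - g - b], probs) for b in range(n - g + 1)
)

# Binomial (n, p_green) probability of exactly g greens.
binomial = comb(n, g) * probs[1]**g * (1 - probs[1])**(n - g)

print(marginal_by_sum, binomial)
```

The two printed values should agree up to floating-point rounding, for any choice of `g` between 0 and `n`.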

6.3.3 A Roulette Example

A Nevada roulette wheel has 18 red pockets, 18 black pockets, and 2 green pockets. Each time the wheel is spun, all 38 pockets are equally likely to win, independent of all other spins.

Question: If the wheel is spun 10 times, what is the chance that red and black win an equal number of times?

Answer: Let $N_r$ be the number of times red wins, $N_b$ the number of times black wins, and $N_g$ the number of times green wins.

$$
\begin{align*}
P(N_r = N_b) ~ &= ~ \sum_{i=0}^5 P(N_r = i, N_b = i, N_g = 10-2i) \\ \\
&= ~ \sum_{i=0}^5 \frac{10!}{i!i!(10-2i)!} \big(\frac{18}{38}\big)^i \big(\frac{18}{38}\big)^i \big(\frac{2}{38}\big)^{10-2i}
\end{align*}
$$
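The sum has only six terms, so it is easy to evaluate numerically. The following is a minimal Python sketch of that sum. The variable names are illustrative and not from the text.

```python
from math import factorial

n = 10                      # number of spins
p_red = p_black = 18 / 38   # 18 red and 18 black pockets out of 38
p_green = 2 / 38            # 2 green pockets out of 38

p_equal = 0.0
for i in range(n // 2 + 1):  # i = 0, 1, ..., 5
    # Multinomial coefficient 10! / (i! i! (10 - 2i)!).
    coef = factorial(n) // (factorial(i) * factorial(i) * factorial(n - 2 * i))
    # Chance of i reds, i blacks, and 10 - 2i greens.
    p_equal += coef * p_red**i * p_black**i * p_green**(n - 2 * i)

print(p_equal)
```

The loop stops at $i = 5$ because red and black cannot each win more than half of the 10 spins.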