The probability mass function and probability density, cdf, and survival functions are all ways of specifying the probability distribution of a random variable. They are all defined as probabilities or as probability per unit length, and thus have natural interpretations and visualizations.
But there are also more abstract ways of describing distributions. One that you have encountered is the probability generating function (pgf), which we defined for random variables with finitely many non-negative integer values.
We now define another such transform of a distribution. More general than the pgf, it is a powerful tool for studying distributions.
Let $X$ be a random variable. The moment generating function (mgf) of $X$ is a function defined on the real numbers by the formula
$$M_X(t) = E(e^{tX})$$
for all $t$ for which the expectation is finite. It is a fact (which we will not prove) that the domain of the mgf has to be an interval, not necessarily finite but necessarily including 0 because $M_X(0) = 1$.
For $X$ with finitely many non-negative integer values, we had defined the pgf by $G_X(s) = E(s^X)$. Notice that this is a special case of the mgf with $s = e^t$ and hence positive. For a random variable $X$ that has both a pgf $G_X$ and an mgf $M_X$, the two functions are related by $M_X(\log(s)) = G_X(s)$. Therefore the properties of $M_X$ near 0 reflect the properties of $G_X$ near 1.
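As a quick numerical illustration of the definition $M_X(t) = E(e^{tX})$ and the relation $M_X(\log(s)) = G_X(s)$, here is a minimal sketch. The helper names `mgf` and `pgf`, the choice of a fair die, and the value $s = 1.3$ are assumptions made for this example only.

```python
import math

def mgf(dist, t):
    """M_X(t) = E(e^{tX}) for a finite distribution given as {value: probability}."""
    return sum(p * math.exp(t * x) for x, p in dist.items())

def pgf(dist, s):
    """G_X(s) = E(s^X) for a distribution on non-negative integers."""
    return sum(p * s ** x for x, p in dist.items())

# Hypothetical example: the number of spots on one roll of a fair die
die = {k: 1 / 6 for k in range(1, 7)}

# M_X(0) = E(e^0) = 1 for every distribution
mgf_at_zero = mgf(die, 0.0)

# The mgf evaluated at log(s) should agree with the pgf evaluated at s
s = 1.3
gap = abs(mgf(die, math.log(s)) - pgf(die, s))

print(mgf_at_zero, gap)
```

The printed gap should be zero up to floating point rounding.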
This section presents three ways in which the mgf is useful. Other ways are demonstrated in the subsequent sections of this chapter. Much of what we say about mgf’s will not be accompanied by complete proofs as the math required is beyond the scope of this class. But the results should seem reasonable, even without formal proofs.
We will list the three ways first, and then use them all in examples.
19.2.1 Generating Moments
For non-negative integers $k$, the expectation $E(X^k)$ is called the $k$th moment of $X$. You saw in Data 8 and again in this course that the mean $E(X)$ is the center of gravity of the probability histogram of $X$. In physics, the center of mass is called the first moment. The terminology of moments is used in probability theory as well.
In this course we are only going to work with mgf’s that are finite in some interval around 0. The interval could be the entire real line. It is a fact that if the mgf is finite around 0 (not just to one side of 0), then all the moments exist.
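Expand $e^{tX}$ to see that
$$M_X(t) = E(e^{tX}) = E\left(1 + t\frac{X}{1!} + t^2\frac{X^2}{2!} + t^3\frac{X^3}{3!} + \cdots\right) = 1 + t\frac{E(X)}{1!} + t^2\frac{E(X^2)}{2!} + t^3\frac{E(X^3)}{3!} + \cdots$$
by blithely switching the expectation and the infinite sum. This requires justification, which we won’t go into.
Continue to set aside questions about whether we can switch infinite sums with other operations. Just go ahead and differentiate $M_X$ term by term. Let $M_X^{(n)}$ denote the $n$th derivative. Then
$$M_X^{(1)}(t) = \frac{d}{dt}M_X(t) = \frac{E(X)}{1!} + 2t\frac{E(X^2)}{2!} + 3t^2\frac{E(X^3)}{3!} + \cdots$$
and hence
$$M_X^{(1)}(0) = E(X)$$
Now differentiate $M_X^{(1)}$ to see that $M_X^{(2)}(0) = E(X^2)$, and, by induction,
$$M_X^{(n)}(0) = E(X^n), \quad n = 1, 2, 3, \ldots$$
Hence we can generate the moments of $X$ by evaluating successive derivatives of $M_X$ at $t = 0$. This is one way in which mgf’s are helpful.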
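The idea that derivatives of the mgf at 0 produce moments can be checked numerically. The sketch below approximates the first two derivatives by central finite differences instead of differentiating symbolically. The fair die, the step size `h`, and all function names are assumptions chosen for illustration.

```python
import math

def mgf(dist, t):
    """M_X(t) = E(e^{tX}) for a finite distribution given as {value: probability}."""
    return sum(p * math.exp(t * x) for x, p in dist.items())

def first_derivative_at_zero(M, h=1e-4):
    """Central difference approximation to M'(0)."""
    return (M(h) - M(-h)) / (2 * h)

def second_derivative_at_zero(M, h=1e-4):
    """Central difference approximation to M''(0)."""
    return (M(h) - 2 * M(0.0) + M(-h)) / h ** 2

# Hypothetical example: one roll of a fair die
die = {k: 1 / 6 for k in range(1, 7)}
M = lambda t: mgf(die, t)

m1 = first_derivative_at_zero(M)    # approximates E(X)
m2 = second_derivative_at_zero(M)   # approximates E(X^2)

# Moments computed directly from the distribution, for comparison
exact_m1 = sum(p * x for x, p in die.items())
exact_m2 = sum(p * x ** 2 for x, p in die.items())

print(m1, exact_m1)
print(m2, exact_m2)
```

The finite difference values agree with the directly computed moments only approximately, because both the step size and floating point rounding introduce small errors.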
19.2.2 Identifying the Distribution
In this class we have made heavy use of the first and second moments, and no use at all of the higher moments. That will continue to be the case. But mgf’s do involve all the moments, and this results in a property that is very useful for proving facts about distributions.
If two distributions have the same mgf, then they must be the same distribution. This property is valid if the mgf exists in an interval around 0, which we assumed earlier in this section.
For example, if you recognize the mgf of a random variable as the mgf of a normal distribution, then the random variable must be normal.
By contrast, if you know the expectation of a random variable you can’t identify the distribution of the random variable; even if you know both the mean and the SD (equivalently, the first and second moments), you can’t identify the distribution. But if you know the moment generating function, and hence all the moments, then you can.
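To make the contrast concrete, the following sketch builds two distributions that share the same mean and SD but have different mgf's. Both distributions and the helper names are hypothetical, constructed only for this illustration.

```python
import math

def mgf(dist, t):
    """M_X(t) = E(e^{tX}) for a finite distribution given as {value: probability}."""
    return sum(p * math.exp(t * x) for x, p in dist.items())

def mean(dist):
    return sum(p * x for x, p in dist.items())

def variance(dist):
    mu = mean(dist)
    return sum(p * (x - mu) ** 2 for x, p in dist.items())

root2 = math.sqrt(2)

# X is -1 or 1 with equal chance: mean 0, variance 1
X = {-1.0: 0.5, 1.0: 0.5}
# Y is 0 with chance 1/2 and +/- sqrt(2) with chance 1/4 each: also mean 0, variance 1
Y = {-root2: 0.25, 0.0: 0.5, root2: 0.25}

print(mean(X), mean(Y))           # same first moment
print(variance(X), variance(Y))   # same variance, up to rounding
print(mgf(X, 1.0), mgf(Y, 1.0))   # the mgf's differ at t = 1
```

The first two moments cannot tell these distributions apart, but the mgf does.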
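19.2.3 Working Well with Sums
The third reason mgf’s are useful is that, as with the pgf, the mgf of a sum of independent random variables is easily computed as a product.
Let $X$ and $Y$ be independent. Then
$$M_{X+Y}(t) = E(e^{t(X+Y)}) = E(e^{tX} \cdot e^{tY})$$
So if $X$ and $Y$ are independent, the expectation of the product is the product of the expectations, and
$$M_{X+Y}(t) = M_X(t) M_Y(t)$$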
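The product rule can be verified exactly for small discrete distributions. This sketch computes the distribution of $X+Y$ under independence and compares its mgf with the product of the two individual mgf's. The die and coin distributions, the value of `t`, and the helper `sum_distribution` are assumptions for the example.

```python
import math
from itertools import product

def mgf(dist, t):
    """M_X(t) = E(e^{tX}) for a finite distribution given as {value: probability}."""
    return sum(p * math.exp(t * x) for x, p in dist.items())

def sum_distribution(dist_x, dist_y):
    """Distribution of X + Y, assuming X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        # Independence: P(X = x, Y = y) = P(X = x) P(Y = y)
        out[x + y] = out.get(x + y, 0.0) + px * py
    return out

# Hypothetical example: a fair die and an independent biased coin indicator
die = {k: 1 / 6 for k in range(1, 7)}
coin = {0: 0.3, 1: 0.7}

t = 0.4
lhs = mgf(sum_distribution(die, coin), t)   # mgf of the sum
rhs = mgf(die, t) * mgf(coin, t)            # product of the mgf's

print(lhs, rhs)
```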
It’s time for some examples. Remember that the mgf of $X$ is the expectation of a function of $X$. In some cases we will calculate it using the non-linear function rule for expectations. In other cases we will use the multiplicative property of the mgf of the sum of independent random variables.
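19.2.4 MGFs of Some Discrete Random Variables
19.2.4.1 Bernoulli $(p)$
$P(X = 1) = p$ and $P(X = 0) = 1 - p = q$. So
$$M_X(t) = qe^{t \cdot 0} + pe^{t \cdot 1} = q + pe^t = 1 + p(e^t - 1) \quad \text{for all } t$$
19.2.4.2 Binomial $(n, p)$
A binomial $(n, p)$ random variable is the sum of $n$ i.i.d. indicators. So
$$M_X(t) = (q + pe^t)^n \quad \text{for all } t$$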
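As a sanity check on the binomial result, the sketch below compares the closed form $(q + pe^t)^n$ with the mgf computed directly from the binomial probabilities. The parameter values and function names are assumptions for illustration; `math.comb` requires Python 3.8 or later.

```python
import math

def binomial_mgf_closed_form(t, n, p):
    """(q + p e^t)^n, obtained from the product rule for n i.i.d. indicators."""
    q = 1 - p
    return (q + p * math.exp(t)) ** n

def binomial_mgf_direct(t, n, p):
    """E(e^{tX}) computed as a sum over the binomial (n, p) probabilities."""
    q = 1 - p
    return sum(
        math.comb(n, k) * p ** k * q ** (n - k) * math.exp(t * k)
        for k in range(n + 1)
    )

# Hypothetical parameter values
n, p = 12, 0.25
checks = [
    (t, binomial_mgf_closed_form(t, n, p), binomial_mgf_direct(t, n, p))
    for t in (-1.0, 0.0, 0.7)
]
for t, closed, direct in checks:
    print(t, closed, direct)
```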
19.2.4.3 Poisson $(\mu)$
This one is an exercise.
$$M_X(t) = e^{\mu(e^t - 1)} \quad \text{for all } t$$
You can also use this to show that the sum of independent Poisson variables is Poisson.
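The Poisson $(\mu)$ mgf has the standard closed form $e^{\mu(e^t - 1)}$. The sketch below checks that closed form against a truncated series computed from the Poisson probabilities, and then checks that the product of two Poisson mgf's is the Poisson mgf with the parameters added. The parameter values, the truncation point, and the function names are assumptions for this example.

```python
import math

def poisson_mgf_closed_form(t, mu):
    """e^{mu (e^t - 1)}"""
    return math.exp(mu * (math.exp(t) - 1))

def poisson_mgf_series(t, mu, terms=200):
    """Truncated sum of e^{tk} P(X = k) over k = 0, 1, ..., terms - 1."""
    term = math.exp(-mu)          # k = 0 term: P(X = 0) * e^{t * 0}
    total = term
    for k in range(1, terms):
        # Ratio of consecutive terms is mu * e^t / k
        term *= mu * math.exp(t) / k
        total += term
    return total

# Hypothetical parameter values
mu1, mu2, t = 2.0, 3.5, 0.5

series_value = poisson_mgf_series(t, mu1)
closed_value = poisson_mgf_closed_form(t, mu1)

# mgf of the sum of independent Poisson (mu1) and Poisson (mu2) variables
product_of_mgfs = poisson_mgf_closed_form(t, mu1) * poisson_mgf_closed_form(t, mu2)
mgf_of_poisson_sum = poisson_mgf_closed_form(t, mu1 + mu2)

print(series_value, closed_value)
print(product_of_mgfs, mgf_of_poisson_sum)
```

Since the product matches the mgf of the Poisson $(\mu_1 + \mu_2)$ distribution, the identification property implies that the sum is Poisson with that parameter.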
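19.2.5 MGF of a Gamma $(r, \lambda)$ Random Variable
Let $X$ have the gamma $(r, \lambda)$ distribution. Then
$$
\begin{align*}
M_X(t) &= \int_0^\infty e^{tx} \frac{\lambda^r}{\Gamma(r)} x^{r-1} e^{-\lambda x} \, dx \\
&= \frac{\lambda^r}{\Gamma(r)} \int_0^\infty x^{r-1} e^{-(\lambda - t)x} \, dx \\
&= \frac{\lambda^r}{\Gamma(r)} \cdot \frac{\Gamma(r)}{(\lambda - t)^r} \quad \text{for } t < \lambda \\
&= \left(\frac{\lambda}{\lambda - t}\right)^r \quad \text{for } t < \lambda
\end{align*}
$$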
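The gamma mgf $\left(\frac{\lambda}{\lambda - t}\right)^r$ for $t < \lambda$ can be checked by integrating $e^{tx}$ against the gamma density numerically. The sketch below uses a hand-written Simpson's rule so that it needs only the standard library. The parameter values, the upper integration limit of 60, and the function names are assumptions, and the code assumes $r \ge 1$ so that the integrand is bounded at 0.

```python
import math

def simpson(f, a, b, n=20_000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def gamma_mgf_numeric(t, r, lam, upper=60.0):
    """Approximate the integral of e^{tx} times the gamma (r, lam) density over [0, upper]."""
    const = lam ** r / math.gamma(r)
    # Combine e^{tx} and e^{-lam x} into one exponential to keep values moderate
    integrand = lambda x: const * x ** (r - 1) * math.exp((t - lam) * x)
    return simpson(integrand, 0.0, upper)

def gamma_mgf_closed_form(t, r, lam):
    """(lam / (lam - t))^r, valid only for t < lam."""
    return (lam / (lam - t)) ** r

# Hypothetical parameter values with t < lam
r, lam, t = 3.0, 2.0, 0.5
numeric_value = gamma_mgf_numeric(t, r, lam)
closed_value = gamma_mgf_closed_form(t, r, lam)

print(numeric_value, closed_value)
```

Truncating the integral at 60 is harmless here because the integrand decays like $e^{-(\lambda - t)x}$; for $t$ close to $\lambda$ a larger upper limit would be needed.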
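19.2.6 Sums of Independent Gamma Variables with the Same Rate
If $X$ has the gamma $(r, \lambda)$ distribution and $Y$, independent of $X$, has the gamma $(s, \lambda)$ distribution, then
$$M_{X+Y}(t) = \left(\frac{\lambda}{\lambda - t}\right)^r \cdot \left(\frac{\lambda}{\lambda - t}\right)^s = \left(\frac{\lambda}{\lambda - t}\right)^{r+s} \quad \text{for } t < \lambda$$
That’s the mgf of the gamma $(r+s, \lambda)$ distribution. Because the mgf identifies the distribution, $X+Y$ must have the gamma $(r+s, \lambda)$ distribution.
This is what we observed in an earlier section by simulation, using numerical values of $r$ and $s$.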
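A small simulation in the same spirit is sketched below. It is not the simulation from the earlier section; the parameter values, seed, and sample size are assumptions chosen for illustration. It draws independent gamma variables with a common rate and compares the empirical mgf of their sum at one value of $t$ with $\left(\frac{\lambda}{\lambda - t}\right)^{r+s}$. Note that `random.gammavariate` takes a shape and a scale, so the scale is $1/\lambda$.

```python
import math
import random

random.seed(0)  # fixed seed so that the run is reproducible

# Hypothetical parameter values: shapes r and s, common rate lam
r, s, lam = 2.0, 3.0, 2.0
n = 100_000

# random.gammavariate(alpha, beta) uses shape alpha and scale beta = 1 / rate
sums = [
    random.gammavariate(r, 1 / lam) + random.gammavariate(s, 1 / lam)
    for _ in range(n)
]

t = 0.25  # must be less than lam for the mgf to be finite
empirical_mgf = sum(math.exp(t * z) for z in sums) / n
theoretical_mgf = (lam / (lam - t)) ** (r + s)

empirical_mean = sum(sums) / n
theoretical_mean = (r + s) / lam   # mean of the gamma (r + s, lam) distribution

print(empirical_mgf, theoretical_mgf)
print(empirical_mean, theoretical_mean)
```

The empirical and theoretical values should be close but not equal, since the empirical values are subject to sampling variability.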
19.2.7 Note on Existence
Let $X$ be a random variable. For all $t$, the random variable $e^{tX}$ is positive, so $M_X(t) = E(e^{tX})$ is either finite and positive or $+\infty$.
The rough statements below should give you a sense of the connection between the tails of the distribution of $X$ and the existence of the mgf. We will not cover the proofs.
If $t > 0$ then $e^{tX}$ is large for large positive values of $X$. So if $M_X(t)$ is finite for a positive $t$, then the right hand tail of the distribution of $X$ can’t be heavy.
If $t < 0$ then $e^{tX}$ is large for large negative values of $X$. So if $M_X(t)$ is finite for a negative $t$, then the left hand tail of the distribution of $X$ can’t be heavy.
So if $M_X(t)$ is finite for a positive value of $t$ as well as for a negative value of $t$, then neither tail is heavy.
It can be shown that if $M_X(t_0)$ is finite for some $t_0$, then $M_X(t)$ is finite for all $t$ between 0 and $t_0$. So $M_X(t)$ being finite for a positive $t$ as well as for a negative $t$ is equivalent to $M_X$ being finite on an interval around 0. The interval might be very small, but as long as it straddles 0 all the properties listed in this section hold.
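The link between tails and existence can be seen in the exponential $(\lambda)$ distribution, whose mgf is finite only for $t < \lambda$. The sketch below evaluates the truncated integral $\int_0^b e^{tx} \lambda e^{-\lambda x} \, dx$, which has an elementary closed form, for growing $b$. The function name and the specific values of $t$, $\lambda$, and $b$ are assumptions for illustration.

```python
import math

def truncated_exponential_mgf(t, lam, b):
    """Integral of e^{tx} * lam * e^{-lam x} over [0, b], in closed form."""
    if t == lam:
        return lam * b
    return lam * (math.exp((t - lam) * b) - 1) / (t - lam)

lam = 1.0
upper_limits = (10.0, 100.0, 1000.0)

# t < lam: the truncated integrals settle down to lam / (lam - t) = 2
values_below = [truncated_exponential_mgf(0.5, lam, b) for b in upper_limits]

# t > lam: the truncated integrals grow without bound, so the mgf is infinite at this t
values_above = [truncated_exponential_mgf(1.5, lam, b) for b in upper_limits]

print(values_below)
print(values_above)
```

For $t = 0.5$ the right hand tail $e^{-x}$ decays fast enough to tame $e^{tx}$; for $t = 1.5$ it does not.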