
The gamma family has two important branches. The first consists of gamma $(r, \lambda)$ distributions with integer shape parameter $r$, as you saw in the previous section.

The other important branch consists of gamma $(r, \lambda)$ distributions that have half-integer shape parameter $r$, that is, when $r = n/2$ for a positive integer $n$. Notice that this branch contains the one above: every integer $r$ is also half of the integer $n = 2r$.

18.4.1 Chi-Squared $(1)$

We have already seen the fundamental member of this branch. Let $Z$ be a standard normal random variable and let $V = Z^2$. By the change of variable formula for densities, we found the density of $V$ to be

$$
f_V(v) ~ = ~ \frac{1}{\sqrt{2\pi}} v^{-\frac{1}{2}} e^{-\frac{1}{2} v}, ~~~~ v > 0
$$

That’s the gamma $(1/2, 1/2)$ density. It is also called the chi-squared density with 1 degree of freedom, which we will abbreviate to chi-squared $(1)$.
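As a quick numerical check, the three descriptions of this density can be compared directly. This is a sketch using SciPy; note that SciPy parametrizes the gamma family by shape and scale, where scale is the reciprocal of the rate, so rate $1/2$ corresponds to `scale=2`.

```python
import numpy as np
from scipy import stats

v = np.linspace(0.1, 5, 50)

# density of V = Z^2 from the formula above
formula = (1 / np.sqrt(2 * np.pi)) * v**(-0.5) * np.exp(-0.5 * v)

# gamma (1/2, 1/2): shape 1/2, rate 1/2, i.e. scale 2 in SciPy's parametrization
gamma_half = stats.gamma.pdf(v, a=0.5, scale=2)

# chi-squared with 1 degree of freedom
chi2_1 = stats.chi2.pdf(v, df=1)

print(np.allclose(formula, gamma_half) and np.allclose(formula, chi2_1))
```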


18.4.2 From Chi-Squared $(1)$ to Chi-Squared $(n)$

When we were establishing the properties of the standard normal density, we discovered that if $Z_1$ and $Z_2$ are independent standard normal variables then $Z_1^2 + Z_2^2$ has the exponential $(1/2)$ distribution. We saw this by comparing two different settings in which the Rayleigh distribution arises. But that wasn’t a particularly illuminating reason for why $Z_1^2 + Z_2^2$ should be exponential.

But now we know that the sum of independent gamma variables with the same rate is also gamma; the shape parameters add and the rate stays the same. Therefore $Z_1^2 + Z_2^2$ is a gamma $(1, 1/2)$ variable. That’s the same distribution as exponential $(1/2)$, as you showed in exercises. This explains why the sum of squares of two i.i.d. standard normal variables has the exponential $(1/2)$ distribution.

If $Z_1, Z_2, Z_3$ are i.i.d. standard normal variables, then:

  • $Z_1^2$ has the gamma $(1/2, 1/2)$ distribution

  • $Z_1^2 + Z_2^2$ has the gamma $(1/2 + 1/2, 1/2)$ distribution

  • $Z_1^2 + Z_2^2 + Z_3^2$ has the gamma $(1/2 + 1/2 + 1/2, 1/2)$ distribution

Now let $Z_1, Z_2, \ldots, Z_n$ be i.i.d. standard normal variables. Then $Z_1^2, Z_2^2, \ldots, Z_n^2$ are i.i.d. chi-squared $(1)$ variables. That is, each of them has the gamma $(1/2, 1/2)$ distribution.

By induction, $Z_1^2 + Z_2^2 + \cdots + Z_n^2$ has the gamma $(n/2, 1/2)$ distribution. This is called the chi-squared distribution with $n$ degrees of freedom, which we will abbreviate to chi-squared $(n)$.
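A simulation sketch of this result: generate many replications of the sum of squares of $n$ i.i.d. standard normals and compare the empirical mean and variance with the gamma $(n/2, 1/2)$ values, which are $(n/2)/(1/2) = n$ and $(n/2)/(1/2)^2 = 2n$. The sample size and seed below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 5, 100_000

# many replications of Z_1^2 + ... + Z_n^2 for i.i.d. standard normal Z_i
z = rng.standard_normal((reps, n))
sums = (z**2).sum(axis=1)

# chi-squared (n) is gamma (n/2, 1/2), so the mean should be near n
# and the variance near 2n
print(round(sums.mean(), 1), round(sums.var(), 1))
```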

In data science, these distributions often arise when we work with the sum of squares of normal errors. This is usually part of a mean squared error calculation.

18.4.3 Chi-Squared Distribution with $n$ Degrees of Freedom

For a positive integer $n$, the random variable $X$ has the chi-squared distribution with $n$ degrees of freedom if the distribution of $X$ is gamma $(n/2, 1/2)$. That is, $X$ has density

$$
f_X(x) ~ = ~ \frac{\left(\frac{1}{2}\right)^{\frac{n}{2}}}{\Gamma(\frac{n}{2})} x^{\frac{n}{2} - 1} e^{-\frac{1}{2}x}, ~~~~ x > 0
$$

Here are the graphs of the chi-squared densities for degrees of freedom 2 through 5.

[Figure: overlaid graphs of the chi-squared densities for 2 through 5 degrees of freedom]

The chi-squared $(2)$ distribution is exponential because it is the gamma $(1, 1/2)$ distribution. This distribution has three names:

  • chi-squared $(2)$

  • gamma $(1, 1/2)$

  • exponential $(1/2)$
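The three names can be checked against each other numerically. A sketch using SciPy, where scale is the reciprocal of the rate, so rate $1/2$ corresponds to `scale=2`:

```python
import numpy as np
from scipy import stats

x = np.linspace(0.1, 10, 50)

# the same density under its three names
chi2_2 = stats.chi2.pdf(x, df=2)
gamma_1_half = stats.gamma.pdf(x, a=1, scale=2)
expon_half = stats.expon.pdf(x, scale=2)

print(np.allclose(chi2_2, gamma_1_half) and np.allclose(chi2_2, expon_half))
```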

18.4.4 Mean and Variance

You know that if $T$ has the gamma $(r, \lambda)$ density then

$$
E(T) ~ = ~ \frac{r}{\lambda} ~~~~~~~~~~~~ SD(T) ~ = ~ \frac{\sqrt{r}}{\lambda}
$$

If $X$ has the chi-squared $(n)$ distribution then $X$ is gamma $(n/2, 1/2)$. So

$$
E(X) ~ = ~ \frac{n/2}{1/2} ~ = ~ n
$$

Thus the expectation of a chi-squared random variable is its degrees of freedom.

The SD is

$$
SD(X) ~ = ~ \frac{\sqrt{n/2}}{1/2} ~ = ~ \sqrt{2n}
$$
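The two formulas above can be checked against SciPy's built-in chi-squared moments for a few values of $n$:

```python
import numpy as np
from scipy import stats

# compare SciPy's chi-squared moments with E(X) = n and SD(X) = sqrt(2n)
for n in [1, 2, 5, 10]:
    mean, var = stats.chi2.stats(df=n, moments='mv')
    print(n, float(mean), float(np.sqrt(var)))
```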

18.4.5 Estimating the Normal Variance

Suppose $X_1, X_2, \ldots, X_n$ are i.i.d. normal $(\mu, \sigma^2)$ variables, and that you are in a setting in which you know $\mu$ and are trying to estimate $\sigma^2$.

Let $Z_i$ be $X_i$ in standard units, so that $Z_i = (X_i - \mu)/\sigma$. Define the random variable $T$ as follows:

$$
T ~ = ~ \sum_{i=1}^n Z_i^2 ~ = ~ \frac{1}{\sigma^2}\sum_{i=1}^n (X_i - \mu)^2
$$

Then $T$ has the chi-squared $(n)$ distribution and $E(T) = n$. Now define $W$ by

$$
W ~ = ~ \frac{\sigma^2}{n} T ~ = ~ \frac{1}{n} \sum_{i=1}^n (X_i - \mu)^2
$$

Then $W$ can be computed based on the sample since $\mu$ is known. And since $W$ is a linear transformation of $T$, it is easy to see that $E(W) = \sigma^2$.

So we have constructed an unbiased estimate of σ2\sigma^2. It is the mean squared deviation from the known population mean.
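Unbiasedness of $W$ can be illustrated by simulation. The values of $\mu$, $\sigma$, and $n$ below are arbitrary choices for the sketch; averaging $W$ over many samples should land near $\sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 10, 3      # hypothetical known mean and unknown SD, for illustration
n, reps = 8, 200_000

# W: mean squared deviation from the known population mean mu
x = rng.normal(mu, sigma, size=(reps, n))
w = ((x - mu)**2).mean(axis=1)

# the average of many realizations of W should be close to sigma^2 = 9
print(round(w.mean(), 1))
```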

But typically, $\mu$ is not known. In that case you need a different estimate of $\sigma^2$ since you can’t compute $W$ as defined above. You showed in exercises that

$$
S^2 ~ = ~ \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X})^2
$$

is an unbiased estimate of $\sigma^2$ regardless of the distribution of the $X_i$’s. When the $X_i$’s are normal, as is the case here, it turns out that $S^2$ is a linear transformation of a chi-squared $(n-1)$ random variable. We will show that later in the course.
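A simulation sketch of both facts, with arbitrary illustrative values of $\mu$, $\sigma$, and $n$: $S^2$ should average out near $\sigma^2$, and $(n-1)S^2/\sigma^2$ should have mean near $n-1$, as it would if it were chi-squared $(n-1)$.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 10, 3      # hypothetical values for illustration
n, reps = 8, 100_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1, keepdims=True)
s2 = ((x - xbar)**2).sum(axis=1) / (n - 1)

# S^2 should average out near sigma^2 = 9 ...
print(round(s2.mean(), 1))

# ... and (n-1) S^2 / sigma^2 should have mean near n - 1 = 7,
# consistent with a chi-squared (n-1) distribution
print(round(((n - 1) * s2 / sigma**2).mean(), 1))
```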

18.4.6 “Degrees of Freedom”

The example above helps explain the strange term “degrees of freedom” for the parameter of the chi-squared distribution.

  • When $\mu$ is known, you have $n$ independent centered normals $(X_i - \mu)$ that you can use to estimate $\sigma^2$. That is, you have $n$ degrees of freedom in constructing your estimate.

  • When $\mu$ is not known, you are using all $n$ of $X_1 - \bar{X}, X_2 - \bar{X}, \ldots, X_n - \bar{X}$ in your estimate, but they are not independent. They are the deviations of the list $X_1, X_2, \ldots, X_n$ from their average $\bar{X}$, and hence their sum is 0. If you know $n-1$ of them, the final one is determined. So you only have $n-1$ degrees of freedom.