Let $Z_1, Z_2, \ldots, Z_n$ be i.i.d. standard normal variables and let $\mathbf{Z} = [Z_1 ~ Z_2 \cdots Z_n]^T$.

We know that every linear combination of the elements of $\mathbf{Z}$ is normal. We will now look at multiple linear combinations.

Let $\mathbf{A}$ be an $m \times n$ matrix of real numbers and let $\mathbf{b}$ be an $m \times 1$ real vector. The $m \times 1$ random vector $\mathbf{X} = \mathbf{AZ} + \mathbf{b}$, which is a linear transformation of $\mathbf{Z}$, is called a multivariate normal random vector.

For example, consider the bivariate case where $n=2$, and let

$$
\mathbf{X} ~ = ~ \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} ~ = ~ \begin{bmatrix} Z_1 \\ Z_1 + Z_2 \end{bmatrix}
$$

This is the linear transformation $\mathbf{AZ} + \mathbf{b}$ where $\mathbf{b} = \mathbf{0}$ and

$$
\mathbf{A} ~ = ~ \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}
$$

A scatter plot of simulated values of $\mathbf{X}$ is shown below. Notice the oval shape that is familiar from Data 8.

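# Imports assumed by the code cells in this section
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from datascience import Table

# Simulate 1000 pairs (X1, X2) = (Z1, Z1 + Z2)
z1 = stats.norm.rvs(size=1000)
z2 = stats.norm.rvs(size=1000)
x = Table().with_columns(
    'X1', z1,
    'X2', z1+z2
)
x.scatter('X1', 'X2')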
[Figure: scatter plot of X2 against X1, showing an oval cloud of points]

Later in this chapter you will see how the oval shape arises. For now, let us see what the definition of a multivariate normal vector implies.

23.2.1 Linear Transformation

If $\mathbf{X}$ is a multivariate normal vector, then it is a linear transformation of a vector $\mathbf{Z}$ of i.i.d. standard normal variables. Therefore any linear transformation of $\mathbf{X}$ is another linear transformation of $\mathbf{Z}$ and hence is also multivariate normal.

In other words, a linear transformation of a multivariate normal vector is also multivariate normal. In particular, every linear combination of elements of $\mathbf{X}$ is normal.

The mean vector and covariance matrix of the linear transformation follow from properties of means and variances; the fact that $\mathbf{X}$ is multivariate normal doesn’t matter for this step. We can use those properties to find the mean vector and covariance matrix of any linear transformation of $\mathbf{X}$, in terms of the mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$ of $\mathbf{X}$.

What we gain from $\mathbf{X}$ being multivariate normal are the shapes of the distributions of linear transformations: they are normal. This allows us to find probabilities using the normal curve.
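As a minimal sketch of these properties, suppose the linear transformation is written $\mathbf{Y} = \mathbf{BX} + \mathbf{c}$. The names `B`, `c`, and the helper `transform_params` are illustrative assumptions, not notation from this section. Standard properties of means and covariances give mean vector $\mathbf{B}\boldsymbol{\mu} + \mathbf{c}$ and covariance matrix $\mathbf{B}\boldsymbol{\Sigma}\mathbf{B}^T$. The code applies this to the earlier example $\mathbf{X} = \mathbf{AZ}$, where $\mathbf{Z}$ has mean vector $\mathbf{0}$ and identity covariance matrix, and compares the result with a simulation.

```python
import numpy as np

def transform_params(mu, sigma, B, c):
    """Mean vector and covariance matrix of B X + c, given those of X.

    Hypothetical helper for illustration; uses only properties of
    means and covariances, not normality.
    """
    B = np.asarray(B, dtype=float)
    new_mu = B @ np.asarray(mu, dtype=float) + np.asarray(c, dtype=float)
    new_sigma = B @ np.asarray(sigma, dtype=float) @ B.T
    return new_mu, new_sigma

# Z is i.i.d. standard normal: mean vector 0, covariance matrix I
A = np.array([[1, 0],
              [1, 1]])
mu_x, sigma_x = transform_params(np.zeros(2), np.eye(2), A, np.zeros(2))

# Compare with the empirical covariance of simulated X = AZ
rng = np.random.default_rng(0)
z = rng.standard_normal((2, 100_000))   # each column is one draw of Z
x = A @ z                               # each column is one draw of X
empirical_sigma = np.cov(x)             # rows are treated as variables

print(mu_x)
print(sigma_x)
print(empirical_sigma)
```

The computed covariance matrix has $Var(X_1)$ and $Var(X_2)$ on the diagonal and $Cov(X_1, X_2)$ off the diagonal; the simulated values should be close to it but will not match exactly.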

Here is an example in two dimensions.

23.2.2 Sum and Difference

Let $\mathbf{X} = [X_1 ~ X_2]^T$ be a bivariate normal random vector with mean vector $\boldsymbol{\mu} = [\mu_1 ~ \mu_2]^T$ and covariance matrix $\boldsymbol{\Sigma}$.

Consider the sum $S = X_1 + X_2$ and the difference $D = X_1 - X_2$. Since the vector $[S ~ D]^T$ is a linear transformation of $\mathbf{X}$, we can make the following conclusions.

  • The distribution of $S$ is normal. Its mean and variance can be derived from properties of means and variances by familiar calculations. The mean is $\mu_1 + \mu_2$ and the variance is

$$
Var(S) ~ = ~ Var(X_1) + Var(X_2) + 2Cov(X_1, X_2)
$$

which you can calculate based on $\boldsymbol{\Sigma}$.

Since the distribution of $S$ is normal with a known mean and variance, you can find probabilities of events determined by $S$, such as $P(S > s)$.

  • The distribution of $D$ is normal with mean $\mu_1 - \mu_2$ and variance

$$
Var(D) ~ = ~ Var(X_1) + Var(X_2) - 2Cov(X_1, X_2)
$$

Since the distribution of $D$ is normal with a known mean and variance, you can find probabilities of events determined by $D$, such as $P(X_1 > X_2) = P(D > 0)$.

  • The vector $[S ~ D]^T$ is bivariate normal. We found the mean vector and all but one element of the covariance matrix in the calculations above. The remaining element is

$$
Cov(S, D) ~ = ~ Cov(X_1 + X_2, X_1 - X_2) ~ = ~ Var(X_1) - Var(X_2)
$$

by bilinearity and symmetry of covariance.
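The three conclusions above can be sketched numerically. The values of `mu` and `sigma` below are made up purely for illustration, and the matrix `B` is an assumed name for the transformation that maps $[X_1 ~ X_2]^T$ to $[S ~ D]^T$. The matrix product $\mathbf{B}\boldsymbol{\Sigma}\mathbf{B}^T$ is the standard formula for the covariance matrix of a linear transformation and serves as a cross-check on the element-by-element calculations.

```python
import numpy as np
from scipy import stats

# Hypothetical parameters of a bivariate normal vector [X1, X2]
mu = np.array([1.0, 0.5])
sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])

var_1, var_2, cov_12 = sigma[0, 0], sigma[1, 1], sigma[0, 1]

# Sum S = X1 + X2
mean_s = mu[0] + mu[1]
var_s = var_1 + var_2 + 2 * cov_12

# Difference D = X1 - X2
mean_d = mu[0] - mu[1]
var_d = var_1 + var_2 - 2 * cov_12

# Covariance of S and D, by bilinearity and symmetry
cov_sd = var_1 - var_2

# Cross-check: covariance matrix of [S, D]^T = B X is B Sigma B^T
B = np.array([[1, 1],
              [1, -1]])
sigma_sd = B @ sigma @ B.T

# P(X1 > X2) = P(D > 0), using the normal distribution of D
p_x1_greater = stats.norm.sf(0, loc=mean_d, scale=np.sqrt(var_d))

print(sigma_sd)
print(p_x1_greater)
```

Note that `stats.norm` takes the standard deviation as `scale`, so the variance of $D$ has to be square-rooted before it is passed in.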

23.2.3 Marginals

Let $\mathbf{X}$ be multivariate normal. Each component $X_i$ is a linear combination of elements of $\mathbf{X}$: the combination that has coefficient 1 at index $i$ and 0 everywhere else. So each $X_i$ has the normal distribution. The parameters of this normal distribution can be read off the mean vector and covariance matrix: $E(X_i) = \boldsymbol{\mu}(i)$ and $Var(X_i) = \boldsymbol{\Sigma}(i, i)$.

But be warned: the converse is not true. If all the marginals of a random vector are normal, the joint distribution need not be multivariate normal.

23.2.4 A Cautionary Tale

The cells below show the empirical joint and marginal distributions of an interesting data set. Read the comment at the top of each cell to see what is being computed and displayed.

# Generate 100,000 iid standard normal points
x = stats.norm.rvs(size=100000)
y = stats.norm.rvs(size=100000)
t = Table().with_columns(
    'X', x,
    'Y', y
)

# Select just those where both elements have the same sign
new = t.where(t.column(0) * t.column(1) > 0)

# The scatter of the restricted pairs
# is not oval
new.scatter(0, 1)
[Figure: scatter plot of Y against X for the restricted pairs]
# Empirical distribution of horizontal coordinate

new.hist(0, bins=25)
plt.xticks(np.arange(-5, 6));
[Figure: histogram of X for the restricted pairs]
# Empirical distribution of vertical coordinate

new.hist(1, bins=25)
plt.xticks(np.arange(-5, 6));
[Figure: histogram of Y for the restricted pairs]

Both marginals are normal but the joint distribution is far from bivariate normal. Bivariate normal variables have oval scatter plots, which these variables don’t – but we haven’t yet proved that bivariate normal variables have oval scatter plots.

What we do know is that the sum of the two components of a bivariate normal vector is normal. So let’s see what the histogram of the sum of the two coordinates looks like.

new = new.with_columns(
    'X+Y', new.column(0)+new.column(1))
new.hist('X+Y', bins=np.arange(-6, 6.1, 0.5))
[Figure: histogram of X+Y for the restricted pairs]

That really isn’t normal. It is bimodal with very little probability around 0.

To get the formula for the joint density of these variables, start with the circularly symmetric joint density of two i.i.d. standard normals and restrict it to Quadrants 1 and 3. This leaves out half of the volume under the original surface, so remember to multiply by 2 to make the total volume under the new surface equal to 1.

def new_density(x,y):
    if x*y > 0:
        return 1/np.pi * np.exp(-0.5*(x**2 + y**2))
    else:
        return 0

# Plot_3d is assumed to be the surface-plotting helper from the prob140 library
Plot_3d((-4, 4), (-4, 4), new_density, rstride=4, cstride=5)
[Figure: 3D surface plot of new_density over (-4, 4) by (-4, 4)]
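As a rough numerical check on the claim that both marginals are normal, the joint density can be integrated over one coordinate and compared with the standard normal density. This is a sketch rather than part of the original notebook: the helper names `joint_density` and `marginal_x` are illustrative, and `scipy.integrate.quad` is used for the one-dimensional integrals. The integral is split at $y = 0$ because the joint density jumps there.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def joint_density(x, y):
    """Joint density of two i.i.d. standard normals restricted to
    Quadrants 1 and 3, doubled so that the total volume is 1."""
    if x * y > 0:
        return 1 / np.pi * np.exp(-0.5 * (x**2 + y**2))
    return 0.0

def marginal_x(x):
    """Marginal density of the first coordinate at x, obtained by
    integrating the joint density over all y."""
    lower, _ = quad(lambda y: joint_density(x, y), -np.inf, 0)
    upper, _ = quad(lambda y: joint_density(x, y), 0, np.inf)
    return lower + upper

# Compare with the standard normal density at a few nonzero points
points = [-2.0, -0.5, 0.5, 1.0, 2.5]
marginal_values = np.array([marginal_x(x) for x in points])
normal_values = stats.norm.pdf(points)

print(marginal_values)
print(normal_values)
```

The two printed arrays should agree up to numerical integration error, even though the joint density is not bivariate normal. By symmetry the same check applies to the second coordinate.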