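Let $X$ be a random variable. In what follows, we will use some familiar shorthand:
$\mu_X = E(X)$, $\sigma_X = SD(X)$
Let $D_X = X - \mu_X$ denote the deviation of $X$ from its mean. Then the variance of $X$ can be written as
$$
\sigma_X^2 = Var(X) = E(D_X^2)
$$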
13.1.1 Variance of a Sum
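Now let $X$ and $Y$ be two random variables on the same space, and let $S = X + Y$. Then $\mu_S = \mu_X + \mu_Y$, and the deviation of $S$ is the sum of the deviations of $X$ and $Y$:
$$
D_S = S - \mu_S = (X - \mu_X) + (Y - \mu_Y) = D_X + D_Y
$$
This gives us some insight into the variance of the sum $S$.
$$
Var(S) = E(D_S^2) = E\big[(D_X + D_Y)^2\big] = E(D_X^2) + E(D_Y^2) + 2E(D_X D_Y) = Var(X) + Var(Y) + 2E(D_X D_Y)
$$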
The first thing to note is that while the expectation of a sum is the sum of the expectations, the calculation above shows that the variance of a sum is in general not the sum of the variances. There’s an extra term.
To calculate the variance of a sum, we have to understand that extra term.
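As a quick numerical check of the identity $Var(X+Y) = Var(X) + Var(Y) + 2E(D_X D_Y)$, here is a minimal sketch. The joint distribution table `joint` and the helper `expect` are made up for illustration; they are not part of any library.

```python
from fractions import Fraction

# Hypothetical joint distribution: maps (x, y) to P(X = x, Y = y).
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (2, 1): Fraction(1, 4),
}

def expect(g, joint):
    """Expectation of g(X, Y) under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x, joint)
mu_y = expect(lambda x, y: y, joint)

# Variances of X and Y as expected squared deviations.
var_x = expect(lambda x, y: (x - mu_x) ** 2, joint)
var_y = expect(lambda x, y: (y - mu_y) ** 2, joint)

# The extra term: twice the expected product of the deviations.
extra = 2 * expect(lambda x, y: (x - mu_x) * (y - mu_y), joint)

# Variance of S = X + Y computed directly from the deviation of S.
var_s = expect(lambda x, y: (x + y - mu_x - mu_y) ** 2, joint)

print("Var(S)          =", var_s)
print("Var(X) + Var(Y) =", var_x + var_y)
print("extra term      =", extra)
```

Exact fractions are used so that the two sides of the identity can be compared without rounding error. In this example the extra term is not zero, so the variance of the sum differs from the sum of the variances.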
13.1.2 Covariance
The covariance of $X$ and $Y$, denoted $Cov(X, Y)$, is the expected product of the deviations of $X$ and $Y$:
$$
Cov(X, Y) = E(D_X D_Y) = E\big[(X - \mu_X)(Y - \mu_Y)\big]
$$
The expectation and variance of $X$ are based on the distribution of $X$ alone. The expectation and variance of $Y$ are based on the distribution of $Y$ alone. But covariance depends on the joint distribution of $X$ and $Y$ and thus takes into account the relation between $X$ and $Y$.
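To see that covariance depends on the joint distribution and not just on the two marginals, here is a small sketch with three joint distributions in which $X$ and $Y$ are each 0 or 1 with chance 1/2. The tables and the helper `covariance` are made up for illustration.

```python
from fractions import Fraction

def covariance(joint):
    """Expected product of deviations for a table {(x, y): P(X = x, Y = y)}."""
    mu_x = sum(x * p for (x, y), p in joint.items())
    mu_y = sum(y * p for (x, y), p in joint.items())
    return sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

half = Fraction(1, 2)
quarter = Fraction(1, 4)

# In all three tables, X and Y each take the values 0 and 1 with chance 1/2.
independent = {(0, 0): quarter, (0, 1): quarter, (1, 0): quarter, (1, 1): quarter}
equal = {(0, 0): half, (1, 1): half}      # Y is always equal to X
opposite = {(0, 1): half, (1, 0): half}   # Y is always 1 - X

cov_independent = covariance(independent)
cov_equal = covariance(equal)
cov_opposite = covariance(opposite)

print(cov_independent, cov_equal, cov_opposite)
```

The marginal distributions are identical across the three tables, so any difference in the covariances comes entirely from how $X$ and $Y$ are related.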
Covariance has two main uses. First, it is a tool for calculating the variance of a sum. The fundamental calculation is the one we did above. Here is the result again, using the language of covariance.
$$
Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)
$$
The focus of this chapter is on using covariance to find variances of sums. But covariance has a second important application, which we will study later in the course. Here is a preview.
13.1.3 Correlation
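Covariance has strange units. If $X$ is measured in pounds and $Y$ in inches then $Cov(X, Y)$ is measured in pound-inches, which are hard to understand. But we can get rid of the units of covariance by dividing it by the two standard deviations, and then something wonderful happens.
$$
\frac{Cov(X, Y)}{\sigma_X \sigma_Y} = E\left( \frac{X - \mu_X}{\sigma_X} \cdot \frac{Y - \mu_Y}{\sigma_Y} \right)
$$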
This is the mean of the products of standard units which you will recognize from Data 8 as the definition of correlation.
The correlation between random variables $X$ and $Y$ is defined as the normalized covariance:
$$
r(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}
$$
As you know, correlation is widely used in data analysis and inference. We will return to it when we study prediction. For now, you will just establish its basic properties in exercises.
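To illustrate how dividing by the two standard deviations removes the units, here is a minimal sketch that computes covariance and correlation for the same joint distribution measured in pounds and inches, and then in kilograms and centimeters. The joint distribution and the helpers `summary` and `correlation` are made up for illustration.

```python
import math

def summary(joint):
    """Return (covariance, SD of X, SD of Y) for a table {(x, y): probability}."""
    mu_x = sum(x * p for (x, y), p in joint.items())
    mu_y = sum(y * p for (x, y), p in joint.items())
    cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())
    sd_x = math.sqrt(sum((x - mu_x) ** 2 * p for (x, y), p in joint.items()))
    sd_y = math.sqrt(sum((y - mu_y) ** 2 * p for (x, y), p in joint.items()))
    return cov, sd_x, sd_y

def correlation(joint):
    """Covariance divided by the product of the two standard deviations."""
    cov, sd_x, sd_y = summary(joint)
    return cov / (sd_x * sd_y)

LB_TO_KG = 0.45359237  # kilograms per pound
IN_TO_CM = 2.54        # centimeters per inch

# Hypothetical joint distribution: X in pounds, Y in inches.
joint_lb_in = {
    (120, 62): 0.25,
    (150, 66): 0.25,
    (150, 70): 0.25,
    (180, 68): 0.25,
}

# The same distribution with X in kilograms and Y in centimeters.
joint_kg_cm = {
    (x * LB_TO_KG, y * IN_TO_CM): p for (x, y), p in joint_lb_in.items()
}

cov_lb_in = summary(joint_lb_in)[0]
cov_kg_cm = summary(joint_kg_cm)[0]
r_lb_in = correlation(joint_lb_in)
r_kg_cm = correlation(joint_kg_cm)

print("covariance: ", cov_lb_in, "pound-inches vs", cov_kg_cm, "kilogram-centimeters")
print("correlation:", r_lb_in, "vs", r_kg_cm)
```

The covariance changes by the product of the two conversion factors, while the two correlations agree up to floating-point rounding.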