1. Let $X$ and $Y$ be jointly distributed random variables and let $\hat{Y}$ be the linear regression estimate of $Y$ based on $X$. Show that the mean squared error of this estimate is $(1 - r^2)\sigma_Y^2$ where $r$ is the correlation between $X$ and $Y$. This leads to the Data 8 formula for the SD of the residuals in simple linear regression.
[Use Alternative Form I of the regression equation, and preserve deviations as we did here.]
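Before attempting the proof, it can help to check the claim numerically. The sketch below compares the empirical mean squared error of a fitted regression line with $(1 - r^2)$ times the variance of the response. The data-generating model (a quadratic trend plus noise) and the names `regression_mse_check`, `xs`, and `ys` are assumptions made purely for illustration; any jointly distributed pair would do.

```python
import math
import random


def regression_mse_check(n=10_000, seed=0):
    """Return (empirical MSE of the regression line, (1 - r^2) * var(Y))."""
    rng = random.Random(seed)
    # Hypothetical data: Y is a nonlinear function of X plus noise,
    # so the pair is deliberately not bivariate normal.
    xs = [rng.uniform(-1, 2) for _ in range(n)]
    ys = [x**2 + rng.gauss(0, 0.5) for x in xs]

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    r = cov_xy / math.sqrt(var_x * var_y)

    # Regression line in deviation form:
    # y_hat - mean_y = r * (sd_y / sd_x) * (x - mean_x)
    slope = r * math.sqrt(var_y / var_x)
    mse = sum(
        (y - (mean_y + slope * (x - mean_x))) ** 2 for x, y in zip(xs, ys)
    ) / n
    return mse, (1 - r**2) * var_y


mse, formula = regression_mse_check()
print(mse, formula)
```

The two printed numbers should agree up to floating-point rounding, because the identity holds for the empirical distribution of any data set, not only in expectation. The check is not a substitute for the proof.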
2. Let $X$ and $Y$ be standard bivariate normal with correlation $\rho$.
(a) Suppose I ask you for the least squares estimate of $Y$ based on $X$, but I don’t tell you $X$. What is your estimate, and what is its mean squared error?
(b) Suppose I now show you $X$. Now what is your least squares estimate of $Y$, and what is its mean squared error?
(c) What is your least squares estimate of $Y$ based only on linear functions of $X$, and what is its mean squared error?
3. Let $(W, H)$ be the weight and height of a person picked at random from a population, and suppose the distribution of $(W, H)$ is bivariate normal with correlation 0.6. Suppose also that
$W$ has mean 150 pounds and SD 25 pounds
$H$ has mean 68 inches and SD 3 inches
Sketch the conditional density of $H$ given $W = w$ pounds. Mark the numerical values of the conditional mean and SD appropriately on your sketch.
4. Let $X$ and $Y$ have a bivariate normal distribution (not necessarily standard) with correlation $\rho$. Suppose you are given that $X$ is on the 30th percentile.
(a) Pick the right option for the least squares estimate of $Y$, and explain.
(i) Below the 30th percentile
(ii) On the 30th percentile
(iii) Above the 30th percentile
(b) Write a single math expression for the percentile rank corresponding to the least squares estimate of $Y$. Your answer can involve $\rho$ and the standard normal cdf $\Phi$.
5. Let $X$ and $Y$ be standard bivariate normal with correlation $\rho > 0$.
(a) Without calculation, pick the right option and explain. $P(X > 0, Y > 0)$ is
(i) less than 0.25
(ii) equal to 0.25
(iii) greater than 0.25
(b) Now find $P(X > 0, Y > 0)$ in terms of $\rho$.
[No integration is needed. Write $Y$ in terms of $X$ and a standard normal $Z$ independent of $X$, sketch the region, and use what you know about the joint density of $(X, Z)$.]
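A simulation can serve as a sanity check on an answer to this exercise. The sketch below assumes the probability in question is that of the positive quadrant, $P(X > 0, Y > 0)$, and builds $Y$ from $X$ and an independent standard normal $Z$ in the way the hint suggests. The function name `quadrant_probability` and the correlations tried are hypothetical choices for illustration only.

```python
import math
import random


def quadrant_probability(rho, n=100_000, seed=2):
    """Monte Carlo estimate of P(X > 0, Y > 0) for standard bivariate
    normal (X, Y) with correlation rho."""
    rng = random.Random(seed)
    scale = math.sqrt(1 - rho**2)
    hits = 0
    for _ in range(n):
        x = rng.gauss(0, 1)
        z = rng.gauss(0, 1)      # standard normal, independent of x
        y = rho * x + scale * z  # standard normal with corr(x, y) = rho
        if x > 0 and y > 0:
            hits += 1
    return hits / n


# Hypothetical correlations, chosen only to show how the estimate moves.
for rho in (0.0, 0.3, 0.6, 0.9):
    print(rho, quadrant_probability(rho))
```

Comparing these estimates with your formula at the same values of the correlation gives a quick check. The estimates carry Monte Carlo error of roughly 0.002 at this sample size.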
6. Let $X$ and $Y$ be standard bivariate normal with correlation $\rho$. Find $E(\max(X, Y))$. The easiest way is to use the fact that for any two numbers $a$ and $b$, $\max(a, b) = (a + b + |a - b|)/2$. Check the fact first, and then use it.
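The "check the fact" step can also be done numerically before doing it by hand. The sketch below assumes the fact in question is the identity $\max(a, b) = (a + b + |a - b|)/2$ and that the target is the expected maximum of the pair. It tests the identity on random inputs and then estimates the expectation by simulation. The names `max_via_identity` and `simulate_expected_max` and the correlation values are hypothetical.

```python
import math
import random


def max_via_identity(a, b):
    """Maximum of two numbers written as (a + b + |a - b|) / 2."""
    return (a + b + abs(a - b)) / 2


def simulate_expected_max(rho, n=100_000, seed=1):
    """Monte Carlo estimate of E(max(X, Y)) for standard bivariate
    normal (X, Y) with correlation rho."""
    rng = random.Random(seed)
    scale = math.sqrt(1 - rho**2)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0, 1)
        z = rng.gauss(0, 1)      # independent of x
        y = rho * x + scale * z  # corr(x, y) = rho
        total += max_via_identity(x, y)
    return total / n


# Spot-check the identity on random pairs.
rng = random.Random(0)
pairs = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(1000)]
identity_holds = all(
    math.isclose(max_via_identity(a, b), max(a, b), abs_tol=1e-12)
    for a, b in pairs
)
print(identity_holds)

# Hypothetical correlations, for comparison with a derived formula.
for rho in (-0.5, 0.0, 0.5):
    print(rho, simulate_expected_max(rho))
```

A spot check on random pairs is evidence rather than proof. The two cases $a \ge b$ and $a < b$ settle the identity in one line each.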
7. Suppose that $X$ is normal $(\mu_X, \sigma_X^2)$, $Y$ is normal $(\mu_Y, \sigma_Y^2)$, and the two random variables are independent. Let $S = X + Y$.
(a) Find the conditional distribution of $X$ given $S = s$.
(b) Find the least squares predictor of $X$ based on $S$ and provide its mean squared error.
(c) Find the least squares linear predictor of $X$ based on $S$ and provide its mean squared error.