Murray R. Spiegel, Larry J. Stephens
Chapter 14
Correlation Theory
Chapter Questions
Table 14.1 shows in inches (in) the respective heights $X$ and $Y$ of a sample of 12 fathers and their oldest sons.
(a) Construct a scatter diagram.
(b) Find the least-squares regression line of $Y$ on $X$.
(c) Find the least-squares regression line of $X$ on $Y$.
Work Problem 14.1 using Minitab. Construct tables giving the fitted values, $Y_{\text{est}}$, and the residuals. Find the sum of squares for the residuals for both regression lines.
If the regression line of $Y$ on $X$ is given by $Y=a_0+a_1 X$, prove that the standard error of estimate $s_{Y.X}$ is given by
$$
s_{Y.X}^2=\frac{\sum Y^2-a_0 \sum Y-a_1 \sum X Y}{N}
$$
If $x=X-\bar{X}$ and $y=Y-\bar{Y}$, show that the result of Problem 14.3 can be written
$$
s_{Y.X}^2=\frac{\sum y^2-a_1 \sum x y}{N}
$$
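As a numerical sanity check (not a proof), the sketch below compares the definition of the squared standard error of estimate with the raw-sum form of Problem 14.3 and the deviation form of Problem 14.4. The data are hypothetical and are not taken from Table 14.1; the function names are illustrative only.

```python
def fit_line(xs, ys):
    """Return (a0, a1) of the least-squares line Y = a0 + a1*X."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a0 = (sy - a1 * sx) / n
    return a0, a1


def se2_by_definition(xs, ys):
    """Mean squared residual about the fitted line (divisor N)."""
    a0, a1 = fit_line(xs, ys)
    return sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def se2_raw_sums(xs, ys):
    """Problem 14.3 form: (sum Y^2 - a0 sum Y - a1 sum XY) / N."""
    a0, a1 = fit_line(xs, ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (sum(y * y for y in ys) - a0 * sum(ys) - a1 * sxy) / len(xs)


def se2_deviations(xs, ys):
    """Problem 14.4 form: (sum y^2 - a1 sum xy) / N, with x, y as deviations."""
    _, a1 = fit_line(xs, ys)
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    syy = sum((y - ybar) ** 2 for y in ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    return (syy - a1 * sxy) / n


# Hypothetical data, NOT the values of Table 14.1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8]
print(se2_by_definition(xs, ys), se2_raw_sums(xs, ys), se2_deviations(xs, ys))
```

The three printed values should agree up to floating-point rounding.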
Compute the standard error of estimate, $s_{Y.X}$, for the data of Problem 14.1 by using (a) the definition and (b) the result of Problem 14.4.
(a) Construct two lines which are parallel to the regression line of Problem 14.1 and which have a vertical distance $s_{Y.X}$ from it.
(b) Determine the percentage of data points falling between these two lines.
Prove that $\sum(Y-\bar{Y})^2=\sum\left(Y-Y_{\text{est}}\right)^2+\sum\left(Y_{\text{est}}-\bar{Y}\right)^2$.
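The identity can be checked numerically before attempting the proof. The following sketch fits a least-squares line to hypothetical data (not Table 14.1) and computes the total, unexplained, and explained variation; the helper name `variations` is an assumption made for illustration.

```python
def variations(xs, ys):
    """Return (total, unexplained, explained) variation for the
    least-squares line of Y on X."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    a1 = sxy / sxx
    a0 = ybar - a1 * xbar
    y_est = [a0 + a1 * x for x in xs]
    total = sum((y - ybar) ** 2 for y in ys)
    unexplained = sum((y - ye) ** 2 for y, ye in zip(ys, y_est))
    explained = sum((ye - ybar) ** 2 for ye in y_est)
    return total, unexplained, explained


# Hypothetical data, NOT the values of Table 14.1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 6.8]
total, unexplained, explained = variations(xs, ys)
print(total, unexplained + explained)
```

The split holds exactly only when $Y_{\text{est}}$ comes from the least-squares fit; for an arbitrary line a cross term remains.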
Compute (a) the total variation, (b) the unexplained variation, and (c) the explained variation for the data in Problem 14.1.
Use the results of Problem 14.8 to find (a) the coefficient of determination and (b) the coefficient of correlation.
Prove that for linear regression the coefficient of correlation between the variables $X$ and $Y$ can be written
$$
r=\frac{\sum x y}{\sqrt{\left(\sum x^2\right)\left(\sum y^2\right)}}
$$
where $x=X-\bar{X}$ and $y=Y-\bar{Y}$.
Find the coefficient of linear correlation between the variables $X$ and $Y$ presented in Table 14.7.
For the data of Problem 14.11, find (a) the standard deviation of $X$, (b) the standard deviation of $Y$, (c) the variance of $X$, (d) the variance of $Y$, and (e) the covariance of $X$ and $Y$. Compare these values with the Minitab output and explain the difference in the values.
For the data of Problem 14.11, verify the formula
$$
r=\frac{s_{X Y}}{s_X s_Y}
$$
By using the product-moment formula, obtain the linear correlation coefficient for the data of Problem 14.1.
Show that the linear correlation coefficient is given by
$$
r=\frac{N \sum X Y-\left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[N \sum X^2-\left(\sum X\right)^2\right]\left[N \sum Y^2-\left(\sum Y\right)^2\right]}}
$$
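A short sketch can confirm that this raw-sum formula agrees with the deviation (product-moment) form $r=\sum xy/\sqrt{(\sum x^2)(\sum y^2)}$. The data are hypothetical, and both function names are illustrative.

```python
import math


def r_raw_sums(xs, ys):
    """Correlation coefficient from raw sums (formula of Problem 14.15)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx * sx) * (n * syy - sy * sy)
    )


def r_product_moment(xs, ys):
    """Correlation coefficient from deviations about the means."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, NOT the values of Table 14.1.
xs = [1.0, 3.0, 4.0, 6.0, 8.0, 9.0]
ys = [2.0, 3.0, 5.0, 4.0, 7.0, 9.0]
print(r_raw_sums(xs, ys), r_product_moment(xs, ys))
```

The raw-sum form avoids computing the means first, which is convenient for hand calculation, while the deviation form is usually less prone to cancellation error on large values.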
Use the formula of Problem 14.15 to obtain the linear correlation coefficient for the data of Problem 14.1.
Table 14.9 shows the frequency distributions of the final grades of 100 students in mathematics and physics. Referring to this table, determine:
(a) The number of students who received grades of $70-79$ in mathematics and $80-89$ in physics.
(b) The percentage of students with mathematics grades below 70.
(c) The number of students who received a grade of 70 or more in physics and of less than 80 in mathematics.
(d) The percentage of students who passed at least one of the subjects; assume that the minimum passing grade is 60.
Show how to modify the formula of Problem 14.15 for the case of data grouped as in the bivariate frequency table (Table 14.9).
Find the coefficient of linear correlation of the mathematics and physics grades of Problem 14.17.
Use Table 14.12 to compute (a) $s_X$, (b) $s_Y$, and (c) $s_{X Y}$ and thus to verify the formula $r=s_{X Y} /\left(s_X s_Y\right)$.
Prove that the regression lines of $Y$ on $X$ and of $X$ on $Y$ have equations given, respectively, by (a) $Y-\bar{Y}=\left(r s_Y / s_X\right)(X-\bar{X})$ and (b) $X-\bar{X}=\left(r s_X / s_Y\right)(Y-\bar{Y})$.
If the regression lines of $Y$ on $X$ and of $X$ on $Y$ are given, respectively, by $Y=a_0+a_1 X$ and $X=b_0+b_1 Y$, prove that $a_1 b_1=r^2$.
Use the result of Problem 14.22 to find the linear correlation coefficient for the data of Problem 14.1.
For the data of Problem 14.19, write the equations of the regression lines of (a) $Y$ on $X$ and (b) $X$ on $Y$.
For the data of Problem 14.19, compute the standard errors of estimate (a) $s_{Y.X}$ and (b) $s_{X.Y}$. Use the results of Problem 14.20.
Table 14.13 shows the U.S. consumer price indexes for food and medical care costs during the years 1990-1996 compared with prices in the base years, 1982-84 (mean taken as 100). Compute the correlation coefficient between the two indexes and give the Minitab computation of the coefficient.
Fit a least-squares parabola of the form $Y=a_0+a_1 X+a_2 X^2$ to the set of data in Table 14.15.
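One way to carry out such a fit is to solve the three normal equations for $a_0$, $a_1$, $a_2$. The sketch below does this in plain Python with Gaussian elimination. Because Table 14.15 is not reproduced here, it is exercised on hypothetical points that lie exactly on $Y=1+2X+3X^2$; the function names are illustrative.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_parabola(xs, ys):
    """Return [a0, a1, a2] minimizing sum (Y - a0 - a1 X - a2 X^2)^2."""
    s = [sum(x ** k for x in xs) for k in range(5)]  # s[k] = sum X^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    normal = [
        [s[0], s[1], s[2]],
        [s[1], s[2], s[3]],
        [s[2], s[3], s[4]],
    ]
    return solve_linear(normal, t)


# Hypothetical points on Y = 1 + 2X + 3X^2, NOT the data of Table 14.15.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]
a0, a1, a2 = fit_parabola(xs, ys)
print(a0, a1, a2)
```

For widely spread $X$ values the normal equations can be poorly conditioned, so a library least-squares routine may be preferable in practice.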
Use the least-squares parabola of Problem 14.27 to estimate the values of $Y$ from the given values of $X$.
(a) Find the linear correlation coefficient between the variables $X$ and $Y$ of Problem 14.27.
(b) Find the nonlinear correlation coefficient between these variables, assuming the parabolic relationship obtained in Problem 14.27.
(c) Explain the difference between the correlation coefficients obtained in parts (a) and (b).
(d) What percentage of the total variation remains unexplained by assuming a parabolic relationship between $X$ and $Y$?
Find (a) $s_Y$ and (b) $s_{Y.X}$ for the data of Problem 14.27.
A correlation coefficient based on a sample of size 18 was computed to be 0.32. Can we conclude at significance levels of (a) 0.05 and (b) 0.01 that the corresponding population correlation coefficient differs from zero?
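A common approach to this kind of question is the statistic $t=r\sqrt{N-2}/\sqrt{1-r^2}$ with $N-2$ degrees of freedom. The sketch below assumes that test and a one-tailed alternative $\rho>0$; the critical values are standard $t$-table entries for 16 degrees of freedom, hard-coded here rather than computed.

```python
import math


def t_for_correlation(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Standard t-table values for 16 degrees of freedom (one-tailed);
# an assumption of this sketch, not derived in code.
T_95_DF16 = 1.746
T_99_DF16 = 2.583

t = t_for_correlation(0.32, 18)
reject_at_05 = t > T_95_DF16
reject_at_01 = t > T_99_DF16
print(round(t, 3), reject_at_05, reject_at_01)
```

With a two-tailed alternative the critical values are larger (2.120 and 2.921 for 16 degrees of freedom), so the decision at either level would not change.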
What is the minimum sample size necessary in order that we may conclude that a correlation coefficient of 0.32 differs significantly from zero at the 0.05 level?
A correlation coefficient on a sample of size 24 was computed to be $r=0.75$. At the 0.05 significance level, can we reject the hypothesis that the population correlation coefficient is as small as (a) $\rho=0.60$ and (b) $\rho=0.50$?
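For a nonzero hypothesized $\rho$, one standard tool is Fisher's $Z$ transformation, $Z=\tfrac12\ln\frac{1+r}{1-r}$, which is approximately normal with standard deviation $1/\sqrt{N-3}$. The sketch below assumes that approximation and a one-tailed test with critical value 1.645; the function name is illustrative.

```python
import math


def fisher_z_score(r, rho0, n):
    """Standardized difference between Fisher Z of r and of rho0."""
    return (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)


Z_CRIT_ONE_TAILED_05 = 1.645  # standard normal table value (assumed test)

z_a = fisher_z_score(0.75, 0.60, 24)
z_b = fisher_z_score(0.75, 0.50, 24)
print(round(z_a, 3), z_a > Z_CRIT_ONE_TAILED_05)
print(round(z_b, 3), z_b > Z_CRIT_ONE_TAILED_05)
```

`math.atanh(r)` is exactly the Fisher transformation, so no separate logarithm formula is needed.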
The correlation coefficient between the final grades in physics and mathematics for a group of 21 students was computed to be 0.80. Find the $95 \%$ confidence limits for this coefficient.
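Confidence limits for a correlation coefficient are often obtained by building the interval on the Fisher $Z$ scale and transforming back. The sketch below assumes that method with the normal value 1.96 for 95% confidence; `correlation_limits` is a hypothetical helper name.

```python
import math


def correlation_limits(r, n, z_crit=1.96):
    """Approximate confidence limits for rho via Fisher's Z transformation."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


lower, upper = correlation_limits(0.80, 21)
print(round(lower, 3), round(upper, 3))
```

The interval is not symmetric about $r$, which reflects the skewness of the sampling distribution of $r$ when $\rho$ is far from zero.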
Two correlation coefficients obtained from samples of size $N_1=28$ and $N_2=35$ were computed to be $r_1=0.50$ and $r_2=0.30$, respectively. Is there a significant difference between the two coefficients at the 0.05 level?
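Two sample correlations can be compared on the Fisher $Z$ scale, where the difference $Z_1-Z_2$ has approximate standard deviation $\sqrt{1/(N_1-3)+1/(N_2-3)}$. The sketch below assumes that approximation and a two-tailed test with critical value 1.96; the function name is illustrative.

```python
import math


def z_for_difference(r1, n1, r2, n2):
    """Standardized difference of two Fisher-transformed correlations."""
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (math.atanh(r1) - math.atanh(r2)) / se


z = z_for_difference(0.50, 28, 0.30, 35)
significant_at_05 = abs(z) > 1.96  # two-tailed normal critical value
print(round(z, 3), significant_at_05)
```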
In Problem 14.1 we found the regression equation of $Y$ on $X$ to be $Y=35.82+0.476 X$. Test the null hypothesis at the 0.05 significance level that the regression coefficient of the population regression equation is 0.180 versus the alternative hypothesis that the regression coefficient exceeds 0.180. Perform the test without the aid of computer software as well as with the aid of Minitab computer software.
Find the $95 \%$ confidence limits for the regression coefficient of Problem 14.36. Set the confidence interval without the aid of any computer software as well as with the aid of Minitab computer software.
In Problem 14.1, find the $95 \%$ confidence limits for the heights of sons whose fathers' heights are (a) 65.0 and (b) 70.0 inches. Set the confidence interval without the aid of any computer software as well as with the aid of Minitab computer software.
In Problem 14.1, find the $95 \%$ confidence limits for the mean heights of sons whose fathers' heights are (a) 65.0 inches and (b) 70.0 inches. Set the confidence interval without the aid of any computer software as well as with the aid of Minitab computer software.
Table 14.18 shows the first two grades (denoted by $X$ and $Y$, respectively) of 10 students on two short quizzes in biology.
(a) Construct a scatter diagram.
(b) Find the least-squares regression line of $Y$ on $X$.
(c) Find the least-squares regression line of $X$ on $Y$.
(d) Graph the two regression lines of parts (b) and (c) on the scatter diagram of part (a).
Find (a) $s_{Y.X}$ and (b) $s_{X.Y}$ for the data in Table 14.18.
Compute (a) the total variation in $Y$, (b) the unexplained variation in $Y$, and (c) the explained variation in $Y$ for the data of Problem 14.40.
Use the results of Problem 14.42 to find the correlation coefficient between the two sets of quiz grades of Problem 14.40.
(a) Find the correlation coefficient between the two sets of quiz grades in Problem 14.40 by using the product-moment formula, and compare this finding with the result of Problem 14.43.
(b) Obtain the correlation coefficient directly from the slopes of the regression lines of Problem 14.40, parts (b) and (c).
Find the covariance for the data of Problem 14.40 (a) directly and (b) by using the formula $s_{X Y}=r s_X s_Y$ and the result of Problem 14.43 or Problem 14.44.
Table 14.19 shows the ages $X$ and the systolic blood pressures $Y$ of 12 women.
(a) Find the correlation coefficient between $X$ and $Y$.
(b) Determine the least-squares regression equation of $Y$ on $X$.
(c) Estimate the blood pressure of a woman whose age is 45 years.
Find the correlation coefficients for the data of (a) Problem 13.32 and (b) Problem 13.35.
The correlation coefficient between two variables $X$ and $Y$ is $r=0.60$. If $s_X=1.50$, $s_Y=2.00$, $\bar{X}=10$, and $\bar{Y}=20$, find the equations of the regression lines of (a) $Y$ on $X$ and (b) $X$ on $Y$.
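Given only summary statistics, both lines follow from the forms $Y-\bar{Y}=(r s_Y/s_X)(X-\bar{X})$ and $X-\bar{X}=(r s_X/s_Y)(Y-\bar{Y})$. The sketch below evaluates them for the numbers stated in this problem; `regression_lines` is a hypothetical helper name.

```python
def regression_lines(r, s_x, s_y, x_bar, y_bar):
    """Return ((a0, a1), (b0, b1)) for Y = a0 + a1*X and X = b0 + b1*Y."""
    a1 = r * s_y / s_x          # slope of Y on X
    a0 = y_bar - a1 * x_bar     # the line passes through (x_bar, y_bar)
    b1 = r * s_x / s_y          # slope of X on Y
    b0 = x_bar - b1 * y_bar
    return (a0, a1), (b0, b1)


(a0, a1), (b0, b1) = regression_lines(0.60, 1.50, 2.00, 10.0, 20.0)
print(f"Y on X: Y = {a0:.2f} + {a1:.2f} X")
print(f"X on Y: X = {b0:.2f} + {b1:.2f} Y")
print(f"a1 * b1 = {a1 * b1:.2f}")
```

The product of the two slopes reproduces $r^2$, which is the relation stated in Problem 14.22.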
Compute (a) $s_{Y.X}$ and (b) $s_{X.Y}$ for the data of Problem 14.48.
If $s_{Y.X}=3$ and $s_Y=5$, find $r$.
If the correlation coefficient between $X$ and $Y$ is 0.50, what percentage of the total variation remains unexplained by the regression equation?
(a) Prove that the equation of the regression line of $Y$ on $X$ can be written
$$
Y-\bar{Y}=\frac{s_{X Y}}{s_X^2}(X-\bar{X})
$$
(b) Write the analogous equation for the regression line of $X$ on $Y$.
(a) Compute the correlation coefficient between the corresponding values of $X$ and $Y$ given in Table 14.20.
(b) Multiply each $X$ value in the table by 2 and add 6. Multiply each $Y$ value in the table by 3 and subtract 15. Find the correlation coefficient between the two new sets of values, explaining why you do or do not obtain the same result as in part (a).
(a) Find the regression equations of $Y$ on $X$ for the data considered in Problem 14.53, parts (a) and (b).
(b) Discuss the relationship between these regression equations.
(a) Prove that the correlation coefficient between $X$ and $Y$ can be written
$$
r=\frac{\overline{X Y}-\bar{X} \bar{Y}}{\sqrt{\left[\overline{X^2}-\bar{X}^2\right]\left[\overline{Y^2}-\bar{Y}^2\right]}}
$$
(b) Using this method, work Problem 14.1.
Prove that a correlation coefficient is independent of the choice of origin of the variables or the units in which they are expressed. (Hint: Assume that $X^{\prime}=c_1 X+A$ and $Y^{\prime}=c_2 Y+B$, where $c_1, c_2, A$, and $B$ are any constants, and prove that the correlation coefficient between $X^{\prime}$ and $Y^{\prime}$ is the same as that between $X$ and $Y$.)
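A quick numerical experiment can make the claim concrete before proving it. The sketch below uses hypothetical data and the constants from Problem 14.53(b), $X'=2X+6$ and $Y'=3Y-15$; it also tries a negative $c_1$, in which case $r$ keeps its magnitude but changes sign, so the invariance is exact when $c_1 c_2>0$.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, NOT the values of Table 14.20.
xs = [1.0, 3.0, 4.0, 6.0, 8.0, 9.0]
ys = [2.0, 3.0, 5.0, 4.0, 7.0, 9.0]

r_original = pearson_r(xs, ys)
r_transformed = pearson_r([2 * x + 6 for x in xs], [3 * y - 15 for y in ys])
r_flipped = pearson_r([-2 * x + 6 for x in xs], [3 * y - 15 for y in ys])
print(r_original, r_transformed, r_flipped)
```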
(a) Prove that, for linear regression,
$$
\frac{s_{Y.X}^2}{s_Y^2}=\frac{s_{X.Y}^2}{s_X^2}
$$
(b) Does the result hold for nonlinear regression?
Find the correlation coefficient between the heights and weights of the 300 U.S. adult males given in Table 14.21, a frequency table.
(a) Find the least-squares regression equation of $Y$ on $X$ for the data of Problem 14.58.
(b) Estimate the weights of two men whose heights are 64 and 72 in, respectively.
Find (a) $s_{Y.X}$ and (b) $s_{X.Y}$ for the data of Problem 14.58.
Establish formula (21) of this chapter for the correlation coefficient of grouped data.
Table 14.22 shows the average annual expenditures per consumer unit for health care and the per capita income for the years 1988 through 1995. Find the correlation coefficient.
Table 14.23 shows the average temperature and precipitation in a city for the month of July during the years 1989-1998. Find the correlation coefficient.
A correlation coefficient based on a sample of size 27 was computed to be 0.40. Can we conclude at significance levels of (a) 0.05 and (b) 0.01 that the corresponding population correlation coefficient differs from zero?
A correlation coefficient based on a sample of size 35 was computed to be 0.50. At the 0.05 significance level, can we reject the hypothesis that the population correlation coefficient is (a) as small as $\rho=0.30$ and (b) as large as $\rho=0.70$?
Find the (a) $95 \%$ and (b) $99 \%$ confidence limits for a correlation coefficient that is computed to be 0.60 from a sample of size 28.
Work Problem 14.66 if the sample size is 52.
Find the $95 \%$ confidence limits for the correlation coefficients computed in (a) Problem 14.46 and (b) Problem 14.58.
Two correlation coefficients obtained from samples of size 23 and 28 were computed to be 0.80 and 0.95, respectively. Can we conclude at levels of (a) 0.05 and (b) 0.01 that there is a significant difference between the two coefficients?
On the basis of a sample of size 27, a regression equation of $Y$ on $X$ was found to be $Y=25.0+2.00 X$. If $s_{Y.X}=1.50$, $s_X=3.00$, and $\bar{X}=7.50$, find the (a) $95 \%$ and (b) $99 \%$ confidence limits for the regression coefficient.
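One way to set up these limits is the small-sample interval $a_1 \pm t_c\, s_{Y.X}/(s_X\sqrt{N-2})$ with $N-2$ degrees of freedom, where both standard deviations use the divisor $N$. The sketch below assumes that formula and reads the given value 1.50 as the standard error of estimate $s_{Y.X}$; the critical values are standard $t$-table entries for 25 degrees of freedom, hard-coded rather than computed.

```python
import math


def slope_limits(a1, s_yx, s_x, n, t_crit):
    """Confidence limits for the population regression coefficient,
    assuming the interval a1 +/- t_crit * s_yx / (s_x * sqrt(n - 2))."""
    half_width = t_crit * s_yx / (s_x * math.sqrt(n - 2))
    return a1 - half_width, a1 + half_width


# Standard two-tailed t-table values for 25 degrees of freedom
# (an assumption of this sketch, not derived in code).
T_975_DF25 = 2.060
T_995_DF25 = 2.787

lim95 = slope_limits(2.00, 1.50, 3.00, 27, T_975_DF25)
lim99 = slope_limits(2.00, 1.50, 3.00, 27, T_995_DF25)
print(lim95, lim99)
```

Note that $\bar{X}$ does not enter the interval for the slope; it matters only for limits on predicted or mean values of $Y$ at a given $X$.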
In Problem 14.70, test the hypothesis that the population regression coefficient at the 0.01 significance level is (a) as low as 1.70 and (b) as high as 2.20.
In Problem 14.70, find the (a) $95 \%$ and (b) $99 \%$ confidence limits for $Y$ when $X=6.00$.
In Problem 14.70, find the (a) $95 \%$ and (b) $99 \%$ confidence limits for the mean of all values of $Y$ corresponding to $X=6.00$.
Referring to Problem 14.46, find the $95 \%$ confidence limits for (a) the regression coefficient of $Y$ on $X$, (b) the blood pressures of all women who are 45 years old, and (c) the mean of the blood pressures of all women who are 45 years old.