6.4 Summarizing Association with a Line

Correlation measures the strength of linear association between two variables. The larger |r| becomes, the more closely the data cluster along a line. We can use r to find the equation of this line. Once we have this equation, it is easy to predict the response from the explanatory variable.

The simplest expression for this equation uses z-scores. A z-score (Chapter 4) is a deviation from the mean divided by the standard deviation. The correlation converts a z-score of one variable (say, heating degree days) into a z-score of the other (gas use). If we know that a home is, for instance, located in a climate 1 SD above the mean of HDD, then we expect to find its use of natural gas r SDs above the mean of ...
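The z-score rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `hdd` and `gas` values are invented for the example and are not data from the text, and the helper names `z_scores`, `correlation`, and `predict` are hypothetical. The sketch uses the sample standard deviation (dividing by n - 1).

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration only (not from the text).
hdd = [2000.0, 3500.0, 5000.0, 6500.0, 8000.0]  # heating degree days
gas = [40.0, 70.0, 85.0, 120.0, 135.0]          # gas use, arbitrary units


def z_scores(values):
    """Deviation from the mean divided by the (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def correlation(x, y):
    """Correlation r as the sum of z-score products divided by n - 1."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


def predict(x_new, x, y):
    """Predict y from x_new: the predicted z-score of y is r times the z-score of x."""
    r = correlation(x, y)
    z_x = (x_new - mean(x)) / stdev(x)   # z-score of the explanatory value
    z_y_hat = r * z_x                    # predicted z-score of the response
    return mean(y) + z_y_hat * stdev(y)  # convert back to the units of y


if __name__ == "__main__":
    r = correlation(hdd, gas)
    one_sd_above = mean(hdd) + stdev(hdd)
    print("r =", round(r, 3))
    print("predicted gas use at 1 SD above mean HDD:",
          round(predict(one_sd_above, hdd, gas), 1))
```

With these helpers, a home whose HDD is exactly 1 SD above the mean gets a predicted gas use that is r SDs above the mean of gas use, and a home at the mean HDD is predicted to have mean gas use.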
