CHAPTER 8: Modeling Relationships Between Two Variables

Many analytics applications require determining how two variables are related. This chapter is a primer on how analysts can estimate that relationship.

Examples of Relationships Between Two Variables

Often, we want to predict a dependent variable (call it Y) from an independent variable (call it X). Table 8.1 lists some examples of business relationships you might want to estimate.

Table 8.1: Examples of relationships between two variables

X (INDEPENDENT VARIABLE)             | Y (DEPENDENT VARIABLE)
Units produced by a plant in a month | Monthly cost of operating plant
Monthly dollars spent on advertising | Monthly sales
Number of employees                  | Monthly travel expenses
Company annual revenue               | Number of employees
Monthly return on the stock market   | Monthly return on a mutual fund or stock
Square feet in home                  | Home price
Price of product                     | Units sold of product

Finding the Best-Fitting (Least Squares) Line

The first step in determining how two variables are related is to create a scatterplot in which each data point is plotted with its value of X on the horizontal axis and its value of Y on the vertical axis. If the graph indicates that a straight line is a reasonable fit to the data, you can use the Excel Trendline feature (or Excel functions) to find the straight line that best fits the points. The “Excel Calculations” section at the end of the chapter describes how to find the straight ...
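
For readers who want to see the arithmetic behind a best-fitting line outside of Excel, the following is a minimal Python sketch of the standard least squares formulas for a slope and an intercept. It is not the chapter's Excel procedure. The function name `least_squares_line` and the plant data are hypothetical, made up only to illustrate the first row of Table 8.1.

```python
# Illustrative sketch only: fits y = intercept + slope * x by least squares.
# The function name and the data are hypothetical, not taken from the book.

def least_squares_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared vertical errors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of X and Y, and sum of squared deviations of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: units produced in a month (X), monthly operating cost (Y).
units = [100, 200, 300, 400, 500]
cost = [2500, 4100, 5400, 7100, 8400]

slope, intercept = least_squares_line(units, cost)
print(f"cost = {intercept:.1f} + {slope:.2f} * units")
```

With these made-up numbers, the slope can be read as the estimated extra cost per additional unit produced, and the intercept as the estimated fixed monthly cost. As in the text, it makes sense to check the scatterplot first, because the formulas return a line even when a straight line fits the data poorly.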
