Chapter 2. Regression Models

You learned in Chapter 1 that supervised learning models come in two varieties: regression models and classification models. You also learned that regression models predict numeric outcomes, such as the price that a home will sell for or the number of visitors a website will attract. Regression modeling is a vital and sometimes underappreciated aspect of machine learning. Retailers use it to forecast demand. Banks use it to screen loan applications, factoring in variables such as credit scores, debt-to-income ratios, and loan-to-value ratios. Insurance companies use it to set premiums. Whenever you need numerical predictions, regression modeling is the right tool for the job.
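
To make that concrete, here is a minimal sketch, assuming scikit-learn and invented home-price figures, of a regression model that takes a home's size as input and produces a numeric prediction as output:

```python
# A toy regression model: predict a home's sale price from its size.
# The figures below are invented for illustration.
from sklearn.linear_model import LinearRegression

square_feet = [[1100], [1500], [1800], [2400], [3000]]        # one feature per home
sale_prices = [180_000, 240_000, 290_000, 370_000, 450_000]   # numeric targets

model = LinearRegression()
model.fit(square_feet, sale_prices)

# The model's output is a number, not a class label
predicted = model.predict([[2000]])[0]
print(f"Predicted sale price: ${predicted:,.0f}")
```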

When building a regression model, the first and most important decision you make is which learning algorithm to use. Chapter 1 presented a simple three-class classification model that used the k-nearest neighbors learning algorithm to identify a species of iris given the flower’s sepal and petal measurements. k-nearest neighbors can be used for regression too, but it is only one of many algorithms available for making numerical predictions. Other learning algorithms frequently produce more accurate models.
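
As an illustration of both points, the sketch below, which assumes scikit-learn and a synthetic dataset, fits a k-nearest neighbors regressor alongside ordinary linear regression and scores each with cross-validation:

```python
# A minimal comparison, assuming scikit-learn and synthetic data: k-nearest
# neighbors producing numeric predictions next to ordinary linear regression.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Synthetic dataset: 200 samples, 4 features, some noise
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=0)

# Score each algorithm with 5-fold cross-validation (R^2 is the default for regressors)
for model in (KNeighborsRegressor(n_neighbors=5), LinearRegression()):
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{type(model).__name__}: mean R^2 = {scores.mean():.3f}")
```

Because make_regression generates data with an underlying linear relationship, the linear model will typically score higher here; on real-world data such as taxi fares, the ranking can easily flip, which is why it pays to try several algorithms.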

This chapter introduces common regression algorithms, many of which can also be used for classification, and guides you through the process of building a regression model that predicts taxi fares from data published by the New York City Taxi and Limousine Commission. It also describes ...
