Chapter 2. End-to-End Machine Learning Project
In this chapter you will work through an example project end to end, pretending to be a recently hired data scientist at a real estate company. This example is fictitious; the goal is to illustrate the main steps of a machine learning project, not to learn anything about the real estate business. Here are the main steps we will walk through:
- Look at the big picture.
- Get the data.
- Explore and visualize the data to gain insights.
- Prepare the data for machine learning algorithms.
- Select a model and train it.
- Fine-tune your model.
- Present your solution.
- Launch, monitor, and maintain your system.
Working with Real Data
When you are learning about machine learning, it is best to experiment with real-world data, not artificial datasets. Fortunately, there are thousands of open datasets to choose from, ranging across all sorts of domains. Here are a few places you can look to get data:
- Popular open data repositories:
- Meta portals (they list open data repositories):
- Other pages listing many popular open data repositories:
In this chapter we’ll use the California Housing Prices dataset from the StatLib repository¹ (see Figure 2-1). This dataset is based on ...