3 Exploring data
This chapter covers
- Using summary statistics to explore data
- Exploring data using visualization
- Finding problems and issues during data exploration
In the last two chapters, you learned how to set the scope and goal of a data science project, and how to start working with your data in R. In this chapter, you’ll start to get your hands into the data. As shown in the mental model (figure 3.1), this chapter emphasizes the science of exploring the data, prior to the model-building step. Your goal is to have data that is as clean and useful as possible.
Example: Suppose your goal is to build a model to predict which of your customers don’t have health insurance. You’ve collected a dataset of customers whose health insurance status ...