Chapter 2. Data Is the First Step
This chapter gives an overview of the use cases and datasets in the book and points you to data sources for further study and practice. You’ll also learn about data types and the difference between batch and streaming data. You’ll get hands-on practice with data preprocessing using Google’s free, browser-based notebook environment, which is built on the open source Jupyter Notebook. The chapter concludes with a section on using GitHub to create a data repository for the projects in the book.
Overview of Use Cases and Datasets Used in the Book
We hope you picked up this book to learn ML through a project-based approach rather than a math-first or algorithm-first one. The use cases we’ve chosen are designed to teach you ML using real-world data from several sectors: healthcare, retail, energy, telecommunications, and finance. The use case on customer churn can be applied to any sector. Each use case project can stand on its own if you have some data preprocessing experience, so feel free to skip ahead to whatever you need to learn. Table 2-1 shows each section, its use case, sector, and whether it is no-code or low-code.
Section | Use case | Sector | Type |
---|---|---|---|
1 | Product pricing | Retail | N/A |
2 | Heart disease | Healthcare | Low-code data preprocessing |
3 | Marketing campaign | Energy | No-code (AutoML) |
4 | Advertising media ... |