Book description
This practical guide collects techniques and best practices that most data engineering and data science curricula overlook. A common misconception is that great data scientists are simply experts in the "big themes" of the discipline, namely machine learning and programming. But these tools can only take us so far. In practice, it is the smaller tools and skills that separate a great data scientist from a not-so-great one.
Taken as a whole, the lessons in this book make the difference between an average data scientist candidate and a qualified data scientist working in the field. Author Daniel Vaughan has collected, extended, and used these skills to create value and train data scientists from different companies and industries.
With this book, you will:
- Understand how data science creates value
- Deliver compelling narratives to sell your data science project
- Build a business case using unit economics principles
- Create new features for an ML model using storytelling
- Learn how to decompose KPIs
- Perform growth decompositions to find root causes for changes in a metric
Daniel Vaughan is head of data at Clip, the leading paytech company in Mexico. He's the author of Analytical Skills for AI and Data Science (O'Reilly).
Table of contents
- Preface
- I. Data Analytics Techniques
- 1. So What? Creating Value with Data Science
- 2. Metrics Design
- 3. Growth Decompositions: Understanding Tailwinds and Headwinds
- 4. 2×2 Designs
- 5. Building Business Cases
- 6. What’s in a Lift?
- 7. Narratives
- 8. Datavis: Choosing the Right Plot to Deliver a Message
- II. Machine Learning
- 9. Simulation and Bootstrapping
- 10. Linear Regression: Going Back to Basics
- 11. Data Leakage
- 12. Productionizing Models
- 13. Storytelling in Machine Learning
- 14. From Prediction to Decisions
- 15. Incrementality: The Holy Grail of Data Science?
- 16. A/B Tests
- 17. Large Language Models and the Practice of Data Science
- Index
- About the Author
Product information
- Title: Data Science: The Hard Parts
- Author(s): Daniel Vaughan
- Release date: November 2023
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098146474