Preface
Machine learning (ML) is an integral part of our day-to-day lives, whether we’re aware of it or not. Each time you go on sites like YouTube and Amazon.com, you’re interacting with ML, which powers personalized recommendations. This means that the way the products are displayed on the sites is based on what ML algorithms think suits your taste and interests. And that’s not all: there’s ML-based comment moderation to flag spam or toxic comments, review moderation, and more. On sites like YouTube, there are ML-generated captions and translations.
ML is also present in aspects of our lives beyond shopping and entertainment. For example, when you send a money transfer online, ML algorithms are checking to see whether it’s fraudulent. We live in an age of software that is built on a foundation of data and ML algorithms.
All of this software requires specialized talent to design and build, which has created a demand for software skills and has elevated ML careers in recent years. The pay for technology roles has also risen as a result. These are just some of the many factors that make an ML career enticing: building the products and product features that are so integral to our lives. Since ML techniques power AI advancements, this discussion similarly applies to “AI careers.”
Entering the ML field is challenging, however. ML jobs have a reputation for requiring higher academic credentials, with most of the jobs in the 2010s requiring a PhD. Even if the credential requirements on job postings have decreased since the late 2010s, the advice I still commonly see online is to have at least a master’s degree. Even those with ample credentials can struggle to find a role in the data and ML fields. Is the advice given online wrong, or is it too generalized and vague?
I’ve interviewed for numerous ML jobs, and I’ve been successful at entry level, senior level, and the staff+¹ and principal² levels. Throughout the process, I’ve experienced firsthand the same difficulties and frustrations that aspiring candidates encounter during ML interviews. I’ve sent out endless resumes only to get no replies. I’ve failed phone screens, suffered the anxiety of waiting for responses, and even failed an on-site after the company had flown me to San Francisco from Toronto. I’ve applied for data scientist and machine learning engineer (MLE) jobs only to be confused when the interviewers seemed to be looking more for a data engineer or data analyst.
Apart from my experience as an interviewee, I’ve built years of experience as an interviewer. As part of my jobs in the ML field, I’ve reviewed and filtered hundreds of resumes, conducted numerous interviews, and served on many decision-making committees. As part of technical leadership (principal level at two companies), I’ve reviewed job descriptions and interviewed co-ops, interns, and entry-level candidates as well as senior and staff+ hires. I’ve included tips in this book based on mistakes made by job candidates that resulted in my fellow interviewers and me deciding not to pass them on to the next round. “If only the candidate had done this other thing,” we said. “They were quite promising otherwise.” This book will help you avoid some of these obvious mistakes.
The truth is that there are a lot of unspoken criteria for job seekers. For example, having good communication and teamwork skills may not be included in some job descriptions. Expectations such as these aren’t omitted from job descriptions because of malice but because those in the industry see them as minimum requirements. I have more recently seen ML job postings from major companies clearly list “communication skills” at the very top of their lists of requirements in an attempt to improve the clarity of job descriptions.
In addition to these hidden expectations for new and experienced job seekers alike, the interview process can be confusing because it differs so much from role to role and from company to company. Even Randy Au, a writer who’s worked in data at Google for years, said that “things are … different”³ when, out of curiosity, he looked at current data scientist and ML job postings.
Many people wish for a roadmap, a full step-by-step for how to enter the ML field, guaranteed. For example, what are the best university majors and internships? What are the best side projects, and what Python libraries should you learn? I can relate to this—I’ve asked many friends for as much information as possible along each step of my job interview journey. I worried about whether I should send a follow-up email after an interview and looked in multiple online forums to see if I should. Would I annoy the interviewers, or would they be expecting it? Such a small thing caused me a lot of anxiety, and I wished there were just a clear answer instead of “it depends” or “it probably won’t hurt.” This is the book I wish I’d had back then to refer to for all those questions!
Now that I’ve been on the other side as an interviewer, I’ve learned what the hiring side prefers in job candidates in various scenarios. I now have firsthand answers to many questions I had in the past, and more of a roadmap to entering the ML field. Even if there were a guaranteed roadmap, though, it wouldn’t be the one you are imagining. By the time I learned about the ML and data science fields, I had long since chosen my university major and graduated, and I was partway through a master’s degree in economics. I didn’t have any internships during university; instead, I made and played video games and socialized in my spare time. If anything, the roadmap to an ML job is quite flexible, and even if you start a bit later, there is no such thing as being too late.
When I was searching for my first ML job, I didn’t do all the most straightforward things, but I was somehow able to make my way through job interviews as a student who had never done an internship. I probably knew less about the interview process than many people did, but that’s why I’ve been able to write from the perspective of someone who didn’t do all the right things and was still able to thrive in the ML field. Indeed, there are no right things, only the things that are right for your situation.
I won’t tell you things like, “Just major in [SUBJECT] at your university and then get an internship at [COMPANY], and you’ll be set.” I’d need to write a separate book for each different type of person. A one-size-fits-all, prescriptive roadmap will fail when you encounter a point not already on the map. If you learn how to navigate without being glued to a map, you can create your own maps, regardless of the situation.
In this book, I’ll show you how to be a navigator and create your own roadmap, whether you are a non-STEM⁴ major or a STEM major without internship experience, and whether you have no relevant work experience, ML work experience, or non-ML work experience. As long as you stick with it, it’ll be fine if you majored in something that’s not often recommended. It’s OK if you have previous job experience that you don’t think is directly relevant to ML. I’ll walk you through how to enhance and make use of your past experiences as well as how to gain additional relevant experience.
I advocate for flexible and tailored career roadmaps based on your own scenario because in my own career I’ve encountered many scenarios in which there wasn’t one single roadmap:
- Landing an entry-level data scientist (ML) job as an economics master’s degree student at a large, public company⁵
- Landing a job with a more senior role at a startup with about 200 employees when I joined, and about 400 employees at peak
- Landing a job at a new, mid-large public company as a principal data scientist
Depending on the industry, the company size, the ML team size, and the company’s lifecycle stage (e.g., startup), employers had different expectations that I needed to learn about. If I had only followed online advice or advice from people who interviewed at companies that used a different job-interview process, I might have failed (no, I would have failed). Each time, I’ve had to change up the way I prepare and the way I interview in order to succeed. Through all my personal experiences and (literally) hundreds of ML interviews, I’ve found patterns for how to ace ML and data science job interviews and be a successful candidate. With my experiences and the lessons I’ve learned, it’s now possible to write this book to help aspiring job candidates.
Successful job candidates know what each step in the interview process is trying to assess in their scenario. Unfortunately, simply showing up and having the technical skills isn’t always enough. It’s like exams at school—people who look at the syllabus carefully and understand the scope of each exam are more likely to succeed. In this case, you try to reverse engineer a syllabus for each of the jobs you are applying for.
As I gained more and more experience in ML, I also got more and more questions from aspiring job seekers. I’ve taken on many coffee chats (100+ at this point), and to help even more people, I’ve written career guides on my blog, susanshu.com, for years. When the opportunity to help even more people with this book came up, the decision for me was clear.
Why Machine Learning Jobs?
I’ve spoken about how ML is prevalent in our day-to-day lives, whether we know it or not, and whether we like it or not. You may have had some experiences in your own life that caused you to become curious and pick up this book! I’ll also outline my experiences, which may reinforce your motivations or bring even more attractive aspects of the ML field to your attention.
As someone working in tech, I think ML is a great area to develop high-value products that can affect millions of users. I had the chance to work on such a project in my very first job out of school, and I think that I might not have had that responsibility and opportunity so early in my career if I hadn’t been skilled in machine learning.
In my opinion, ML is a fun and fulfilling area. I enjoy learning about new technologies and research, and if you relate to that, you’ll enjoy that facet of working in ML too. There is a flip side to the fast-paced innovations in our field. For example, it can be exhausting to continuously learn about new advancements when trying to focus on family or other important aspects of our lives. Nowadays, even if I’m very focused on other activities such as socializing or writing this book on the weekends, I still take the opportunity to learn without spending too much time. I also take some time during work hours to listen to talks online or read books. This isn’t exclusive to ML, but I’ve heard from many people that the pace of continuous learning for ML is a bit faster than for other tech-related jobs that require learning new frameworks.
Of course, there is also the aspect of pay. On average, ML jobs are well compensated. I’ve been able to provide for myself and even accomplish many financial goals that enhance my life and the lives of my loved ones. This is something I’m very grateful to my ML career for enabling. On another note, I’ve been able to achieve so much because of the ML field and community: I’ve been flown around the world to speak at conferences (so many of them that I’ve had to take a rain check on some for future years). Meeting cool people working at cool places in ML and seeing advancements in the ML and AI space firsthand are all perks of working in this industry.
No matter what your motivation for picking up this book is, I hope that I can successfully share with you the skills and tools for you to succeed at ML job interviews and to overcome roadblocks along the way.
In this book I’ll help you understand the following:
- The various types of ML roles and which ones you’d be most likely to succeed at
- The building blocks of ML interviews
- How to identify your skill gaps and target your interview preparation efficiently
- How to succeed at both technical and behavioral interviews
I’ll be adding commonly asked questions from the online live training I’ve taught at O’Reilly as well. Consider it a coffee chat with me and the various sources I’ve drawn supporting insights from, covering topics such as the following:
- How to succeed as a candidate with a less “typical” educational or career background
- How to greatly increase the chances that your resume will clear the initial screening
- What ML interviews for senior and higher roles look like
- And more.
Who This Book Is For
Before I dive into the chapters, I want to outline the following scenarios that you might find relatable; this is the audience I’ve written this book for:
- You are a recent graduate who is eager to become an ML/AI practitioner in industry.
- You are a software engineer, data analyst, or other tech/data professional who is transitioning into a role that focuses on ML day to day.
- You are a professional with experience in another field who is interested in transitioning into the ML field.
- You are an experienced data scientist or ML practitioner who is returning to the interviewing fray and aiming for a different role or an increased title and responsibility, and you would like a comprehensive refresher of ML material.
You could also benefit from this book if one of the following scenarios describes you:
- You are a manager who wants inspiration for how to conduct ML interviews, or a nontechnical person who wants an overview of the process without wasting too much time on scattered online resources.
- You have a basic knowledge of Python programming and ML theory and are curious to explore whether entering the ML field could be a future career choice.
What This Book Is Not
This book is not a statistics or ML textbook.
This book is not a coding textbook or tutorial book.
While there are sample interview questions, this book is not a question bank. Code snippets will be brief and concise since they become outdated quickly.
Since I can’t cover every concept from scratch, I assume that readers have a rudimentary familiarity with ML (a high-level understanding is enough). But don’t worry, as I will cover the basic definitions as a quick reminder. I also assume the audience has some familiarity with the Python programming language, such as running scripts on Jupyter Notebooks, since Python is popular in ML interviews and on the job. However, I do include a brief section on learning Python from scratch if you happen to not be familiar with it.
In addition, this book provides a substantial library of links to external practice resources to help you with preparing for ML interviews; but first, I’ll help you identify what is most helpful for you to practice and learn beyond your current knowledge and skill level.
Thus, instead of listing a bunch of questions and answers to memorize, with this book I’m aiming to teach you how to fish. As an interviewer, I’ve seen many candidates who didn’t pass the interview and who wouldn’t have been saved by just practicing some more questions. Rather, they didn’t even know what their gaps were. I’ll teach you how to identify your strengths and gaps and how exactly you can use the resources in this book to close those gaps.
Conventions Used in This Book
The following typographical conventions are used in this book:
- Italic: Indicates new terms, URLs, email addresses, filenames, and file extensions.
- Constant width: Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.
- Constant width bold: Shows commands or other text that should be typed literally by the user.
- Constant width italic: Shows text that should be replaced with user-supplied values or by values determined by context.
Tip
This element signifies a tip or suggestion.
Note
This element signifies a general note.
Warning
This element indicates a warning or caution.
O’Reilly Online Learning
Note
For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.
Our unique network of experts and innovators shares its knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit https://oreilly.com.
How to Contact Us
Please address comments and questions concerning this book to the publisher:
- O’Reilly Media, Inc.
- 1005 Gravenstein Highway North
- Sebastopol, CA 95472
- 800-889-8969 (in the United States or Canada)
- 707-829-7019 (international or local)
- 707-829-0104 (fax)
- support@oreilly.com
- https://www.oreilly.com/about/contact.html
We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/ML-interviews.
There is a supplemental website that includes bonus content not found in the book: https://susanshu.substack.com.
For news and information about our books and courses, visit https://oreilly.com.
Find us on LinkedIn: https://linkedin.com/company/oreilly-media
Follow us on Twitter: https://twitter.com/oreillymedia
Watch us on YouTube: https://youtube.com/oreillymedia
Acknowledgments
Many people’s wisdom and encouragement helped make this book become a reality!
A big thanks to the reviewers who read and reviewed this book as a work in progress: Margaret Maynard-Reid, Serena McDonnell, Dominic Monn, and Suhas Pai. All your early feedback and comments have helped make this book so much better. I’d also like to thank folks who reviewed select chapters: Eugene Yan, Prithvishankar Srinivasan, Ammar Asmro, Luis Duque, Igor Ilic, Jeremy C., and Masoud H. I’m super grateful and humbled by how you all graciously took the time to look over the book at varying stages of completion and answer my questions!
Everything I have written in this book has been accumulated throughout my career, so I’d be remiss if I didn’t mention the friends, mentors, and organizations that have greatly affected my career who haven’t already been named: Nick Miles, Denis Osipov, Amir Feizpour, Shannon Elliott, the Python Software Foundation, PyCons around the world, and many more amazing colleagues in the teams I’ve worked with. You rock!
Of course, big thanks to the O’Reilly team: the awesome development editor Sara Hunter, who’s been encouraging me and helping me stay focused through this intense year of writing; production editor Elizabeth Kelly; copy editor Shannon Turlington; and acquisitions editor Nicole Butterfield, who gave me a lot of encouragement and originally reached out about teaching the O’Reilly “Machine Learning Interviews” online training—which eventually kicked off this book!
Thank you to my loved ones who have supported me through thick and thin, no matter what random adventure I’m on. My family, who has always been there for me, and even more so during the writing of this book, is a source of motivation and inspiration. Thank you to mom, dad, my brother, and my grandparents. Thank you, Susan, for being my sun. Thank you to my friends who have been my support system and warmth since university days.
Lastly, I’d like to thank my instructors, peers, and lifelong friends at the University of Waterloo and the University of Toronto for fostering an inspiring, rigorous, yet flexible environment for me to explore my curiosity and interests. This freedom led me to my career in machine learning, and I wouldn’t be here without that environment in which to chase my dreams.
1 Staff+ refers to roles that are above the senior level.
2 The job levels in tech often progress from entry/intermediate level → senior level → staff level → principal level, although there are minor differences depending on the company. For example, some companies combine the staff and principal levels.
3 Randy Au, “Old Dog Revisits the DS Job Market out of Curiosity,” Counting Stuff (blog), December 1, 2022, https://oreil.ly/yzIsx.
4 Science, technology, engineering, and mathematics.
5 A public company means it has publicly traded stocks.