Chapter 8. Considerations When Deploying Models
The previous chapters covered model training and generalization performance. These are necessary steps to deploy a model, but they are not sufficient to guarantee the success of an ML-powered product.
Deploying a model requires a deeper dive into failure modes that could impact users. When building products that learn from data, here are a few questions you should answer:
- How was the data you are using collected?
- What assumptions is your model making by learning from this dataset?
- Is this dataset representative enough to produce a useful model?
- How could the results of your work be misused?
- What is the intended use and scope of your model?
The field of data ethics aims to answer some of these questions, and the methods used are constantly evolving. If you’d like to dive deeper, O’Reilly has a comprehensive report on the subject, Ethics and Data Science, by Mike Loukides et al.
In this chapter, we will discuss some concerns around data collection and usage, as well as the challenges involved in making sure models keep working well for everyone. We will conclude the chapter with a practical interview covering tips for translating model predictions into user feedback.
Let’s start by looking at data, first covering ownership concerns and then moving on to bias.
Data Concerns
In this section, we will outline tips to keep in mind when you store, use, and generate data. We will start by covering data ownership and the responsibilities ...