Chapter 17. Model Deploy Service
So far in the journey of deploying insights in production, we have optimized the processing queries and orchestrated the job pipelines. Now we are ready to deploy ML models in production and to update them periodically as they are retrained.
Several pain points slow down time to deploy. The first is nonstandardized, homegrown scripts for deploying models, which need to support a range of ML model types, ML libraries and tools, model formats, and deployment endpoints (such as Internet of Things [IoT] devices, mobile, browser, and web API). The second pain point is that, once models are deployed, there are no standardized frameworks for monitoring their performance. Given multitenant environments for model hosting, monitoring enables automatic scaling of the models and performance isolation from other models. The third pain point is ensuring the prediction accuracy of the models as data distributions drift over time. Time to deploy impacts the overall time to insight both during the initial model deployment and on an ongoing basis during monitoring and upgrading. Data users depend on data engineering to manage deployment, which slows down the overall time to insight even further.
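To make the third pain point concrete, the following is a minimal sketch of one common drift check: a two-sample Kolmogorov-Smirnov test comparing a feature's training-time baseline against a recent serving-time sample. The use of scipy, the synthetic data, and the 0.05 threshold are illustrative assumptions, not a prescribed implementation.

```python
# Minimal drift-check sketch: compare the serving-time distribution of a
# numeric feature against its training-time baseline. The KS test and the
# 0.05 significance threshold are illustrative choices.
import numpy as np
from scipy.stats import ks_2samp

def feature_drifted(train_values, serving_values, p_threshold=0.05):
    """Return True if the serving distribution differs significantly
    from the training-time baseline for this feature."""
    statistic, p_value = ks_2samp(train_values, serving_values)
    return p_value < p_threshold

# Hypothetical data: baseline captured at training time versus a
# recent serving sample whose mean has shifted.
rng = np.random.default_rng(seed=42)
baseline = rng.normal(loc=0.0, scale=1.0, size=5_000)
recent = rng.normal(loc=0.4, scale=1.0, size=1_000)

if feature_drifted(baseline, recent):
    print("Drift detected: flag model for retraining")
```

In practice, a check like this would run per feature on a schedule, with drifted features triggering the retraining and redeployment cycle described above.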
Ideally, a self-service model deploy service should be available to deploy trained models from any ML library, in any model format, to any deployment endpoint. Once deployed, the service automatically scales the model deployment. For upgrades to existing models, it supports canary deployments.
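As an illustration of what a canary upgrade might look like inside such a service, here is a minimal routing sketch in plain Python. The class and parameter names are hypothetical; a production service would typically implement traffic splitting at the load balancer or serving-infrastructure layer rather than in application code.

```python
import random

class CanaryRouter:
    """Hypothetical sketch: send a small fraction of prediction traffic
    to the candidate model, keeping the rest on the production model."""

    def __init__(self, production_model, candidate_model, canary_fraction=0.05):
        self.production_model = production_model
        self.candidate_model = candidate_model
        self.canary_fraction = canary_fraction  # e.g., 5% of requests

    def predict(self, features):
        # Route a random slice of traffic to the canary; its accuracy and
        # latency are monitored before promoting it to 100% of traffic.
        if random.random() < self.canary_fraction:
            return self.candidate_model.predict(features)
        return self.production_model.predict(features)

# Usage with any objects exposing a predict() method:
# router = CanaryRouter(current_model, retrained_model, canary_fraction=0.05)
# prediction = router.predict(request_features)
```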