Simplifying machine learning lifecycle management

The O’Reilly Data Show Podcast: Harish Doddi on accelerating the path from prototype to production.

By Ben Lorica

August 16, 2018

AEDC’s 16-foot supersonic wind tunnel test facility (source: Phil Tarver, Arnold Engineering Development Center, U.S. Air Force on Wikimedia Commons)

In this episode of the Data Show, I spoke with Harish Doddi, co-founder and CEO of Datatron, a startup focused on helping companies deploy and manage machine learning models. As companies move from machine learning prototypes to products and services, tools and best practices for productionizing and managing models are just starting to emerge. Today’s data science and data engineering teams work with a variety of machine learning libraries, data ingestion tools, and data storage technologies. In certain application domains, risk and compliance considerations mean that teams must be able to reproduce machine learning workflows in order to pass audits. And as data science and data engineering teams continue to expand, tools need to enable and facilitate collaboration.

Because Doddi specializes in helping teams turn machine learning prototypes into production-ready services, I wanted to hear what he has learned while working with organizations that aspire to “become machine learning companies.”

Here are some highlights from our conversation:

A central platform for building, deploying, and managing machine learning models

In one of the companies where I worked, we had built infrastructure related to Spark. We were a heavy Spark shop. So we built everything around Spark and other components. But later, when that organization grew, a lot of people came from a TensorFlow background. That suddenly created a little bit of frustration in the team because everybody wanted to move to TensorFlow. But we had invested a lot of time, effort and energy in building the infrastructure for Spark.

… We suddenly had hidden technical debt that needed to be addressed. … Let’s say right now you have two models running in production and you know that in the next two or three years you are going to deploy 20 to 30 models. You need to start thinking about this ahead of time.

… That’s why, these days, I’ve observed that organizations are creating centralized teams. The centralized team is responsible for maintaining flexible machine learning infrastructure that can be used to deploy, operate, and monitor many models simultaneously.
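To make the idea of shared, framework-neutral infrastructure concrete, here is a minimal sketch in Python. It is an illustration only and was not described in the conversation: the `ModelRegistry` class, its method names, and the choice to wrap every model as a plain callable are all assumptions. The point is that a central team can deploy, version, and count calls to many models without caring whether a given model was trained with Spark, TensorFlow, or something else.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

# Assumption: any model, whatever library trained it, is wrapped as a plain
# callable that maps a feature vector to a single prediction.
Predictor = Callable[[Sequence[float]], float]


class ModelRegistry:
    """Toy in-memory registry (hypothetical; names are illustrative only)."""

    def __init__(self) -> None:
        self._models: Dict[Tuple[str, int], Predictor] = {}
        self._calls: Dict[Tuple[str, int], int] = {}

    def deploy(self, name: str, predictor: Predictor) -> int:
        """Register a new version of `name` and return its version number."""
        version = 1 + max((v for n, v in self._models if n == name), default=0)
        self._models[(name, version)] = predictor
        self._calls[(name, version)] = 0
        return version

    def predict(
        self, name: str, features: Sequence[float], version: Optional[int] = None
    ) -> float:
        """Score with the requested version, or the latest one if omitted."""
        if version is None:
            versions = [v for n, v in self._models if n == name]
            if not versions:
                raise KeyError(f"no model deployed under {name!r}")
            version = max(versions)
        key = (name, version)
        predictor = self._models[key]  # raises KeyError for unknown versions
        self._calls[key] += 1          # simplest possible usage monitoring
        return predictor(features)

    def call_counts(self) -> Dict[Tuple[str, int], int]:
        """Return a copy of the per-version call counters."""
        return dict(self._calls)


if __name__ == "__main__":
    registry = ModelRegistry()
    # Stand-ins for models produced by two different frameworks.
    registry.deploy("churn", lambda x: sum(x))
    registry.deploy("churn", lambda x: 2 * sum(x))
    print(registry.predict("churn", [1.0, 2.0]))             # latest version
    print(registry.predict("churn", [1.0, 2.0], version=1))  # pinned version
    print(registry.call_counts())
```

A real platform would add persistence, access control, and richer monitoring; the sketch only shows why a single deployment interface makes it cheaper to go from two models in production to 20 or 30.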

Feature store: Create, manage, and share canonical features

When I talk to companies these days, everybody knows that their data scientists are duplicating work because they don’t have a centralized feature store. Everybody I talk to really wants to build or even buy a feature store, depending on what is easiest for them.

… The number of data scientists within most companies is increasing. And one of the pain points I’ve observed is that when a new data scientist joins an organization, there is a very long ramp-up period. A new data scientist needs to figure out what the data sets are, what the features are, and so on. But if an organization has a feature store, the ramp-up period can be much shorter.
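As a rough illustration of what a feature store provides, the sketch below registers each canonical feature once, with a description and an owner, so that other data scientists can discover and reuse it instead of re-deriving it. This is a minimal in-memory sketch under stated assumptions, not a description of Datatron's product or of any existing library: `FeatureDefinition`, `FeatureStore`, and the example feature names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Mapping, Sequence


@dataclass(frozen=True)
class FeatureDefinition:
    """One canonical feature (hypothetical structure for illustration)."""
    name: str
    description: str
    owner: str
    compute: Callable[[Mapping[str, Any]], Any]  # raw record -> feature value


class FeatureStore:
    """Toy in-memory feature store; a real one would persist and serve values."""

    def __init__(self) -> None:
        self._features: Dict[str, FeatureDefinition] = {}

    def register(self, feature: FeatureDefinition) -> None:
        """Add a feature, refusing duplicates so each name has one definition."""
        if feature.name in self._features:
            raise ValueError(f"feature {feature.name!r} is already registered")
        self._features[feature.name] = feature

    def catalog(self) -> List[str]:
        """List features alphabetically so newcomers can browse what exists."""
        ordered = sorted(self._features.values(), key=lambda f: f.name)
        return [f"{f.name}: {f.description} (owner: {f.owner})" for f in ordered]

    def get_features(
        self, record: Mapping[str, Any], names: Sequence[str]
    ) -> Dict[str, Any]:
        """Compute the requested features for one raw record."""
        return {name: self._features[name].compute(record) for name in names}


if __name__ == "__main__":
    store = FeatureStore()
    store.register(FeatureDefinition(
        name="order_count",
        description="Number of orders placed by the customer",
        owner="growth-team",
        compute=lambda r: len(r["orders"]),
    ))
    store.register(FeatureDefinition(
        name="avg_order_value",
        description="Mean order amount, 0.0 if there are no orders",
        owner="pricing-team",
        compute=lambda r: sum(r["orders"]) / len(r["orders"]) if r["orders"] else 0.0,
    ))
    print("\n".join(store.catalog()))
    print(store.get_features({"orders": [10.0, 30.0]}, ["order_count", "avg_order_value"]))
```

Rejecting duplicate names is a deliberate choice in this sketch: it forces teams to reuse or explicitly rename a feature rather than silently maintain two competing definitions, which is the duplication problem described above.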

Related resources:

“Lessons learned turning machine learning models into real products and services”
“What are machine learning engineers?”: examining a new role focused on creating data products and making data science work in production
“MLflow: A Platform for Managing the Machine Learning Lifecycle”
“Managing risk in machine learning models”: Andrew Burt and Steven Touw on how companies can manage models they cannot fully explain
“We need to build machine learning tools to augment machine learning engineers”
“When models go rogue”: David Talby on hard-earned lessons about using machine learning in production
