Fundamentals of Machine Learning for Software Engineers
Published by O'Reilly Media, Inc.
Understanding and integrating models into software
Course outcomes:
- Prepare data and select models for classical machine learning problems
- Chart a path to production for machine learning models
- Anticipate and analyze mistakes produced by machine learning models
Course description:
Machine learning models are now in production across a wide range of software engineering projects. Software engineers have many ways to identify and participate in opportunities to use machine learning across an organization, and in many cases the software engineering skill set is at least as valuable as a dedicated machine learning engineering skill set.
Join expert Chelsea Troy to follow the steps from ideation to productionization of the types of machine learning models most often developed in-house and to understand the challenges of building from scratch. You’ll consider machine learning models as external libraries, learn how to use them, and learn what to be aware of when integrating with them. You’ll finish up with a conversation about the role of generative AI, and specifically LLMs, in software engineering.
NOTE: With today’s registration, you’ll be signed up for both sessions. Although you can attend either session individually, we recommend participating in both.
What you’ll learn and how you can apply it
- Understand the value of the software engineering skill set for using machine learning
- Learn the most important production concerns for ML models that engineers can find, flag, and fix
- Take advantage of machine learning libraries for use in your organization
This live event is for you because...
- You want to use your software engineering skills to participate in machine learning efforts.
Prerequisites
- Some existing software engineering skill is helpful
Recommended follow-up:
- Read Applied Machine Learning and AI for Engineers (book)
- Take Machine Learning from Scratch (live online course with Thomas Nield)
- Take Artificial Intelligence: An Overview of AI and Machine Learning (live online course with Alex Castrounis)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Day 1
Part I (15 minutes)
- Presentation: How machine learning models are changing software engineering
Part II (35 minutes)
- Presentation: Preparing data for ML applications (dealing with missing values, categorical data, and continuous numerical data); identifying data that won’t generalize
- Group discussion: Preparing a sample table from a medical professional specialty categorization problem for a machine learning application
- Q&A
- Break
Part III (45 minutes)
- Presentation: Splitting data (balancing classes, train-validation-test split, data leakage); identifying opportunities to incorporate classical ML in engineering flows
- Hands-on exercise: Split and prepare medical categorization data
- Group discussion: Which types of model might you choose to try?
- Q&A
- Break
Part IV (35 minutes)
- Presentation: Selecting a model (classical machine learning models)
- Group discussion: For each of these three problems, which types of model would you try?
- Q&A
- Break
Part V (50 minutes)
- Presentation: Ensembling models; choosing a model (accuracy versus precision versus recall versus F1 score); tuning hyperparameters; error analysis
- Group discussion: Which types of models would you choose?
- Q&A
Day 2
Part I (40 minutes)
- Presentation: Machine learning models as helper libraries; integration types
- Hands-on exercise: Explore spaCy integration
- Q&A
Part II (50 minutes)
- Presentation: Criteria for selecting model libraries for integration; evaluating some examples on the criteria
- Group discussion: What sort of application would you like to integrate with external ML models, and what sort of external model would you use?
- Q&A
Part III (50 minutes)
- Presentation: Integrating safely and effectively; risks of integration; mitigating risks; integration tests; feedback loop for evaluating results; options for data validation
- Hands-on exercise: Make spaCy integration more production-ready
- Q&A
- Break
Part IV (40 minutes)
- Presentation: Role of LLMs and generative AI in software engineering
- Hands-on exercise: Identify the issue in each of these examples from LLMs working on code
- Q&A
Your Instructor
Chelsea Troy
Chelsea Troy leads the machine learning operations team at Mozilla. She also teaches in the Master’s Program in Computer Science at the University of Chicago. Her online workshop, Fundamentals of Technical Debt, is available On Demand through the O’Reilly platform, and she also gives live courses about machine learning, large language models, and product thinking.