Machine Learning

Fundamentals of Machine Learning for Software Engineers

Published by O'Reilly Media, Inc.

Content level: Beginner

Understanding and integrating models into software

Course outcomes:

  • Prepare data and select models for classical machine learning problems
  • Chart a path to production for machine learning models
  • Anticipate and analyze mistakes produced by machine learning models

Course description:

Machine learning models now run in production across many software engineering projects. Software engineers have many ways to identify and participate in opportunities to apply machine learning across an organization, and in many cases there is at least as much need for the software engineering skill set as for a specialized machine learning engineering skill set.

Join expert Chelsea Troy to follow the steps from ideation to productionization of the types of machine learning models most often developed in-house and to understand the challenges of building from scratch. You’ll consider machine learning models as external libraries, learn how to use them, and learn what to be aware of when integrating with them. You’ll finish up with a conversation about the role of generative AI, and specifically LLMs, in software engineering.

NOTE: With today’s registration, you’ll be signed up for both sessions. Although you can attend either session individually, we recommend participating in both.

What you’ll learn and how you can apply it

  • Understand the value of the software engineering skill set for using machine learning
  • Learn the most important production concerns for ML models that engineers can find, flag, and fix
  • Take advantage of machine learning libraries for use in your organization

This live event is for you because...

  • You want to use your software engineering skills to participate in machine learning efforts.

Prerequisites

  • Some existing software engineering skill is helpful

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Day 1

Part I (15 minutes)

  • Presentation: How machine learning models are changing software engineering

Part II (35 minutes)

  • Presentation: Preparing data for ML applications (dealing with missing values, categorical data and continuous numerical data); identifying data that won’t generalize
  • Group discussion: Preparing a sample table from a medical professional specialty categorization problem for a machine learning application
  • Q&A
  • Break

Part III (45 minutes)

  • Presentation: Splitting data (balancing classes, train-validation-test split, data leakage); identifying opportunities to incorporate classical ML in engineering flows
  • Hands-on exercise: Split and prepare medical categorization data
  • Group discussion: Which types of model might you choose to try?
  • Q&A
  • Break

Part IV (35 minutes)

  • Presentation: Selecting a model (classical machine learning models)
  • Group discussion: For each of these three problems, which types of model would you try?
  • Q&A
  • Break

Part V (50 minutes)

  • Presentation: Ensembling models; choosing a model (accuracy versus precision versus recall versus F1 score); tuning hyperparameters; error analysis
  • Group discussion: Which types of models would you choose?
  • Q&A

Day 2

Part I (40 minutes)

  • Presentation: Machine learning models as helper libraries; integration types
  • Hands-on exercise: Explore spaCy integration
  • Q&A

Part II (50 minutes)

  • Presentation: Criteria for selecting model libraries for integration; evaluating some examples on the criteria
  • Group discussion: What sort of application would you like to integrate with external ML models, and what sort of external model would you use?
  • Q&A

Part III (50 minutes)

  • Presentation: Integrating safely and effectively; risks of integration; mitigating risks; integration tests; feedback loop for evaluating results; options for data validation
  • Hands-on exercise: Make spaCy integration more production-ready
  • Q&A
  • Break

Part IV (40 minutes)

  • Presentation: Role of LLMs and generative AI in software engineering
  • Hands-on exercise: Identify the issue in each of these examples from LLMs working on code
  • Q&A

Your Instructor

  • Chelsea Troy

    Chelsea Troy leads the machine learning operations team at Mozilla. She also teaches in the Master’s Program in Computer Science at the University of Chicago. Her online workshop, Fundamentals of Technical Debt, is available On Demand through the O’Reilly platform, and she also gives live courses about machine learning, large language models, and product thinking.
