Experimentation for Engineers

Book description

Optimize the performance of your systems with practical experiments used by engineers in the world’s most competitive industries.

In Experimentation for Engineers: From A/B testing to Bayesian optimization you will learn how to:

  • Design, run, and analyze an A/B test
  • Break the “feedback loops” caused by periodic retraining of ML models
  • Increase experimentation rate with multi-armed bandits
  • Tune multiple parameters experimentally with Bayesian optimization
  • Clearly define business metrics used for decision-making
  • Identify and avoid the common pitfalls of experimentation

Experimentation for Engineers: From A/B testing to Bayesian optimization is a toolbox of techniques for evaluating new features and fine-tuning parameters. You’ll start with a deep dive into methods like A/B testing, and then graduate to advanced techniques used to measure performance in industries such as finance and social media. Learn how to evaluate the changes you make to your system and ensure that your testing doesn’t undermine revenue or other business metrics. By the time you’re done, you’ll be able to seamlessly deploy experiments in production while avoiding common pitfalls.
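To give a flavor of the book's approach, here is a minimal sketch of the kind of A/B test analysis developed in chapter 2, written in the Python/NumPy style the book uses. The conversion data below is made up for illustration, and the 1.64 threshold is the standard one-sided z-score cutoff at a 5% false-positive rate:

```python
import numpy as np

rng = np.random.default_rng(17)

# Hypothetical data: 0/1 conversion outcomes for control (A) and a candidate change (B)
a = rng.binomial(1, 0.10, size=5000)  # baseline, ~10% conversion rate
b = rng.binomial(1, 0.11, size=5000)  # modified system, ~11% conversion rate

# Observed lift and its standard error
delta = b.mean() - a.mean()
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))

# z-score: how many standard errors the observed lift is from zero
z = delta / se
print(f"lift = {delta:.4f}, z = {z:.2f}")
# Accept the change only if z exceeds ~1.64 (one-sided test at 5% significance)
```

The book builds on this basic recipe, covering how to size the experiment in advance and what to do when its statistical assumptions break down.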

About the Technology
Does my software really work? Did my changes make things better or worse? Should I trade features for performance? Experimentation is the only way to answer questions like these. This unique book reveals sophisticated experimentation practices developed and proven in the world’s most competitive industries that will help you enhance machine learning systems, software applications, and quantitative trading solutions.

About the Book
Experimentation for Engineers: From A/B testing to Bayesian optimization delivers a toolbox of processes for optimizing software systems. You’ll start by learning the limits of A/B testing, and then graduate to advanced experimentation strategies that take advantage of machine learning and probabilistic methods. The skills you’ll master in this practical guide will help you minimize the costs of experimentation and quickly reveal which approaches and features deliver the best business results.

What's Inside
  • Design, run, and analyze an A/B test
  • Break the “feedback loops” caused by periodic retraining of ML models
  • Increase experimentation rate with multi-armed bandits
  • Tune multiple parameters experimentally with Bayesian optimization

About the Reader
For ML and software engineers looking to extract the most value from their systems. Examples in Python and NumPy.

About the Author
David Sweet has worked as a quantitative trader at GETCO and a machine learning engineer at Instagram. He teaches in the AI and Data Science master's programs at Yeshiva University.

Quotes
Putting an ‘improved’ version of a system into production can be really risky. This book focuses you on what is important!
- Simone Sguazza, University of Applied Sciences and Arts of Southern Switzerland

A must-have for anyone setting up experiments, from A/B tests to contextual bandits and Bayesian optimization.
- Maxim Volgin, KLM

Shows a non-mathematical programmer exactly what they need to write powerful mathematically-based testing algorithms.
- Patrick Goetz, The University of Texas at Austin

Gives you the tools you need to get the most out of your experiments.
- Marc-Anthony Taylor, Raiffeisen Bank International

Table of contents

  1. inside front cover
  2. Experimentation for Engineers
  3. Copyright
  4. dedication
  5. contents
  6. front matter
    1. preface
    2. acknowledgments
    3. about this book
      1. Who should read this book
      2. How this book is organized: A road map
      3. About the code
      4. liveBook discussion forum
    4. about the author
    5. about the cover illustration
  7. 1 Optimizing systems by experiment
    1. 1.1 Examples of engineering workflows
      1. 1.1.1 Machine learning engineer’s workflow
      2. 1.1.2 Quantitative trader’s workflow
      3. 1.1.3 Software engineer’s workflow
    2. 1.2 Measuring by experiment
      1. 1.2.1 Experimental methods
      2. 1.2.2 Practical problems and pitfalls
    3. 1.3 Why are experiments necessary?
      1. 1.3.1 Domain knowledge
      2. 1.3.2 Offline model quality
      3. 1.3.3 Simulation
    4. Summary
  8. 2 A/B testing: Evaluating a modification to your system
    1. 2.1 Take an ad hoc measurement
      1. 2.1.1 Simulate the trading system
      2. 2.1.2 Compare execution costs
    2. 2.2 Take a precise measurement
      1. 2.2.1 Mitigate measurement variation with replication
    3. 2.3 Run an A/B test
      1. 2.3.1 Analyze your measurements
      2. 2.3.2 Design the A/B test
      3. 2.3.3 Measure and analyze
      4. 2.3.4 Recap of A/B test stages
    4. Summary
  9. 3 Multi-armed bandits: Maximizing business metrics while experimenting
    1. 3.1 Epsilon-greedy: Account for the impact of evaluation on business metrics
      1. 3.1.1 A/B testing as a baseline
      2. 3.1.2 The epsilon-greedy algorithm
      3. 3.1.3 Deciding when to stop
    2. 3.2 Evaluating multiple system changes simultaneously
    3. 3.3 Thompson sampling: A more efficient MAB algorithm
      1. 3.3.1 Estimate the probability that an arm is the best
      2. 3.3.2 Randomized probability matching
      3. 3.3.3 The complete algorithm
    4. Summary
  10. 4 Response surface methodology: Optimizing continuous parameters
    1. 4.1 Optimize a single continuous parameter
      1. 4.1.1 Design: Choose parameter values to measure
      2. 4.1.2 Take the measurements
      3. 4.1.3 Analyze I: Interpolate between measurements
      4. 4.1.4 Analyze II: Optimize the business metric
      5. 4.1.5 Validate the optimal parameter value
    2. 4.2 Optimizing two or more continuous parameters
      1. 4.2.1 Design the two-parameter experiment
      2. 4.2.2 Measure, analyze, and validate the 2D experiment
    3. Summary
  11. 5 Contextual bandits: Making targeted decisions
    1. 5.1 Model a business metric offline to make decisions online
      1. 5.1.1 Model the business-metric outcome of a decision
      2. 5.1.2 Add the decision-making component
      3. 5.1.3 Run and evaluate the greedy recommender
    2. 5.2 Explore actions with epsilon-greedy
      1. 5.2.1 Missing counterfactuals degrade predictions
      2. 5.2.2 Explore with epsilon-greedy to collect counterfactuals
    3. 5.3 Explore parameters with Thompson sampling
      1. 5.3.1 Create an ensemble of prediction models
      2. 5.3.2 Randomized probability matching
    4. 5.4 Validate the contextual bandit
    5. Summary
  12. 6 Bayesian optimization: Automating experimental optimization
    1. 6.1 Optimizing a single compiler parameter, a visual explanation
      1. 6.1.1 Simulate the compiler
      2. 6.1.2 Run the initial experiment
      3. 6.1.3 Analyze: Model the response surface
      4. 6.1.4 Design: Select the parameter value to measure next
      5. 6.1.5 Design: Balance exploration with exploitation
    2. 6.2 Model the response surface with Gaussian process regression
      1. 6.2.1 Estimate the expected CPU time
      2. 6.2.2 Estimate uncertainty with GPR
    3. 6.3 Optimize over an acquisition function
      1. 6.3.1 Minimize the acquisition function
    4. 6.4 Optimize all seven compiler parameters
      1. 6.4.1 Random search
      2. 6.4.2 A complete Bayesian optimization
    5. Summary
  13. 7 Managing business metrics
    1. 7.1 Focus on the business
      1. 7.1.1 Don’t evaluate a model
      2. 7.1.2 Evaluate the product
    2. 7.2 Define business metrics
      1. 7.2.1 Be specific to your business
      2. 7.2.2 Update business metrics periodically
      3. 7.2.3 Business metric timescales
    3. 7.3 Trade off multiple business metrics
      1. 7.3.1 Reduce negative side effects
      2. 7.3.2 Evaluate with multiple metrics
    4. Summary
  14. 8 Practical considerations
    1. 8.1 Violations of statistical assumptions
      1. 8.1.1 Violation of the iid assumption
      2. 8.1.2 Nonstationarity
    2. 8.2 Don’t stop early
    3. 8.3 Control family-wise error
      1. 8.3.1 Cherry-picking increases the false-positive rate
      2. 8.3.2 Control false positives with the Bonferroni correction
    4. 8.4 Be aware of common biases
      1. 8.4.1 Confounder bias
      2. 8.4.2 Small-sample bias
      3. 8.4.3 Optimism bias
      4. 8.4.4 Experimenter bias
    5. 8.5 Replicate to validate results
      1. 8.5.1 Validate complex experiments
      2. 8.5.2 Monitor changes with a reverse A/B test
      3. 8.5.3 Measure quarterly changes with holdouts
    6. 8.6 Wrapping up
    7. Summary
  15. Appendix A Linear regression and the normal equations
    1. A.1 Univariate linear regression
    2. A.2 Multivariate linear regression
  16. Appendix B One factor at a time
  17. Appendix C Gaussian process regression
  18. index
  19. inside back cover

Product information

  • Title: Experimentation for Engineers
  • Author(s): David Sweet
  • Release date: February 2023
  • Publisher(s): Manning Publications
  • ISBN: 9781617298158