AI Engineering

Book description

Recent breakthroughs in AI have not only increased demand for AI products but also lowered the barriers to entry for those who want to build them. The model-as-a-service approach has transformed AI from an esoteric discipline into a powerful development tool that anyone can use. Everyone, including those with minimal or no prior AI experience, can now leverage AI models to build applications. In this book, author Chip Huyen discusses AI engineering: the process of building applications with readily available foundation models.

The book starts with an overview of AI engineering, explaining how it differs from traditional ML engineering and discussing the new AI stack. The more AI is used, the more opportunities there are for catastrophic failures, and therefore, the more important evaluation becomes. This book discusses different approaches to evaluating open-ended models, including the rapidly growing AI-as-a-judge approach.

AI application developers will discover how to navigate the AI landscape, including models, datasets, evaluation benchmarks, and the seemingly infinite number of use cases and application patterns. You'll learn a framework for developing an AI application, starting with simple techniques and progressing toward more sophisticated methods, and discover how to efficiently deploy these applications.

  • Understand what AI engineering is and how it differs from traditional machine learning engineering
  • Learn the process for developing an AI application, the challenges at each step, and approaches to address them
  • Explore various model adaptation techniques, including prompt engineering, RAG, fine-tuning, agents, and dataset engineering, and understand how and why they work
  • Examine the bottlenecks for latency and cost when serving foundation models and learn how to overcome them
  • Choose the right model, dataset, evaluation benchmarks, and metrics for your needs

Chip Huyen works to accelerate data analytics on GPUs at Voltron Data. Previously, she was with Snorkel AI and NVIDIA, founded an AI infrastructure startup, and taught Machine Learning Systems Design at Stanford. She's the author of the book Designing Machine Learning Systems, an Amazon bestseller in AI.

AI Engineering builds upon and is complementary to Designing Machine Learning Systems (O'Reilly).

Table of contents

  1. Brief Table of Contents (Not Yet Final)
  2. 1. Introduction to Building AI Applications with Foundation Models
    1. The Rise of AI Engineering
      1. From Language Models to Large Language Models
      2. From Large Language Models to Foundation Models
      3. From Foundation Models to AI Engineering
    2. AI Applications with Foundation Models
      1. Use Cases
      2. Considerations When Building AI Applications
    3. The AI Engineering Stack
      1. Three Layers of the AI Stack
      2. AI Engineering vs. ML Engineering
      3. AI Engineering vs. Full-stack Engineering
    4. Summary
  3. 2. Understanding Foundation Models
    1. Training Data Distribution
      1. Multilingual Models
      2. Domain-Specific Models
    2. Pre-training
      1. Model Architecture
      2. Model Size
    3. Post-training
      1. Supervised Finetuning
      2. Alignment
    4. Sampling
      1. Sampling Fundamentals
      2. Test Time Sampling
      3. Structured Outputs
      4. The Probabilistic Nature of AI
    5. Summary
  4. 3. Evaluation Methodology
    1. Challenges of Evaluating Foundation Models
    2. Understanding Language Modeling Metrics
      1. Entropy
      2. Cross Entropy
      3. Bits-per-character and Bits-per-byte
      4. Perplexity
      5. Perplexity Interpretation and Use Cases
    3. Exact Evaluation
      1. Functional Correctness
      2. Similarity Measurements Against Reference Data
    4. AI-as-a-Judge
      1. Why AI-as-a-Judge
      2. How to Use AI-as-a-Judge
      3. Limitations of AI-as-a-Judge
      4. What Models Can Act as Judges?
    5. Ranking Models with Comparative Evaluation
      1. Challenges of Comparative Evaluation
      2. The Future of Comparative Evaluation
    6. Summary
  5. 4. Evaluate AI Systems
    1. Evaluation Criteria
      1. Domain-specific Capability
      2. Generation Capability
      3. Instruction-following Capability
      4. Cost and Latency
    2. Model Selection
      1. Model Selection Workflow
      2. Navigate Public Benchmarks
      3. Model Build vs. Buy
    3. Design Your Evaluation Pipeline
      1. 1. Evaluate All Components in a System
      2. 2. Create Evaluation Guideline
      3. 3. Define Evaluation Methods and Data
    4. Summary
  6. 5. Prompt Engineering, RAG, and Agents
    1. Prompt Engineering
      1. Prompt
      2. Prompt Engineering Best Practices
      3. Prompt Engineering Considerations
    2. RAG
      1. RAG Overview
      2. Retrieval Algorithms
      3. Retrieval Optimization
    3. Agents
      1. From RAGs to Agents
      2. Tool Use
      3. Planning
      4. Memory
    4. Summary

Product information

  • Title: AI Engineering
  • Author(s): Chip Huyen
  • Release date: February 2025
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098166304