AI Engineering

Book description

Recent breakthroughs in AI have not only increased demand for AI products but also lowered the barriers to entry for those who want to build them. The model-as-a-service approach has transformed AI from an esoteric discipline into a powerful development tool that anyone can use. Everyone, including those with minimal or no prior AI experience, can now leverage AI models to build applications. In this book, author Chip Huyen discusses AI engineering: the process of building applications with readily available foundation models.

The book starts with an overview of AI engineering, explaining how it differs from traditional ML engineering and discussing the new AI stack. The more AI is used, the more opportunities there are for catastrophic failures, and therefore, the more important evaluation becomes. This book discusses different approaches to evaluating open-ended models, including the rapidly growing AI-as-a-judge approach.

AI application developers will discover how to navigate the AI landscape, including models, datasets, evaluation benchmarks, and the seemingly infinite number of use cases and application patterns. You'll learn a framework for developing an AI application, starting with simple techniques and progressing toward more sophisticated methods, and discover how to efficiently deploy these applications.

  • Understand what AI engineering is and how it differs from traditional machine learning engineering
  • Learn the process for developing an AI application, the challenges at each step, and approaches to address them
  • Explore various model adaptation techniques, including prompt engineering, RAG, fine-tuning, agents, and dataset engineering, and understand how and why they work
  • Examine the bottlenecks for latency and cost when serving foundation models and learn how to overcome them
  • Choose the right model, dataset, evaluation benchmarks, and metrics for your needs

Chip Huyen works to accelerate data analytics on GPUs at Voltron Data. Previously, she was with Snorkel AI and NVIDIA, founded an AI infrastructure startup, and taught Machine Learning Systems Design at Stanford. She's the author of the book Designing Machine Learning Systems, an Amazon bestseller in AI.

AI Engineering builds upon and is complementary to Designing Machine Learning Systems (O'Reilly).

Table of contents

  1. Preface
    1. What This Book Is About
    2. What This Book Is Not
    3. Who This Book Is For
    4. Navigating This Book
    5. Conventions Used in This Book
    6. Using Code Examples
    7. O’Reilly Online Learning
    8. How to Contact Us
    9. Acknowledgments
  2. 1. Introduction to Building AI Applications with Foundation Models
    1. The Rise of AI Engineering
      1. From Language Models to Large Language Models
      2. From Large Language Models to Foundation Models
      3. From Foundation Models to AI Engineering
    2. Foundation Model Use Cases
      1. Coding
      2. Image and Video Production
      3. Writing
      4. Education
      5. Conversational Bots
      6. Information Aggregation
      7. Data Organization
      8. Workflow Automation
    3. Planning AI Applications
      1. Use Case Evaluation
      2. Setting Expectations
      3. Milestone Planning
      4. Maintenance
    4. The AI Engineering Stack
      1. Three Layers of the AI Stack
      2. AI Engineering Versus ML Engineering
      3. AI Engineering Versus Full-Stack Engineering
    5. Summary
  3. 2. Understanding Foundation Models
    1. Training Data
      1. Multilingual Models
      2. Domain-Specific Models
    2. Modeling
      1. Model Architecture
      2. Model Size
    3. Post-Training
      1. Supervised Finetuning
      2. Preference Finetuning
    4. Sampling
      1. Sampling Fundamentals
      2. Sampling Strategies
      3. Test Time Compute
      4. Structured Outputs
      5. The Probabilistic Nature of AI
    5. Summary
  4. 3. Evaluation Methodology
    1. Challenges of Evaluating Foundation Models
    2. Understanding Language Modeling Metrics
      1. Entropy
      2. Cross Entropy
      3. Bits-per-Character and Bits-per-Byte
      4. Perplexity
      5. Perplexity Interpretation and Use Cases
    3. Exact Evaluation
      1. Functional Correctness
      2. Similarity Measurements Against Reference Data
      3. Introduction to Embedding
    4. AI as a Judge
      1. Why AI as a Judge?
      2. How to Use AI as a Judge
      3. Limitations of AI as a Judge
      4. What Models Can Act as Judges?
    5. Ranking Models with Comparative Evaluation
      1. Challenges of Comparative Evaluation
      2. The Future of Comparative Evaluation
    6. Summary
  5. 4. Evaluate AI Systems
    1. Evaluation Criteria
      1. Domain-Specific Capability
      2. Generation Capability
      3. Instruction-Following Capability
      4. Cost and Latency
    2. Model Selection
      1. Model Selection Workflow
      2. Model Build Versus Buy
      3. Navigate Public Benchmarks
    3. Design Your Evaluation Pipeline
      1. Step 1. Evaluate All Components in a System
      2. Step 2. Create an Evaluation Guideline
      3. Step 3. Define Evaluation Methods and Data
    4. Summary
  6. 5. Prompt Engineering
    1. Introduction to Prompting
      1. In-Context Learning: Zero-Shot and Few-Shot
      2. System Prompt and User Prompt
      3. Context Length and Context Efficiency
    2. Prompt Engineering Best Practices
      1. Write Clear and Explicit Instructions
      2. Provide Sufficient Context
      3. Break Complex Tasks into Simpler Subtasks
      4. Give the Model Time to Think
      5. Iterate on Your Prompts
      6. Evaluate Prompt Engineering Tools
      7. Organize and Version Prompts
    3. Defensive Prompt Engineering
      1. Proprietary Prompts and Reverse Prompt Engineering
      2. Jailbreaking and Prompt Injection
      3. Information Extraction
      4. Defenses Against Prompt Attacks
    4. Summary
  7. 6. RAG and Agents
    1. RAG
      1. RAG Architecture
      2. Retrieval Algorithms
      3. Retrieval Optimization
      4. RAG Beyond Texts
    2. Agents
      1. Agent Overview
      2. Tools
      3. Planning
      4. Agent Failure Modes and Evaluation
    3. Memory
    4. Summary
  8. 7. Finetuning
    1. Finetuning Overview
    2. When to Finetune
      1. Reasons to Finetune
      2. Reasons Not to Finetune
      3. Finetuning and RAG
    3. Memory Bottlenecks
      1. Backpropagation and Trainable Parameters
      2. Memory Math
      3. Numerical Representations
      4. Quantization
    4. Finetuning Techniques
      1. Parameter-Efficient Finetuning
      2. Model Merging and Multi-Task Finetuning
      3. Finetuning Tactics
    5. Summary
  9. 8. Dataset Engineering
    1. Data Curation
      1. Data Quality
      2. Data Coverage
      3. Data Quantity
      4. Data Acquisition and Annotation
    2. Data Augmentation and Synthesis
      1. Why Data Synthesis
      2. Traditional Data Synthesis Techniques
      3. AI-Powered Data Synthesis
      4. Model Distillation
    3. Data Processing
      1. Inspect Data
      2. Deduplicate Data
      3. Clean and Filter Data
      4. Format Data
    4. Summary
  10. 9. Inference Optimization
    1. Understanding Inference Optimization
      1. Inference Overview
      2. Inference Performance Metrics
      3. AI Accelerators
    2. Inference Optimization
      1. Model Optimization
      2. Inference Service Optimization
    3. Summary
  11. 10. AI Engineering Architecture and User Feedback
    1. AI Engineering Architecture
      1. Step 1. Enhance Context
      2. Step 2. Put in Guardrails
      3. Step 3. Add Model Router and Gateway
      4. Step 4. Reduce Latency with Caches
      5. Step 5. Add Agent Patterns
      6. Monitoring and Observability
      7. AI Pipeline Orchestration
    2. User Feedback
      1. Extracting Conversational Feedback
      2. Feedback Design
      3. Feedback Limitations
    3. Summary
  12. Epilogue
  13. Index
  14. About the Author

Product information

  • Title: AI Engineering
  • Author(s): Chip Huyen
  • Release date: December 2024
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098166304