What We Learned from a Year of Building with LLMs

Book description

Ready to build real-world applications with large language models? Given the pace of improvement over the past year, LLMs have become good enough for production use. They are also broadly accessible, allowing practitioners beyond ML engineers and scientists to build intelligence into their products.

In this report, six experts in AI and machine learning present crucial, yet often neglected, lessons and methodologies for developing products based on LLMs. Awareness of these concepts can give you a competitive advantage over most others in the field.

Authors Eugene Yan, Bryan Bischof, Charles Frye, Hamel Husain, Jason Liu, and Shreya Shankar have spent the past year testing and refining these methodologies by building real-world applications on top of LLMs. Here they distill those lessons for the benefit of the community.

Table of contents

  1. Introduction
    1. Contact Us
    2. Acknowledgments
  2. 1. Tactics: The Emerging LLM Stack
    1. Prompting
      1. Focus on Getting the Most Out of Fundamental Prompting Techniques
      2. Structure Your Inputs and Outputs
      3. Have Small Prompts That Do One Thing, and Only One Thing, Well
      4. Craft Your Context Tokens
    2. Information Retrieval/RAG
      1. Keyword Search
      2. Prefer RAG over Fine-Tuning for New Knowledge
      3. Long-Context Models Won’t Make RAG Obsolete
    3. Tuning and Optimizing Workflows
      1. Step-by-Step, Multiturn “Flows” Can Give Large Boosts
      2. Prioritize Deterministic Workflows for Now
      3. Getting More Diverse Outputs Beyond Temperature
      4. Caching Is Underrated
      5. When to Fine-Tune
    4. Evaluation and Monitoring
      1. Create a Few Assertion-Based Unit Tests From Real Input/Output Samples
      2. LLM-as-Judge Can Work (Somewhat), but It’s Not a Silver Bullet
      3. The “Intern Test” for Evaluating Generations
      4. Overemphasizing Certain Evals Can Hurt Overall Performance
      5. Simplify Annotation to Binary Tasks or Pairwise Comparisons
      6. (Reference-Free) Evals and Guardrails Can Be Used Interchangeably
      7. LLMs Will Return Output Even When They Shouldn’t
      8. Hallucinations Are a Stubborn Problem
  3. 2. Operations: Developing and Managing LLM Applications and the Teams That Build Them
    1. Data
      1. Check for Development-Prod Skew
      2. Look at Samples of LLM Inputs and Outputs Every Day
    2. Working with Models
      1. Generate Structured Output to Ease Downstream Integration
      2. Migrating Prompts Across Models Is a Pain in the Ass
      3. Version and Pin Your Models
      4. Choose the Smallest Model That Gets the Job Done
    3. Product
      1. Involve Design Early and Often
      2. Design Your UX for Human-in-the-Loop
      3. Prioritize Your Hierarchy of Needs Ruthlessly
      4. Calibrate Your Risk Tolerance Based on the Use Case
    4. Team and Roles
      1. Focus on Process, Not Tools
      2. Always Be Experimenting
      3. Empower Everyone to Use New AI Technology
      4. Don’t Fall into the Trap of “AI Engineering Is All I Need”
  4. 3. Strategy: Building with LLMs without Getting Outmaneuvered
    1. No GPUs Before PMF
      1. Training From Scratch (Almost) Never Makes Sense
      2. Don’t Fine-Tune Until You’ve Proven It’s Necessary
      3. Start with Inference APIs, but Don't Be Afraid of Self-Hosting
    2. Iterate to Something Great
      1. The Model Isn’t the Product; the System Around It Is
      2. Build Trust by Starting Small
      3. Build LLMOps, but Build It for the Right Reason: Faster Iteration
      4. Don’t Build LLM Features You Can Buy
      5. AI in the Loop; Humans at the Center
    3. Start with Prompting, Evals, and Data Collection
      1. Prompt Engineering Comes First
      2. Build Evals and Kickstart a Data Flywheel
    4. The High-Level Trend of Low-Cost Cognition
      1. Enough 0 to 1 Demos, It’s Time for 1 to N Products
  5. About the Authors

Product information

  • Title: What We Learned from a Year of Building with LLMs
  • Authors: Eugene Yan, Bryan Bischof, Charles Frye, Hamel Husain, Jason Liu, Shreya Shankar
  • Release date: July 2024
  • Publisher: O'Reilly Media, Inc.
  • ISBN: 9781098176709