Designing Large Language Model Applications

Book description

Transformer-based language models are powerful tools for solving a variety of language tasks and represent a phase shift in the field of natural language processing. But the transition from demos and prototypes to full-fledged applications has been slow. With this book, you'll learn the tools, techniques, and playbooks for building useful products that incorporate the power of language models.

Experienced ML researcher Suhas Pai provides practical advice on dealing with commonly observed failure modes and counteracting the current limitations of state-of-the-art models. You'll take a comprehensive deep dive into the Transformer architecture and its variants, and you'll get up to date with the taxonomy of language models, which offers insight into which models are better suited to which tasks.

You'll learn:

  • Clever ways to deal with failure modes of current state-of-the-art language models, and methods to exploit their strengths for building useful products
  • How to develop an intuition about the Transformer architecture and the impact of each architectural decision
  • Ways to adapt pretrained language models to your own domain and use cases
  • How to select a language model for your domain and task from among the choices available, and how to deal with the build-versus-buy conundrum
  • Effective fine-tuning and parameter-efficient fine-tuning, as well as few-shot and zero-shot learning techniques
  • How to interface language models with external tools and integrate them into an existing software ecosystem

Table of contents

  1. Brief Table of Contents (Not Yet Final)
  2. 1. Introduction
    1. Defining LLMs
    2. A Brief History of LLMs
      1. Early years
      2. The modern LLM era
    3. The impact of LLMs
    4. LLM usage in the enterprise
    5. Prompting
      1. Zero-shot prompting
      2. Few-shot prompting
      3. Chain-of-Thought prompting
      4. Adversarial Prompting
    6. Accessing LLMs through an API
    7. Strengths and limitations of LLMs
    8. Building your first chatbot prototype
    9. From prototype to production
    10. Summary
  3. 2. LLM Ingredients: Training Data
    1. Ingredients of an LLM
    2. Pre-training data requirements
    3. Popular pre-training datasets
    4. Training Data Preprocessing
      1. Data filtering and cleaning
      2. Selecting Quality Documents
      3. Deduplication
      4. Removing PII (Personally Identifiable Information)
      5. Training Set Decontamination
    5. Leveraging Pre-training Dataset Characteristics
    6. Bias and Fairness Issues in Pre-training Datasets
    7. Summary
  4. 3. Language Model Architectures
    1. Preliminaries
    2. Representing Meaning
    3. The Transformer Architecture
      1. Self-attention
      2. Positional Encoding
      3. Feed-forward networks
      4. Loss functions
    4. Intrinsic Model Evaluation
    5. Transformer backbones
      1. Encoder-only architectures
      2. Encoder-Decoder Architectures
      3. Decoder-only Architectures
      4. Mixture of Experts
    6. Pre-training models
    7. Summary
  5. 4. Adapting LLMs To Your Use Case
    1. Navigating the LLM Landscape
      1. Who are the LLM providers?
      2. Model flavors
      3. Open-source LLMs
    2. How to choose an LLM for your task
      1. Open-source vs. Proprietary LLMs
      2. LLM Evaluation
    3. Loading LLMs
      1. HuggingFace Accelerate
      2. Ollama
      3. LLM Inference APIs
    4. Decoding strategies
      1. Greedy decoding
      2. Beam Search
      3. Top-K sampling
      4. Top-P sampling
    5. Running inference on LLMs
      1. Structured outputs
    6. Model debugging and interpretability
    7. Summary
  6. 5. Fine-tuning LLMs
    1. The need for fine-tuning
    2. Fine-tuning: A full example
      1. Learning algorithm parameters
      2. Memory Optimization parameters
      3. Regularization parameters
      4. Noise embeddings
      5. Batch size
      6. Parameter-Efficient Fine-tuning
      7. Working with reduced precision
      8. Putting it all together
    3. Fine-tuning Datasets
      1. Utilizing publicly available instruction-tuning datasets
      2. LLM-generated instruction-tuning datasets
    4. Summary
  7. 6. Advanced Fine-tuning Techniques
    1. Continual Pre-training
      1. Replay (Memory)
      2. Parameter Expansion
    2. Parameter-Efficient Fine-tuning
      1. Adding new parameters
      2. Subset methods
    3. Combining Multiple Models
      1. Model Ensembling
      2. Model Fusion
      3. Adapter Merging
    4. Summary
  8. 7. Interfacing LLMs with External Tools
    1. LLM Interaction Paradigms
      1. The Passive Approach
      2. The Explicit Approach
      3. The Agentic Approach
    2. Retrieval
      1. Retrieval Techniques
      2. Keyword Match and Probabilistic Methods
      3. Embeddings
      4. Optimizing embedding size
      5. Product Quantization
      6. Chunking
      7. Multi-level embeddings
      8. Vector Databases
      9. Rerankers
    3. Summary
  9. 8. Retrieval-Augmented Generation (RAG)
    1. The need for RAG
    2. Typical RAG scenarios
    3. The RAG pipeline
      1. Rewrite
      2. Retrieve
      3. Rerank
      4. Refine
      5. Insert
      6. Generate
    4. RAG for memory management
      1. MemGPT
    5. RAG for selecting in-context training examples
      1. LLM-R
    6. RAG for model training
      1. REALM
    7. Limitations of RAG
    8. RAG vs. Long Context
    9. RAG vs. Fine-tuning
    10. Summary
  10. 9. Application Design & System Architecture
    1. Multi-LLM architectures
      1. LLM Cascades
      2. Routers
      3. Task-specialized LLMs
    2. Software Scaffolding
      1. Caches & Memory
      2. Verification modules
      3. Safety Guardrails
    3. Programming Paradigms
      1. DSPy
      2. LMQL
    4. Summary
  11. About the Author

Product information

  • Title: Designing Large Language Model Applications
  • Author(s): Suhas Pai
  • Release date: March 2025
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098150501