Hands-On Large Language Models

Book description

AI has acquired startling new language capabilities in just the past few years. Driven by rapid advances in deep learning, language AI systems are able to write and understand text better than ever before. This trend is enabling the rise of new features, products, and entire industries. Through this book's highly visual approach, Python developers will learn the practical tools and concepts they need to use these capabilities today.

You'll learn how to harness the power of pre-trained large language models for use cases like copywriting and summarization; create semantic search systems that go beyond keyword matching; build systems that classify and cluster text, enabling scalable understanding of large collections of documents; and use existing libraries and pre-trained models for text classification, search, and clustering.
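
As a taste of that kind of workflow, here is a minimal sketch of classifying text with an existing library and a pre-trained model, using the Hugging Face transformers pipeline; the checkpoint it downloads is the library's default, and the book's own examples may differ:

    # Illustrative sketch: sentiment classification with a pre-trained model.
    # The pipeline downloads a default checkpoint chosen by the library.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")
    result = classifier("A clear, visual introduction to language models.")
    print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99}]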

This book also shows you how to:

  • Build advanced LLM pipelines to cluster text documents and explore the topics they belong to
  • Build semantic search engines that go beyond keyword search with methods like dense retrieval and rerankers (see the short sketch after this list)
  • Learn various use cases where these models can provide value
  • Understand the architecture of underlying Transformer models like BERT and GPT
  • Get a deeper understanding of how LLMs are trained
  • Optimize LLMs for specific applications with methods such as generative model fine-tuning, contrastive fine-tuning, and in-context learning
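
As a taste of the semantic search topic listed above, here is a minimal dense retrieval sketch using the sentence-transformers library; the model name is an illustrative choice, not necessarily the one used in the book:

    # Illustrative sketch: dense retrieval ranks documents by embedding similarity.
    from sentence_transformers import SentenceTransformer, util

    # Example model choice; any pre-trained text embedding model would work.
    model = SentenceTransformer("all-MiniLM-L6-v2")

    documents = [
        "The cat sat on the mat.",
        "Semantic search retrieves documents by meaning rather than keywords.",
        "Large language models generate text one token at a time.",
    ]
    query = "How does meaning-based search work?"

    doc_embeddings = model.encode(documents, convert_to_tensor=True)
    query_embedding = model.encode(query, convert_to_tensor=True)

    # Cosine similarity between the query and every document.
    scores = util.cos_sim(query_embedding, doc_embeddings)[0]
    best = int(scores.argmax())
    print(documents[best], float(scores[best]))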

Jay Alammar is Director and Engineering Fellow at Cohere (a pioneering provider of large language models as an API).

Maarten Grootendorst is a Senior Clinical Data Scientist at the Netherlands Comprehensive Cancer Organization (IKNL).

Table of contents

  1. Preface
    1. An Intuition-First Philosophy
    2. Prerequisites
    3. Book Structure
      1. Part I: Concepts
      2. Part II: Using Pre-trained Language Models
      3. Part III: Training and Fine-Tuning Language Models
    4. Hardware and Software Requirements
    5. API Keys
    6. Conventions Used in This Book
    7. Using Code Examples
    8. O’Reilly Online Learning
    9. How to Contact Us
    10. Acknowledgments
  2. 1. An Introduction to Large Language Models
    1. What Is Language AI?
    2. A Recent History of Language AI
      1. Representing Language as a Bag-of-Words
      2. Better Representations with Dense Vector Embeddings
      3. Types of Embeddings
      4. Encoding and Decoding Context with Attention
      5. Attention Is All You Need
      6. Representation Models: Encoder-Only Models
      7. Generative Models: Decoder-Only Models
      8. The Year of Generative AI
    3. The Moving Definition of a “Large Language Model”
    4. The Training Paradigm of Large Language Models
    5. Large Language Model Applications: What Makes Them So Useful?
    6. Responsible LLM Development and Usage
    7. Limited Resources Are All You Need
    8. Interfacing with Large Language Models
      1. Proprietary, Private Models
      2. Open Models
      3. Open Source Frameworks
    9. Generating Your First Text
    10. Summary
  3. 2. Tokens and Embeddings
    1. LLM Tokenization
      1. How Tokenizers Prepare the Inputs to the Language Model
      2. Downloading and Running an LLM
      3. How Does the Tokenizer Break Down Text?
      4. Word Versus Subword Versus Character Versus Byte Tokens
      5. Comparing Trained LLM Tokenizers
      6. Tokenizer Properties
    2. Token Embeddings
      1. A Language Model Holds Embeddings for the Vocabulary of Its Tokenizer
      2. Creating Contextualized Word Embeddings with Language Models
    3. Text Embeddings (for Sentences and Whole Documents)
    4. Word Embeddings Beyond LLMs
      1. Using Pre-trained Word Embeddings
      2. The Word2vec Algorithm and Contrastive Training
    5. Embeddings for Recommendation Systems
      1. Recommending Songs by Embeddings
      2. Training a Song Embedding Model
    6. Summary
  4. 3. Looking Inside Large Language Models
    1. An Overview of Transformer Models
      1. The Inputs and Outputs of a Trained Transformer LLM
      2. The Components of the Forward Pass
      3. Choosing a Single Token from the Probability Distribution (Sampling/Decoding)
      4. Parallel Token Processing and Context Size
      5. Speeding Up Generation by Caching Keys and Values
      6. Inside the Transformer Block
    2. Recent Improvements to the Transformer Architecture
      1. More Efficient Attention
      2. The Transformer Block
      3. Positional Embeddings (RoPE)
      4. Other Architectural Experiments and Improvements
    3. Summary
  5. 4. Text Classification
    1. The Sentiment of Movie Reviews
    2. Text Classification with Representation Models
      1. Model Selection
    3. Using a Task-Specific Model
      1. Classification Tasks That Leverage Embeddings
    4. Text Classification with Generative Models
      1. Using the Text-to-Text Transfer Transformer
      2. ChatGPT for Classification
    5. Summary
  6. 5. Text Clustering and Topic Modeling
    1. ArXiv’s Articles: Computation and Language
    2. A Common Pipeline for Text Clustering
      1. 1. Embedding Documents
      2. 2. Reducing the Dimensionality of Embeddings
      3. 3. Cluster the Reduced Embeddings
      4. Inspecting the Clusters
    3. From Text Clustering to Topic Modeling
      1. BERTopic: A Modular Topic Modeling Framework
      2. Adding a Special Lego Block
      3. The Text Generation Lego Block
    4. Summary
  7. 6. Prompt Engineering
    1. Using Text Generation Models
      1. Choosing a Text Generation Model
      2. Loading a Text Generation Model
      3. Controlling Model Output
    2. Intro to Prompt Engineering
      1. The Basic Ingredients of a Prompt
      2. Instruction-Based Prompting
    3. Advanced Prompt Engineering
      1. The Potential Complexity of a Prompt
      2. In-Context Learning: Providing Examples
      3. Chain Prompting: Breaking up the Problem
    4. Reasoning with Generative Models
      1. Chain-of-Thought: Think Before Answering
      2. Self-Consistency: Sampling Outputs
      3. Tree-of-Thought: Exploring Intermediate Steps
    5. Output Verification
      1. Providing Examples
      2. Grammar: Constrained Sampling
    6. Summary
  8. 7. Advanced Text Generation Techniques and Tools
    1. Model I/O: Loading Quantized Models with LangChain
    2. Chains: Extending the Capabilities of LLMs
      1. A Single Link in the Chain: Prompt Template
      2. A Chain with Multiple Prompts
    3. Memory: Helping LLMs to Remember Conversations
      1. Conversation Buffer
      2. Windowed Conversation Buffer
      3. Conversation Summary
    4. Agents: Creating a System of LLMs
      1. The Driving Power Behind Agents: Step-by-step Reasoning
      2. ReAct in LangChain
    5. Summary
  9. 8. Semantic Search and Retrieval-Augmented Generation (RAG)
    1. Overview of Semantic Search and Retrieval-Augmented Generation
    2. Semantic Search with Language Models
      1. Dense Retrieval
      2. Reranking
      3. Retrieval Evaluation Metrics
    3. Retrieval-Augmented Generation (RAG)
      1. From Search to RAG
      2. Example: Grounded Generation with an LLM API
      3. Example: RAG with Local Models
      4. Advanced RAG Techniques
      5. RAG Evaluation
    4. Summary
  10. 9. Multimodal Large Language Models
    1. Transformers for Vision
    2. Multimodal Embedding Models
      1. CLIP: Connecting Text and Images
      2. How Can CLIP Generate Multimodal Embeddings?
      3. OpenCLIP
    3. Making Text Generation Models Multimodal
      1. BLIP-2: Bridging the Modality Gap
      2. Preprocessing Multimodal Inputs
      3. Use Case 1: Image Captioning
      4. Use Case 2: Multimodal Chat-Based Prompting
    4. Summary
  11. 10. Creating Text Embedding Models
    1. Embedding Models
    2. What Is Contrastive Learning?
    3. SBERT
    4. Creating an Embedding Model
      1. Generating Contrastive Examples
      2. Train Model
      3. In-Depth Evaluation
      4. Loss Functions
    5. Fine-Tuning an Embedding Model
      1. Supervised
      2. Augmented SBERT
    6. Unsupervised Learning
      1. Transformer-Based Sequential Denoising Auto-Encoder
      2. Using TSDAE for Domain Adaptation
    7. Summary
  12. 11. Fine-Tuning Representation Models for Classification
    1. Supervised Classification
      1. Fine-Tuning a Pre-trained BERT Model
      2. Freezing Layers
    2. Few-Shot Classification
      1. SetFit: Efficient Fine-Tuning with Few Training Examples
      2. Fine-Tuning for Few-Shot Classification
    3. Continued Pre-training with Masked Language Modeling
    4. Named-Entity Recognition
      1. Preparing Data for Named-Entity Recognition
      2. Fine-Tuning for Named-Entity Recognition
    5. Summary
  13. 12. Fine-Tuning Generation Models
    1. The Three LLM Training Steps: Pre-training, Supervised Fine-Tuning, and Preference Tuning
    2. Supervised Fine-Tuning (SFT)
      1. Full Fine-Tuning
      2. Parameter-Efficient Fine-Tuning (PEFT)
    3. Instruction Tuning with QLoRA
      1. Templating Instruction Data
      2. Model Quantization
      3. LoRA Configuration
      4. Training Configuration
      5. Training
      6. Merge Weights
    4. Evaluating Generative Models
      1. Word-Level Metrics
      2. Benchmarks
      3. Leaderboards
      4. Automated Evaluation
      5. Human Evaluation
    5. Preference-Tuning / Alignment / RLHF
    6. Automating Preference Evaluation Using Reward Models
      1. The Inputs and Outputs of a Reward Model
      2. Training a Reward Model
      3. Training No Reward Model
    7. Preference Tuning with DPO
      1. Templating Alignment Data
      2. Model Quantization
      3. Training Configuration
      4. Training
    8. Summary
  14. Afterword
  15. About the Authors

Product information

  • Title: Hands-On Large Language Models
  • Author(s): Jay Alammar, Maarten Grootendorst
  • Release date: September 2024
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098150969