Hands-On Large Language Models

Book description

AI has acquired startling new language capabilities in just the past few years. Driven by rapid advances in deep learning, language AI systems can write and understand text better than ever before. This trend is enabling new features, products, and entire industries. Through the book's highly visual, intuition-first approach, readers will learn the practical tools and concepts they need to use these capabilities today.

You'll understand how to use pretrained large language models for use cases like copywriting and summarization; create semantic search systems that go beyond keyword matching; and use existing libraries and pretrained models for text classification, search, and clustering.

This book also helps you:

  • Understand the architecture of Transformer language models that excel at text generation and representation
  • Build advanced LLM pipelines to cluster text documents and explore the topics they cover
  • Build semantic search engines that go beyond keyword search, using methods like dense retrieval and rerankers
  • Explore how generative models can be used, from prompt engineering all the way to retrieval-augmented generation
  • Gain a deeper understanding of how to train LLMs and optimize them for specific applications using generative model fine-tuning, contrastive fine-tuning, and in-context learning

Table of contents

  1. Preface
    1. An Intuition-First Philosophy
    2. Prerequisites
    3. Book Structure
      1. Part I: Understanding Language Models
      2. Part II: Using Pretrained Language Models
      3. Part III: Training and Fine-Tuning Language Models
    4. Hardware and Software Requirements
    5. API Keys
    6. Conventions Used in This Book
    7. Using Code Examples
    8. O’Reilly Online Learning
    9. How to Contact Us
    10. Acknowledgments
  2. I. Understanding Language Models
  3. 1. An Introduction to Large Language Models
    1. What Is Language AI?
    2. A Recent History of Language AI
      1. Representing Language as a Bag-of-Words
      2. Better Representations with Dense Vector Embeddings
      3. Types of Embeddings
      4. Encoding and Decoding Context with Attention
      5. Attention Is All You Need
      6. Representation Models: Encoder-Only Models
      7. Generative Models: Decoder-Only Models
      8. The Year of Generative AI
    3. The Moving Definition of a “Large Language Model”
    4. The Training Paradigm of Large Language Models
    5. Large Language Model Applications: What Makes Them So Useful?
    6. Responsible LLM Development and Usage
    7. Limited Resources Are All You Need
    8. Interfacing with Large Language Models
      1. Proprietary, Private Models
      2. Open Models
      3. Open Source Frameworks
    9. Generating Your First Text
    10. Summary
  4. 2. Tokens and Embeddings
    1. LLM Tokenization
      1. How Tokenizers Prepare the Inputs to the Language Model
      2. Downloading and Running an LLM
      3. How Does the Tokenizer Break Down Text?
      4. Word Versus Subword Versus Character Versus Byte Tokens
      5. Comparing Trained LLM Tokenizers
      6. Tokenizer Properties
    2. Token Embeddings
      1. A Language Model Holds Embeddings for the Vocabulary of Its Tokenizer
      2. Creating Contextualized Word Embeddings with Language Models
    3. Text Embeddings (for Sentences and Whole Documents)
    4. Word Embeddings Beyond LLMs
      1. Using Pretrained Word Embeddings
      2. The Word2vec Algorithm and Contrastive Training
    5. Embeddings for Recommendation Systems
      1. Recommending Songs by Embeddings
      2. Training a Song Embedding Model
    6. Summary
  5. 3. Looking Inside Large Language Models
    1. An Overview of Transformer Models
      1. The Inputs and Outputs of a Trained Transformer LLM
      2. The Components of the Forward Pass
      3. Choosing a Single Token from the Probability Distribution (Sampling/Decoding)
      4. Parallel Token Processing and Context Size
      5. Speeding Up Generation by Caching Keys and Values
      6. Inside the Transformer Block
    2. Recent Improvements to the Transformer Architecture
      1. More Efficient Attention
      2. The Transformer Block
      3. Positional Embeddings (RoPE)
      4. Other Architectural Experiments and Improvements
    3. Summary
  6. II. Using Pretrained Language Models
  7. 4. Text Classification
    1. The Sentiment of Movie Reviews
    2. Text Classification with Representation Models
    3. Model Selection
    4. Using a Task-Specific Model
    5. Classification Tasks That Leverage Embeddings
      1. Supervised Classification
      2. What If We Do Not Have Labeled Data?
    6. Text Classification with Generative Models
      1. Using the Text-to-Text Transfer Transformer
      2. ChatGPT for Classification
    7. Summary
  8. 5. Text Clustering and Topic Modeling
    1. ArXiv’s Articles: Computation and Language
    2. A Common Pipeline for Text Clustering
      1. Embedding Documents
      2. Reducing the Dimensionality of Embeddings
      3. Cluster the Reduced Embeddings
      4. Inspecting the Clusters
    3. From Text Clustering to Topic Modeling
      1. BERTopic: A Modular Topic Modeling Framework
      2. Adding a Special Lego Block
      3. The Text Generation Lego Block
    4. Summary
  9. 6. Prompt Engineering
    1. Using Text Generation Models
      1. Choosing a Text Generation Model
      2. Loading a Text Generation Model
      3. Controlling Model Output
    2. Intro to Prompt Engineering
      1. The Basic Ingredients of a Prompt
      2. Instruction-Based Prompting
    3. Advanced Prompt Engineering
      1. The Potential Complexity of a Prompt
      2. In-Context Learning: Providing Examples
      3. Chain Prompting: Breaking Up the Problem
    4. Reasoning with Generative Models
      1. Chain-of-Thought: Think Before Answering
      2. Self-Consistency: Sampling Outputs
      3. Tree-of-Thought: Exploring Intermediate Steps
    5. Output Verification
      1. Providing Examples
      2. Grammar: Constrained Sampling
    6. Summary
  10. 7. Advanced Text Generation Techniques and Tools
    1. Model I/O: Loading Quantized Models with LangChain
    2. Chains: Extending the Capabilities of LLMs
      1. A Single Link in the Chain: Prompt Template
      2. A Chain with Multiple Prompts
    3. Memory: Helping LLMs to Remember Conversations
      1. Conversation Buffer
      2. Windowed Conversation Buffer
      3. Conversation Summary
    4. Agents: Creating a System of LLMs
      1. The Driving Power Behind Agents: Step-by-Step Reasoning
      2. ReAct in LangChain
    5. Summary
  11. 8. Semantic Search and Retrieval-Augmented Generation
    1. Overview of Semantic Search and RAG
    2. Semantic Search with Language Models
      1. Dense Retrieval
      2. Reranking
      3. Retrieval Evaluation Metrics
    3. Retrieval-Augmented Generation (RAG)
      1. From Search to RAG
      2. Example: Grounded Generation with an LLM API
      3. Example: RAG with Local Models
      4. Advanced RAG Techniques
      5. RAG Evaluation
    4. Summary
  12. 9. Multimodal Large Language Models
    1. Transformers for Vision
    2. Multimodal Embedding Models
      1. CLIP: Connecting Text and Images
      2. How Can CLIP Generate Multimodal Embeddings?
      3. OpenCLIP
    3. Making Text Generation Models Multimodal
      1. BLIP-2: Bridging the Modality Gap
      2. Preprocessing Multimodal Inputs
      3. Use Case 1: Image Captioning
      4. Use Case 2: Multimodal Chat-Based Prompting
    4. Summary
  13. III. Training and Fine-Tuning Language Models
  14. 10. Creating Text Embedding Models
    1. Embedding Models
    2. What Is Contrastive Learning?
    3. SBERT
    4. Creating an Embedding Model
      1. Generating Contrastive Examples
      2. Train Model
      3. In-Depth Evaluation
      4. Loss Functions
    5. Fine-Tuning an Embedding Model
      1. Supervised
      2. Augmented SBERT
    6. Unsupervised Learning
      1. Transformer-Based Sequential Denoising Auto-Encoder
      2. Using TSDAE for Domain Adaptation
    7. Summary
  15. 11. Fine-Tuning Representation Models for Classification
    1. Supervised Classification
      1. Fine-Tuning a Pretrained BERT Model
      2. Freezing Layers
    2. Few-Shot Classification
      1. SetFit: Efficient Fine-Tuning with Few Training Examples
      2. Fine-Tuning for Few-Shot Classification
    3. Continued Pretraining with Masked Language Modeling
    4. Named-Entity Recognition
      1. Preparing Data for Named-Entity Recognition
      2. Fine-Tuning for Named-Entity Recognition
    5. Summary
  16. 12. Fine-Tuning Generation Models
    1. The Three LLM Training Steps: Pretraining, Supervised Fine-Tuning, and Preference Tuning
    2. Supervised Fine-Tuning (SFT)
      1. Full Fine-Tuning
      2. Parameter-Efficient Fine-Tuning (PEFT)
    3. Instruction Tuning with QLoRA
      1. Templating Instruction Data
      2. Model Quantization
      3. LoRA Configuration
      4. Training Configuration
      5. Training
      6. Merge Weights
    4. Evaluating Generative Models
      1. Word-Level Metrics
      2. Benchmarks
      3. Leaderboards
      4. Automated Evaluation
      5. Human Evaluation
    5. Preference-Tuning / Alignment / RLHF
    6. Automating Preference Evaluation Using Reward Models
      1. The Inputs and Outputs of a Reward Model
      2. Training a Reward Model
      3. Training No Reward Model
    7. Preference Tuning with DPO
      1. Templating Alignment Data
      2. Model Quantization
      3. Training Configuration
      4. Training
    8. Summary
  17. Afterword
  18. Index
  19. About the Authors

Product information

  • Title: Hands-On Large Language Models
  • Author(s): Jay Alammar, Maarten Grootendorst
  • Release date: September 2024
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098150969