Large Language Models

Book description

This book covers the science and applications of Large Language Models (LLMs). You'll discover the common thread that runs through some of the most revolutionary recent applications of artificial intelligence (AI): from conversational systems such as ChatGPT and BARD to machine translation, text summarization, question answering, and much more.

Table of contents

  1. Cover
  2. Half Title
  3. Title Page
  4. Copyright Page
  5. Dedication
  6. Table of Contents
  7. Preface
  8. Introduction
  9. Author Biography
  10. Chapter 1 ◾ Introduction
    1. 1.1 Generative Artificial Intelligence
      1. 1.1.1 Understanding the Mechanisms of Generative AI
      2. 1.1.2 Focus Areas in Generative AI
      3. 1.1.3 Applications
    2. 1.2 Generative Language Models
      1. 1.2.1 Popular Types of LLMs
    3. 1.3 Conclusions
    4. Notes
  11. Chapter 2 ◾ Fundamentals
    1. 2.1 Introduction
    2. 2.2 Autoregressive Language Models
    3. 2.3 Statistical Language Models
    4. 2.4 Neural Language Models
      1. 2.4.1 Pre-trained Language Models
    5. 2.5 Large Language Models
    6. 2.6 Word Embedding Models
    7. 2.7 Recurrent Neural Networks
      1. 2.7.1 Simple Recurrent Neural Networks
      2. 2.7.2 Long Short-Term Memory Networks
    8. 2.8 Autoencoders
      1. 2.8.1 The Information Bottleneck
      2. 2.8.2 Latent Variables
      3. 2.8.3 Autoencoder Architecture
      4. 2.8.4 Types of Autoencoders
    9. 2.9 Generative Adversarial Networks
      1. 2.9.1 The Generative Model
      2. 2.9.2 The Discriminative Model
    10. 2.10 Attention Models
      1. 2.10.1 Encoder-Decoder Paradigm
      2. 2.10.2 Attention to Sequence Models
    11. 2.11 Transformers
      1. 2.11.1 Encoder Layer
      2. 2.11.2 Positional Encoding
      3. 2.11.3 Residual Connections
      4. 2.11.4 Decoder Layer
      5. 2.11.5 Linear Layer and SoftMax
      6. 2.11.6 Training
      7. 2.11.7 Inference
      8. 2.11.8 Loss Function
    12. 2.12 Conclusions
  12. Chapter 3 ◾ Large Language Models
    1. 3.1 Introduction
      1. 3.1.1 Emergent Skills
      2. 3.1.2 Skills Enhancement Techniques
      3. 3.1.3 Corpora
      4. 3.1.4 Types of Training
      5. 3.1.5 Types of Learning
      6. 3.1.6 Types of Tokenization
    2. 3.2 BERT
      1. 3.2.1 Operation
      2. 3.2.2 Architecture
      3. 3.2.3 Model Input
      4. 3.2.4 Model Output
      5. 3.2.5 BERT-Based Pre-Trained Models
    3. 3.3 GPT
      1. 3.3.1 The GPT and GPT-2 Models
      2. 3.3.2 The GPT-3 Model
      3. 3.3.3 The GPT-4 Model
      4. 3.3.4 Reinforcement Learning from Human Feedback
    4. 3.4 PALM
      1. 3.4.1 Vocabulary
      2. 3.4.2 Training
      3. 3.4.3 PaLM-2
    5. 3.5 LLAMA
      1. 3.5.1 Pre-Training Data
      2. 3.5.2 Architecture
    6. 3.6 Language Model for Dialogue Applications (LAMDA)
      1. 3.6.1 Objectives and Metrics
      2. 3.6.2 Pre-Training of LaMDA
    7. 3.7 Megatron
      1. 3.7.1 Training Data
    8. 3.8 Other LLMs
    9. 3.9 Conclusions
    10. Notes
  13. Chapter 4 ◾ Model Evaluation
    1. 4.1 Introduction
    2. 4.2 Evaluation Tasks
      1. 4.2.1 Basic Evaluation Tasks
      2. 4.2.2 Advanced Assessment Tasks
      3. 4.2.3 Regulatory Compliance Tasks
    3. 4.3 Metrics and Benchmarks
    4. 4.4 Benchmark Datasets
      1. 4.4.1 SQuAD (Stanford Question-Answering Dataset)
      2. 4.4.2 GLUE (General Language Understanding Evaluation)
      3. 4.4.3 SNLI (Stanford Natural Language Inference)
      4. 4.4.4 ARC (Abstraction and Reasoning Corpus)
    5. 4.5 LLM Assessment
    6. 4.6 Conclusions
    7. Notes
  14. Chapter 5 ◾ Applications
    1. 5.1 Introduction
    2. 5.2 Sentiment Classification
      1. 5.2.1 Training
      2. 5.2.2 Testing and Validation
    3. 5.3 Semantic Search
    4. 5.4 Reasoning with Language Agents
    5. 5.5 Causal Inference
    6. 5.6 Natural Language Access to Databases
    7. 5.7 Loading and Querying Your Own Data
    8. 5.8 Fine-Tuning a Model with Your Own Data
    9. 5.9 Prompt Design and Optimization
    10. 5.10 ChatGPT Conversational System
      1. 5.10.1 Performance Evaluation
    11. 5.11 BARD Conversational System
    12. 5.12 Conclusions
    13. Notes
  15. Chapter 6 ◾ Issues and Perspectives
    1. 6.1 Introduction
    2. 6.2 Emerging Skills
      1. 6.2.1 What Causes These Emergent Skills and What Do They Mean?
    3. 6.3 LLMs in Production
    4. 6.4 Human-LLM Alignment
    5. 6.5 Ethics
    6. 6.6 Regulatory Issues
    7. 6.7 Complexity
    8. 6.8 Risks
    9. 6.9 Limitations
    10. 6.10 Conclusions
  16. Bibliography
  17. Index

Product information

  • Title: Large Language Models
  • Author(s): John Atkinson-Abutridy
  • Release date: October 2024
  • Publisher(s): CRC Press
  • ISBN: 9781040134306