Pretrain Vision and Large Language Models in Python

Book description

Master the art of training vision and large language models with conceptual fundamentals and industry-expert guidance. Learn about AWS services and design patterns, with relevant coding examples.

Key Features

  • Learn to develop, train, tune, and apply foundation models with optimized end-to-end pipelines
  • Explore large-scale distributed training for models and datasets with AWS and SageMaker examples
  • Evaluate, deploy, and operationalize your custom models with bias detection and pipeline monitoring

Foundation models have forever changed machine learning. From BERT to ChatGPT, CLIP to Stable Diffusion, when billions of parameters are combined with large datasets and hundreds to thousands of GPUs, the result is nothing short of record-breaking. The recommendations, advice, and code samples in this book will help you pretrain and fine-tune your own foundation models from scratch on AWS and Amazon SageMaker, while applying them to hundreds of use cases across your organization.

With advice from seasoned AWS and machine learning expert Emily Webber, this book helps you learn everything you need to go from project ideation to dataset preparation, training, evaluation, and deployment for large language, vision, and multimodal models. With step-by-step explanations of essential concepts and practical examples, you’ll go from mastering the concept of pretraining to preparing your dataset and model, configuring your environment, training, fine-tuning, evaluating, deploying, and optimizing your foundation models.

You will learn how to apply the scaling laws when distributing your model and dataset across multiple GPUs, remove bias, achieve high throughput, and build deployment pipelines.

By the end of this book, you’ll be well equipped to embark on your own project to pretrain and fine-tune the foundation models of the future.

What you will learn

  • Find the right use cases and datasets for pretraining and fine-tuning
  • Prepare for large-scale training with custom accelerators and GPUs
  • Configure environments on AWS and SageMaker to maximize performance
  • Select hyperparameters based on your model and constraints
  • Distribute your model and dataset using many types of parallelism
  • Avoid pitfalls with job restarts, intermittent health checks, and more
  • Evaluate your model with quantitative and qualitative insights
  • Deploy your models with runtime improvements and monitoring pipelines

Who this book is for

If you’re a machine learning researcher or enthusiast who wants to start a foundation modeling project, this book is for you. Applied scientists, data scientists, machine learning engineers, solution architects, product managers, and students will all benefit from this book. Intermediate Python skills are a must, along with introductory knowledge of cloud computing. A strong grasp of deep learning fundamentals is needed, though advanced topics are explained along the way. The content covers advanced machine learning and cloud techniques, explaining them in an actionable, easy-to-understand way.

Table of contents

  1. Pretrain Vision and Large Language Models in Python
  2. Foreword
  3. Contributors
  4. About the author
  5. Acknowledgment
  6. About the reviewer
  7. Preface
    1. Who is this book for?
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Conventions used
    6. Get in touch
    7. Share Your Thoughts
    8. Download a free PDF copy of this book
  8. Part 1: Before Pretraining
  9. Chapter 1: An Introduction to Pretraining Foundation Models
    1. The art of pretraining and fine-tuning
    2. The Transformer model architecture and self-attention
    3. State-of-the-art vision and language models
      1. Top vision models as of April 2023
      2. Contrastive pretraining and natural language supervision
      3. Top language models as of April 2023
      4. Language technique spotlight – causal modeling and the scaling laws
    4. Encoders and decoders
    5. Summary
    6. References
  10. Chapter 2: Dataset Preparation: Part One
    1. Finding a dataset and use case for foundation modeling
      1. Top pretraining use cases by industry
    2. Delta – how different is your dataset?
      1. Use the scaling laws to size your datasets
      2. Fundamentals – scaling laws of neural language models
    3. Bias detection and mitigation
    4. Enhancing your dataset – multilingual, multimodal, and augmentations
    5. Summary
    6. References
  11. Chapter 3: Model Preparation
    1. Finding your best base model
      1. Starting with the smallest base model you can
      2. Trade-off – simplicity versus complexity
    2. Finding your pretraining loss function
      1. Pretraining loss functions in vision – ViT and CoCa
      2. Pretraining loss functions in language – Alexa Teacher Model
      3. Changing your pretraining loss function
    3. Solving for your model size
      1. Practical approaches to solving for your model size
      2. Not all scaling laws are created equal
    4. Planning future experiments
    5. Summary
    6. References
  12. Part 2: Configure Your Environment
  13. Chapter 4: Containers and Accelerators on the Cloud
    1. What are accelerators and why do they matter?
    2. Getting ready to use your accelerators
      1. How to use accelerators on AWS – Amazon SageMaker
    3. Optimizing accelerator performance
      1. Hyperparameters
      2. Infrastructure optimizations for accelerators on AWS
    4. Troubleshooting accelerator performance
    5. Summary
    6. References
  14. Chapter 5: Distribution Fundamentals
    1. Understanding key concepts – data and model parallelism
      1. What data parallel is all about
      2. What model parallel is all about
    2. Combining model and data parallel
    3. Distributed training on Amazon SageMaker
      1. Distributed training software
      2. SM DDP
      3. SMP library
    4. Advanced techniques to reduce GPU memory
      1. Tensor parallelism
      2. Optimizer state sharding
      3. Activation checkpointing
      4. Sharded data parallelism
    5. Bringing it all home with examples from models today
      1. Stable Diffusion – data parallelism at scale
      2. GPT-3 – model and data parallelism at scale
    6. Summary
    7. References
  15. Chapter 6: Dataset Preparation: Part Two, the Data Loader
    1. Introducing the data loader in Python
    2. Building and testing your own data loader – a case study from Stable Diffusion
    3. Creating embeddings – tokenizers and other key steps for smart features
    4. Optimizing your data pipeline on Amazon SageMaker
    5. Transforming deep learning datasets at scale on AWS
    6. Summary
    7. References
  16. Part 3: Train Your Model
  17. Chapter 7: Finding the Right Hyperparameters
    1. Hyperparameters – batch size, learning rate, and more
      1. Key hyperparameters in vision and language
    2. Tuning strategies
    3. Hyperparameter tuning for foundation models
    4. Scaling up as a function of world size with SageMaker
      1. Tuning on a sample of your data and updating based on world size
    5. Summary
    6. References
  18. Chapter 8: Large-Scale Training on SageMaker
    1. Optimizing your script for SageMaker training
      1. Importing packages
      2. Argument parsing
    2. Top usability features for SageMaker training
      1. Warm pools for rapid experimentation
      2. SSM and SSH into training instances
      3. Track jobs and experiments to replicate results
    3. Summary
    4. References
  19. Chapter 9: Advanced Training Concepts
    1. Evaluating and improving throughput
      1. Calculating model TFLOPS
    2. Using Flash Attention to speed up your training runs
    3. Speeding up your jobs with compilation
      1. Integrating compilation into your PyTorch scripts
    4. Amazon SageMaker Training Compiler and Neo
      1. Best practices for compilation
    5. Running compiled models on Amazon’s Trainium and Inferentia custom hardware
    6. Solving for an optimal training time
    7. Summary
    8. References
  20. Part 4: Evaluate Your Model
  21. Chapter 10: Fine-Tuning and Evaluating
    1. Fine-tuning for language, text, and everything in between
      1. Fine-tuning a language-only model
      2. Fine-tuning vision-only models
      3. Fine-tuning vision-language models
    2. Evaluating foundation models
      1. Model evaluation metrics for vision
      2. Model evaluation metrics in language
      3. Model evaluation metrics in joint vision-language tasks
      4. Incorporating the human perspective with labeling through SageMaker Ground Truth
    3. Reinforcement learning from human feedback
    4. Summary
    5. References
  22. Chapter 11: Detecting, Mitigating, and Monitoring Bias
    1. Detecting bias in ML models
      1. Detecting bias in large vision and language models
    2. Mitigating bias in vision and language models
      1. Bias mitigation in language – counterfactual data augmentation and fair loss functions
      2. Bias mitigation in vision – reducing correlation dependencies and solving sampling issues
    3. Monitoring bias in ML models
    4. Detecting, mitigating, and monitoring bias with SageMaker Clarify
    5. Summary
    6. References
  23. Chapter 12: How to Deploy Your Model
    1. What is model deployment?
    2. What is the best way to host my model?
      1. Model deployment options on AWS with SageMaker
    3. Why should I shrink my model, and how?
      1. Model compilation
      2. Knowledge distillation
      3. Quantization
    4. Hosting distributed models on SageMaker
    5. Model servers and end-to-end hosting optimizations
    6. Summary
    7. References
  24. Part 5: Deploy Your Model
  25. Chapter 13: Prompt Engineering
    1. Prompt engineering – the art of getting more with less
    2. From few- to zero-shot learning
    3. Text-to-image prompt engineering tips
    4. Image-to-image prompt engineering tips
      1. Upscaling
      2. Masking
      3. Prompting for object-to-image with DreamBooth
    5. Prompting large language models
      1. Instruction fine-tuning
      2. Chain-of-thought prompting
      3. Summarization
      4. Defending against prompt injections and jailbreaking
    6. Advanced techniques – prefix and prompt tuning
      1. Prefix tuning
      2. Prompt tuning
    7. Summary
    8. References
  26. Chapter 14: MLOps for Vision and Language
    1. What is MLOps?
      1. Common MLOps pipelines
    2. Continuous integration and continuous deployment
    3. Model monitoring and human-in-the-loop
    4. MLOps for foundation models
      1. MLOps for vision
    5. AWS offerings for MLOps
      1. A quick introduction to SageMaker Pipelines
    6. Summary
    7. References
  27. Chapter 15: Future Trends in Pretraining Foundation Models
    1. Techniques for building applications for LLMs
      1. Building interactive dialogue apps with open-source stacks
      2. Using RAG to ensure high accuracy in LLM applications
      3. Is generation the new classification?
      4. Human-centered design for building applications with LLMs
    2. Other generative modalities
    3. AWS offerings in foundation models
    4. The future of foundation models
    5. The future of pretraining
    6. Summary
    7. References
  28. Index
    1. Why subscribe?
  29. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Share Your Thoughts
    3. Download a free PDF copy of this book

Product information

  • Title: Pretrain Vision and Large Language Models in Python
  • Author(s): Emily Webber, Andrea Olgiati
  • Release date: May 2023
  • Publisher(s): Packt Publishing
  • ISBN: 9781804618257