Optimizing Large Language Models
Published by Pearson
Accelerate LLM Fine-Tuning and Optimize Hardware Resources
- Learn cutting-edge techniques for faster and optimized fine-tuning of large language models
- Focus on practical applications and hands-on experience to efficiently fine-tune models for specific downstream tasks
- Emphasize best practices for managing large datasets, optimizing hardware resources, and deploying models in production environments
This class is designed to provide attendees with the latest techniques and best practices for efficiently fine-tuning large language models. With the increasing availability of large pre-trained language models, it is becoming easier to leverage the power of natural language processing to solve complex tasks such as text generation, sentiment analysis, and language translation. However, fine-tuning these models on larger datasets can be challenging and time-consuming, hindering progress and limiting the effectiveness of the models.
By attending this class, attendees will learn advanced optimization algorithms and data augmentation strategies to speed up the fine-tuning process and improve model performance. Attendees will also gain practical experience in deploying models in production environments and learn best practices for managing large datasets and optimizing hardware resources. Ultimately, this class will enable attendees to take full advantage of the power of large language models and accelerate their progress in natural language processing applications.
What you’ll learn and how you can apply it
- Techniques for efficient fine-tuning of large language models
- Best practices for managing large datasets and optimizing hardware resources
- Practical applications of large language models to solve real-world problems
And you’ll be able to:
- Fine-tune large language models on large datasets
- Optimize hardware resources to train faster
- Apply large language models to solve real-world natural language processing challenges
This live event is for you because...
- You work with data and want to take advantage of the power of large language models
- You may have used a pre-trained model on your laptop and are curious about state-of-the-art techniques for optimizing model training on larger datasets
- You want to use multiple GPUs in your training but are not sure how
Prerequisites
- A working understanding of the foundational principles of deep learning and the basics of LLMs
- Some prior experience using PyTorch
- Experience with Python
Course Set-up
- For a portion of the class, we’ll work interactively on Jupyter notebooks in the cloud via Google Colab. The notebooks, along with the other code from the class, are available at https://github.com/shaankhosla/optimizingllms.
Recommended Preparation
- Watch: Catalyst Conference: NLP with ChatGPT (and other Large Language Models) by Jon Krohn
- Watch: Introduction to Transformer Models for NLP: Using BERT, GPT, and More to Solve Modern Natural Language Processing Tasks by Sinan Ozdemir
- Read: Practical Natural Language Processing by Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta, Harshit Surana
Recommended Follow-up
- Read: Quick Start Guide to Large Language Models by Sinan Ozdemir
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Segment 1: Introduction to Large Language Models (35 minutes)
- Overview of pre-trained language models and their applications
- Explanation of key libraries used in fine-tuning
- Introduction to data augmentation techniques for improving model performance
- Explanation of how to set up code to fine-tune a pre-trained model (see the sketch after this list)
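To give a concrete sense of the setup step, here is a minimal fine-tuning sketch using the Hugging Face Transformers Trainer API. The model name, dataset, and hyperparameters below are illustrative assumptions, not the exact configuration used in the class notebooks.

```python
# A minimal fine-tuning setup sketch with the Hugging Face Trainer API.
# Model name, dataset, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"  # assumed small model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

dataset = load_dataset("imdb")  # assumed example dataset for binary classification

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="finetuned-model",
    per_device_train_batch_size=16,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    tokenizer=tokenizer,  # lets the default collator pad each batch dynamically
)
trainer.train()
```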
Q&A (5 minutes) + Break (5 minutes)
Segment 2: Single GPU Training Techniques (60 minutes)
- Best practices for managing large datasets
- Gradient checkpointing
- Gradient accumulation (combined with mixed precision in the sketch after this list)
- Mixed precision
- Dynamic padding
- Smart batching
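As a taste of the single-GPU techniques covered in this segment, the following is a minimal sketch combining gradient accumulation with mixed precision (AMP) in PyTorch. The toy model, data, and accumulation step count are placeholder assumptions, and the snippet assumes a CUDA-capable GPU.

```python
# A sketch of gradient accumulation combined with mixed precision (AMP) in PyTorch.
# The toy model and data stand in for an LLM and its dataset; values are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Linear(128, 2).cuda()                          # placeholder model
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
data = TensorDataset(torch.randn(256, 128), torch.randint(0, 2, (256,)))
train_loader = DataLoader(data, batch_size=8)

scaler = torch.cuda.amp.GradScaler()                      # scales loss to avoid fp16 underflow
accumulation_steps = 4                                    # effective batch size = 8 * 4 = 32

optimizer.zero_grad()
for step, (inputs, labels) in enumerate(train_loader):
    inputs, labels = inputs.cuda(), labels.cuda()
    with torch.cuda.amp.autocast():                       # run the forward pass in mixed precision
        loss = criterion(model(inputs), labels)
    scaler.scale(loss / accumulation_steps).backward()    # accumulate scaled gradients
    if (step + 1) % accumulation_steps == 0:
        scaler.step(optimizer)                            # unscale gradients and update weights
        scaler.update()
        optimizer.zero_grad()
```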
Break (10 minutes)
Segment 3: Multi-GPU Training Techniques (45 minutes)
- Data parallel
- Distributed data parallel (see the sketch after this list)
- Model parallel
- Overview of model compression techniques for optimizing model size and performance
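For the multi-GPU segment, here is a minimal DistributedDataParallel (DDP) sketch, assuming the script is launched with torchrun (one process per GPU). The toy model and data stand in for a real LLM and dataset.

```python
# A minimal DistributedDataParallel (DDP) sketch, assumed to be launched with
# `torchrun --nproc_per_node=<num_gpus> train_ddp.py`. Model and data are toy placeholders.
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def main():
    dist.init_process_group(backend="nccl")          # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])       # set by torchrun
    torch.cuda.set_device(local_rank)

    model = nn.Linear(128, 2).cuda(local_rank)       # toy model stands in for an LLM
    model = DDP(model, device_ids=[local_rank])      # wrap model; gradients are all-reduced
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    criterion = nn.CrossEntropyLoss()

    data = TensorDataset(torch.randn(256, 128), torch.randint(0, 2, (256,)))
    sampler = DistributedSampler(data)               # each rank sees a distinct shard
    loader = DataLoader(data, batch_size=8, sampler=sampler)

    for inputs, labels in loader:
        inputs, labels = inputs.cuda(local_rank), labels.cuda(local_rank)
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()                              # DDP synchronizes gradients here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```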
Segment 4: Summary and Q&A (20 minutes)
- Summary of key takeaways from the class
- Open Q&A session
Your Instructor
Shaan Khosla
Shaan Khosla is a Senior Data Scientist at Nebula, where he researches, designs, and develops NLP models. He previously worked at Bank of America on an internal machine learning consulting team, where he used LLMs to build proof-of-concept systems for various lines of business. Shaan holds a BSBA in Computer Science and Finance from the University of Miami and is currently completing a master’s degree in Data Science at NYU. He has published multiple peer-reviewed papers applying LLMs, topic modeling, and recommendation systems to the fields of biochemistry and healthcare.