Hands-on Natural Language Generation and GPT
Published by Pearson
Introduction to Transformers and the GPT Architecture for NLP
This training will focus on how the GPT family of models is used for NLP tasks, including abstractive text summarization and natural language generation. The training will begin with an introduction to the necessary concepts, including masked self-attention, language models, and transformers, and then build on those concepts to introduce the GPT architecture. We will then move into how GPT is used for multiple natural language processing tasks, with hands-on examples of using pre-trained GPT-2 models as well as fine-tuning these models on custom corpora. We will also use modern GPT models like ChatGPT and GPT-3 to see how to engineer prompts effectively and efficiently for use in production.
GPT models are among the most relevant NLP architectures today, and they are closely related to other important NLP deep learning models like BERT. Both of these models are derived from the transformer architecture and represent an inflection point in how machines process language and context.
The Natural Language Processing with Next-Generation Transformer Architectures series of online trainings provides a comprehensive overview of state-of-the-art natural language processing (NLP) models, including GPT and BERT, which are derived from the modern attention-driven transformer architecture, and the applications these models are used for today. All of the trainings in the series blend theory and application through a combination of visual mathematical explanations, straightforward, applicable Python examples within hands-on Jupyter notebook demos, and comprehensive case studies featuring modern problems solvable by NLP models. (Note that at any given time, only a subset of these classes will be scheduled and open for registration.)
What you’ll learn and how you can apply it
- What transformers are and how they are used for NLP and non-NLP tasks
- How GPT models are derived from the Transformer
- How GPT is pre-trained to learn language and context
- How to use pre-trained GPT models to perform NLP tasks
- How to fine-tune GPT models on custom corpora
This live event is for you because...
- You’re an advanced Machine Learning Engineer with experience in Transformers, Neural Networks, and NLP
- You’re interested in state-of-the-art NLP Architecture
- You are comfortable using libraries like TensorFlow or PyTorch
Prerequisites
- Python 3 proficiency and some familiarity with interactive Python environments, including notebooks (Jupyter / Google Colab / Kaggle Kernels).
Course Set-up
- A GitHub repository with the slides / code / links
- Attendees will need access to the notebooks in the GitHub repository
Recommended Preparation
- Attend: Deploying GPT and Large Language Models and Hands-on Natural Language Generation and GPT by Sinan Ozdemir
- Watch: Natural Language Processing (NLP) by Bruno Goncalves
- Attend: Natural Language Processing (NLP) for Everyone by Bruno Goncalves
- Watch: Natural Language Processing Using Transformer Architectures by Aurélien Géron
Recommended Follow-up
- Watch: Introduction to Transformer Models for NLP by Sinan Ozdemir
- Read: Transformers for Natural Language Processing - Second Edition by Denis Rothman, Antonio Gulli
- Audio: AI Unveiled (Audio) by Sinan Ozdemir
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Segment 1: Introduction to Transformers and Language Models (30 mins)
- Introduction to Attention in Neural Networks
- Introduction to Transformers
- Introduction to Language Models
Segment 2: Introduction to GPT (45 mins)
- What GPT is and how it’s built on the Transformer architecture
- Auto-regressive models and masked self-attention (see the short sketch after this list)
- How GPT is used for Natural Language Processing
- GPT’s pre-training phase to learn language and context
- How to fine-tune GPT to perform NLP tasks
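To make masked self-attention concrete before the hands-on segments, here is a minimal sketch of a causal attention mask in PyTorch; the sequence length and random scores are illustrative and not taken from the course materials:

```python
import torch

# Toy attention scores for a sequence of 5 tokens
scores = torch.randn(5, 5)

# Causal (look-ahead) mask: position i may only attend to positions <= i
causal_mask = torch.tril(torch.ones(5, 5, dtype=torch.bool))

# Future positions get -inf so softmax assigns them zero weight
masked_scores = scores.masked_fill(~causal_mask, float("-inf"))
attention_weights = torch.softmax(masked_scores, dim=-1)

print(attention_weights)  # the upper triangle is all zeros
```

This masking is what makes GPT auto-regressive: each token's representation can only depend on tokens that came before it.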
Break: 10 minutes
Q&A: 15 minutes
Segment 3: Using a pre-trained GPT-2 model to perform NLP tasks (30 mins)
- Using the transformers package to load the pre-trained GPT-2 model
- Inspecting the parameters of GPT-2
- Using the pre-trained GPT-2 model to perform text generation
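As a preview of this segment, here is a minimal sketch assuming the Hugging Face transformers package and the small "gpt2" checkpoint; the prompt and sampling settings are illustrative choices, not necessarily the ones used in class:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the pre-trained GPT-2 tokenizer and language-modeling head
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Inspect the parameter count
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")

# Generate a continuation for a prompt
inputs = tokenizer("Natural language generation is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=40,
    do_sample=True,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Sampling (do_sample with top_p) rather than greedy decoding is a common choice for open-ended text generation, since it produces more varied continuations.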
Segment 4: Fine-tuning GPT-2 to perform NLP tasks (45 mins)
- Fine-tuning a pre-trained GPT-2 on a custom corpus
- Using a fine-tuned GPT-2 to perform abstractive text summarization
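Below is a minimal sketch of the fine-tuning workflow, assuming the Hugging Face transformers Trainer API and a hypothetical plain-text corpus file my_corpus.txt; the hyperparameters are illustrative, not the exact ones used in class:

```python
from transformers import (GPT2LMHeadModel, GPT2Tokenizer, TextDataset,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Chunk the custom corpus into fixed-length blocks for causal language modeling
train_dataset = TextDataset(tokenizer=tokenizer, file_path="my_corpus.txt", block_size=128)
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="./gpt2-finetuned",
    num_train_epochs=1,
    per_device_train_batch_size=2,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)
trainer.train()
trainer.save_model("./gpt2-finetuned")
```

Setting mlm=False tells the collator to build causal language-modeling labels (predict the next token) rather than BERT-style masked labels, which is what GPT-2 expects.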
Break: 10 minutes
Q&A: 15 minutes
Segment 5: Hands-on GPT-3 and ChatGPT (30 mins)
- Understanding the differences between GPT-3 and ChatGPT
- Prompt Engineering
- Using modern GPT in production
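As a rough preview, here is a minimal sketch of prompting GPT-3 and ChatGPT, assuming the openai Python package (pre-1.0 interface), a valid API key, and model names available at the time; the prompts are illustrative placeholders:

```python
import openai  # pre-1.0 interface of the openai package

openai.api_key = "sk-..."  # placeholder API key

# GPT-3 style completion: the prompt carries all of the instructions
completion = openai.Completion.create(
    model="text-davinci-003",
    prompt="Summarize the following review in one sentence:\n\n<review text>",
    max_tokens=60,
)
print(completion["choices"][0]["text"])

# ChatGPT style chat completion: instructions are split across system and user roles
chat = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a concise summarization assistant."},
        {"role": "user", "content": "Summarize this review in one sentence: <review text>"},
    ],
)
print(chat["choices"][0]["message"]["content"])
```

The chat format's separation of system instructions from user input is the starting point for the prompt-engineering patterns covered in this segment.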
Final Q&A: 10 minutes
Your Instructor
Sinan Ozdemir
Sinan Ozdemir is founder and CTO of LoopGenius, where he uses state-of-the-art AI to help people create and run their businesses. He has lectured in data science at Johns Hopkins University and authored multiple books, videos, and numerous online courses on data science, machine learning, and generative AI. He also founded the recently acquired Kylie.ai, an enterprise-grade conversational AI platform with RPA capabilities. Sinan most recently published Quick Start Guide to Large Language Models and launched a podcast audio series, AI Unveiled. Ozdemir holds a master’s degree in pure mathematics from Johns Hopkins University.