BERT Transformer Architecture for NLP
Published by Pearson
Introduction to Natural Language Processing with Next-Generation Transformer Architectures
This training focuses on how BERT is used for a wide variety of NLP tasks, including text classification, question answering, and sentiment analysis. It begins with an introduction to the necessary concepts, including language models and transformers, and then builds on those concepts to introduce the BERT architecture. We will then examine how BERT is designed as a language model that can be adapted to multiple natural language processing tasks, with hands-on examples of fine-tuning pre-trained BERT models.
BERT is one of the most influential NLP architectures today, and it is closely related to other important deep learning language models such as GPT-3. Both models are derived from the transformer architecture and represent an inflection point in how machines process language and context.
What you’ll learn and how you can apply it
- What language models are and how they learn language and context
- What transformers are and how they are used for NLP and non-NLP tasks
- How BERT is derived from the Transformer architecture
- The steps to using BERT: pre-training and fine-tuning (see the sketch after this list)
- How BERT can be used to solve a variety of NLP tasks
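As a preview of the pre-training/fine-tuning split, here is a minimal sketch using the Hugging Face transformers library (an assumption about tooling; the course materials may use a different stack):

    # Step 1 (pre-training) is already done for us: these published weights
    # were learned with masked language modeling over large unlabeled corpora.
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

    # Step 2 (fine-tuning): attach a fresh classification head on top of the
    # pre-trained encoder; the head's weights are randomly initialized and
    # are trained on a labeled downstream dataset.
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2
    )

    inputs = tokenizer("BERT separates pre-training from fine-tuning.",
                       return_tensors="pt")
    logits = model(**inputs).logits
    print(logits.shape)  # torch.Size([1, 2]) -- one logit per candidate label

The same pre-trained encoder can be reused for many tasks; only the small task-specific head changes.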
This live event is for you because...
- You’re an advanced machine learning engineer with experience in transformers, neural networks, and NLP
- You’re interested in state-of-the-art NLP architectures
- You are comfortable using libraries like TensorFlow or PyTorch
Prerequisites
- Python 3 proficiency and some familiarity with interactive Python environments, including notebooks (Jupyter / Google Colab / Kaggle Kernels)
- (Video) NLP Using Transformer Architectures by Aurélien Géron: https://www.oreilly.com/library/view/natural-language-processing/0636920373605/
- The Illustrated Transformer: https://jalammar.github.io/illustrated-transformer/
- (Live Training) Leveraging NLP and Word Embeddings in Machine Learning Projects by Maryam Jahanshahi: https://www.oreilly.com/live-training/courses/leveraging-nlp-and-word-embeddings-in-machine-learning-projects/0636920469889/
Course Set-up
- A GitHub repository with the slides, code, and Colab links will be provided upon completion
- Attendees will need access to the Colab notebooks linked in the GitHub repository
- Code in the Colab notebooks runs in the cloud, so attendees will not need to install anything on their machines
Recommended Preparation
- Transformers from scratch: http://peterbloem.nl/blog/transformers
- (Video) Introduction to NLP by Bruno Gonçalves: https://www.informit.com/store/natural-language-processing-livelessons-9780135258859
- Google AI Blog: Open Sourcing BERT: https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html
Recommended Follow-up
- (Book) Transformers for Natural Language Processing by Denis Rothman: https://www.oreilly.com/library/view/transformers-for-natural/9781800565791
- BERT Explained: https://towardsml.com/2019/09/17/bert-explained-a-complete-guide-with-theory-and-tutorial/
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Segment 1: Introduction to Transformers and Language Models (30 min)
- Introduction to Attention in Neural Networks (sketched in code after this list)
- Introduction to Transformers
- Introduction to Language Models
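To make the attention concept concrete before the session, the core scaled dot-product attention computation can be sketched in a few lines of PyTorch (a generic illustration, not the course’s exact code):

    import torch
    import torch.nn.functional as F

    def scaled_dot_product_attention(q, k, v):
        # softmax(Q K^T / sqrt(d_k)) V -- the Transformer's building block
        d_k = q.size(-1)
        scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # query/key similarities
        weights = F.softmax(scores, dim=-1)            # attention distribution
        return weights @ v                             # weighted sum of values

    # Toy example: a sequence of 4 tokens, each an 8-dimensional vector.
    q = k = v = torch.randn(4, 8)
    print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([4, 8])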
Q&A and Break (15 min)
Segment 2: Introduction to BERT (60 min)
- What BERT is and how it builds on the Transformer architecture
- How BERT is used for Natural Language Processing
- BERT’s pre-training phase to learn language and context (illustrated in the sketch after this list)
- Fine-tuning BERT for NLP tasks
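A quick way to see the masked-language-modeling objective from BERT’s pre-training phase in action is the Hugging Face fill-mask pipeline (an assumption about tooling; the live demo may differ):

    from transformers import pipeline

    # BERT was pre-trained to predict masked tokens from bidirectional
    # context; this is the heart of how it learns language and context.
    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    for prediction in fill_mask("The capital of France is [MASK]."):
        print(prediction["token_str"], round(prediction["score"], 3))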
Q&A and Break (15 min)
Segment 3: Fine-tuning BERT to perform NLP tasks (75 min)
- Fine-tuning a pre-trained BERT to perform sentiment analysis on tweets
- Fine-tuning a pre-trained BERT to perform question answering with SQuAD (previewed in the sketch after this list)
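As a preview of the question-answering exercise, here is a minimal sketch that runs a publicly available BERT checkpoint already fine-tuned on SQuAD (the exact model, data, and training code used in class may differ):

    from transformers import pipeline

    # A BERT-large checkpoint fine-tuned on SQuAD, hosted on the
    # Hugging Face Hub.
    qa = pipeline(
        "question-answering",
        model="bert-large-uncased-whole-word-masking-finetuned-squad",
    )

    result = qa(
        question="What does fine-tuning adapt?",
        context=("BERT is first pre-trained on unlabeled text and then "
                 "fine-tuned, adapting the pre-trained encoder to a specific "
                 "labeled task such as question answering."),
    )
    print(result["answer"], round(result["score"], 3))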
Q&A and Break (15 min)
Segment 4: Course wrap-up and next steps (30 min)
- Flavors of BERT: BERT-base vs. BERT-large vs. RoBERTa (compared in the sketch after this list)
- Next steps with BERT and Transformers
- Final Q&A
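One simple way to compare these flavors is by parameter count; the sketch below loads the standard Hugging Face checkpoints (downloading them requires internet access and several gigabytes of disk space):

    from transformers import AutoModel

    # Roughly 110M, 340M, and 125M parameters, respectively.
    for name in ["bert-base-uncased", "bert-large-uncased", "roberta-base"]:
        model = AutoModel.from_pretrained(name)
        params = sum(p.numel() for p in model.parameters())
        print(f"{name}: {params / 1e6:.0f}M parameters")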
Your Instructor
Sinan Ozdemir
Sinan Ozdemir is the founder and CTO of LoopGenius, where he uses state-of-the-art AI to help people create and run their businesses. He has lectured on data science at Johns Hopkins University and has authored multiple books, videos, and online courses on data science, machine learning, and generative AI. He also founded the recently acquired Kylie.ai, an enterprise-grade conversational AI platform with RPA capabilities. Sinan most recently published Quick Start Guide to Large Language Models and launched the podcast series AI Unveiled. Ozdemir holds a master’s degree in pure mathematics from Johns Hopkins University.