Book description
Publisher's Note: A new edition of this book is out now that includes working with GPT-3 and comparing the results with other models. It includes even more use cases, such as casual language analysis and computer vision tasks, as well as an introduction to OpenAI's Codex.
Key Features
- Build and implement state-of-the-art language models, such as the original Transformer, BERT, T5, and GPT-2, using concepts that outperform classical deep learning models
- Go through hands-on applications in Python using Google Colaboratory Notebooks with nothing to install on a local machine
- Test transformer models on advanced use cases
Book Description
The transformer architecture has proved revolutionary, outperforming the classical RNN and CNN models in use today. With an apply-as-you-learn approach, Transformers for Natural Language Processing investigates in detail how deep learning with transformers is applied to machine translation, speech-to-text, text-to-speech, language modeling, question answering, and many other NLP domains.
The book takes you through NLP with Python and examines prominent transformer models and datasets created by pioneers such as Google, Facebook, Microsoft, OpenAI, and Hugging Face.
The book trains you in three stages. The first stage introduces you to transformer architectures, starting with the original transformer, before moving on to RoBERTa, BERT, and DistilBERT models. You will discover training methods for smaller transformers that can outperform GPT-3 in some cases. In the second stage, you will apply transformers to Natural Language Understanding (NLU) and Natural Language Generation (NLG). Finally, the third stage will help you grasp advanced language understanding techniques such as optimizing social network datasets and identifying fake news.
By the end of this NLP book, you will understand transformers from a cognitive science perspective and be proficient in applying pretrained transformer models released by tech giants to various datasets.
What you will learn
- Use the latest pretrained transformer models
- Grasp the workings of the original Transformer, GPT-2, BERT, T5, and other transformer models
- Create language understanding Python programs using concepts that outperform classical deep learning models
- Use a variety of NLP platforms, including Hugging Face, Trax, and AllenNLP
- Apply Python, TensorFlow, and Keras programs to sentiment analysis, text summarization, speech recognition, machine translation, and more
- Measure the productivity of key transformers to define their scope, potential, and limits in production
Who this book is for
Since the book does not teach basic programming, you must be familiar with neural networks, Python, PyTorch, and TensorFlow in order to learn their implementation with Transformers. Readers who can benefit the most from this book include experienced deep learning & NLP practitioners and data analysts & data scientists who want to process the increasing amounts of language-driven data.
Table of contents
- Preface
- Getting Started with the Model Architecture of the Transformer
- Fine-Tuning BERT Models
  - The architecture of BERT
  - Fine-tuning BERT
    - Activating the GPU
    - Installing the Hugging Face PyTorch interface for BERT
    - Importing the modules
    - Specifying CUDA as the device for torch
    - Loading the dataset
    - Creating sentences, label lists, and adding BERT tokens
    - Activating the BERT tokenizer
    - Processing the data
    - Creating attention masks
    - Splitting data into training and validation sets
    - Converting all the data into torch tensors
    - Selecting a batch size and creating an iterator
    - BERT model configuration
    - Loading the Hugging Face BERT uncased base model
    - Optimizer grouped parameters
    - The hyperparameters for the training loop
    - The training loop
    - Training evaluation
    - Predicting and evaluating using the holdout dataset
    - Evaluating using Matthews Correlation Coefficient
    - The score of individual batches
    - Matthews evaluation for the whole dataset
  - Summary
  - Questions
  - References
- Pretraining a RoBERTa Model from Scratch
  - Training a tokenizer and pretraining a transformer
  - Building KantaiBERT from scratch
    - Step 1: Loading the dataset
    - Step 2: Installing Hugging Face transformers
    - Step 3: Training a tokenizer
    - Step 4: Saving the files to disk
    - Step 5: Loading the trained tokenizer files
    - Step 6: Checking resource constraints: GPU and CUDA
    - Step 7: Defining the configuration of the model
    - Step 8: Reloading the tokenizer in transformers
    - Step 9: Initializing a model from scratch
    - Step 10: Building the dataset
    - Step 11: Defining a data collator
    - Step 12: Initializing the trainer
    - Step 13: Pretraining the model
    - Step 14: Saving the final model (+tokenizer + config) to disk
    - Step 15: Language modeling with FillMaskPipeline
  - Next steps
  - Summary
  - Questions
  - References
- Downstream NLP Tasks with Transformers
- Machine Translation with the Transformer
- Text Generation with OpenAI GPT-2 and GPT-3 Models
  - The rise of billion-parameter transformer models
  - Transformers, reformers, PET, or GPT?
  - It's time to make a decision
  - The architecture of OpenAI GPT models
  - Text completion with GPT-2
    - Step 1: Activating the GPU
    - Step 2: Cloning the OpenAI GPT-2 repository
    - Step 3: Installing the requirements
    - Step 4: Checking the version of TensorFlow
    - Step 5: Downloading the 345M parameter GPT-2 model
    - Steps 6-7: Intermediate instructions
    - Steps 7b-8: Importing and defining the model
    - Step 9: Interacting with GPT-2
  - Training a GPT-2 language model
  - Context and completion examples
  - Generating music with transformers
  - Summary
  - Questions
  - References
- Applying Transformers to Legal and Financial Documents for AI Text Summarization
- Matching Tokenizers and Datasets
- Semantic Role Labeling with BERT-Based Transformers
- Let Your Data Do the Talking: Story, Questions, and Answers
- Detecting Customer Emotions to Make Predictions
- Analyzing Fake News with Transformers
- Appendix: Answers to the Questions
  - Chapter 1, Getting Started with the Model Architecture of the Transformer
  - Chapter 2, Fine-Tuning BERT Models
  - Chapter 3, Pretraining a RoBERTa Model from Scratch
  - Chapter 4, Downstream NLP Tasks with Transformers
  - Chapter 5, Machine Translation with the Transformer
  - Chapter 6, Text Generation with OpenAI GPT-2 and GPT-3 Models
  - Chapter 7, Applying Transformers to Legal and Financial Documents for AI Text Summarization
  - Chapter 8, Matching Tokenizers and Datasets
  - Chapter 9, Semantic Role Labeling with BERT-Based Transformers
  - Chapter 10, Let Your Data Do the Talking: Story, Questions, and Answers
  - Chapter 11, Detecting Customer Emotions to Make Predictions
  - Chapter 12, Analyzing Fake News with Transformers
- Other Books You May Enjoy
- Index
Product information
- Title: Transformers for Natural Language Processing
- Author(s):
- Release date: January 2021
- Publisher(s): Packt Publishing
- ISBN: 9781800565791