5

Diving into Fine-Tuning through BERT

In Chapter 3, Emergent vs Downstream Tasks: The Unseen Depths of Transformers, we explored tasks that pretrained models can perform efficiently. However, in some cases, a pretrained model will not produce the desired outputs. We could pretrain a model from scratch, as we will see in Chapter 6, Pretraining a Transformer from Scratch through RoBERTa. However, pretraining a model can require large amounts of machine, data, and human resources. The alternative is to fine-tune a transformer model.

This chapter will dive into fine-tuning transformer models using a pretrained Hugging Face BERT model. By the end of the chapter, you should be able to fine-tune other Hugging Face models such as GPT, ...
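As a preview of the kind of workflow this chapter builds toward, here is a minimal sketch of loading a pretrained BERT checkpoint with Hugging Face Transformers and fine-tuning it on a downstream classification task with the Trainer API. The dataset (GLUE CoLA), the hyperparameters, and the output directory are illustrative assumptions, not the chapter's exact configuration.

# A minimal sketch of fine-tuning BERT for sequence classification with the
# Hugging Face Trainer API. Dataset, hyperparameters, and output directory
# are illustrative assumptions, not this chapter's exact configuration.
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

# Load a pretrained BERT checkpoint and its matching tokenizer
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Load and tokenize an example dataset (CoLA: grammatical acceptability judgments)
dataset = load_dataset("glue", "cola")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

# Standard Trainer setup: the pretrained weights are updated on the new task
args = TrainingArguments(
    output_dir="bert-finetuned-cola",  # hypothetical output directory
    num_train_epochs=3,
    per_device_train_batch_size=16,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()

The key point of this pattern is that the pretrained weights are reused as a starting point, so the task-specific training run needs far less data and compute than pretraining from scratch.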
