Chapter 3. Moving Toward Chat

In the previous chapter, you learned about the generative pre-trained transformer (GPT) architecture. How these models are trained drastically influences their behavior. A base model, for example, has gone through only the pre-training process: it has been trained on billions of arbitrary documents from the internet, so if you prompt it with the first half of a document, it will generate a plausible-sounding completion for that document. This behavior alone can be quite useful, and throughout this book we will show how you can “trick” such a model into accomplishing all sorts of tasks besides pure document completion.
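The core of the "trick" is to phrase a task so that the natural continuation of the document is the answer you want. As a minimal sketch (the helper function, the English-to-French task, and the example strings here are illustrative assumptions, not material from the book), a translation request can be dressed up as a partially written glossary that the base model then completes:

```python
def completion_prompt(task_description, examples, query):
    """Frame a task as a half-finished document for a base model.

    The model sees a description plus worked examples, so the most
    plausible continuation of the text after the final "French:" label
    is the answer to the unanswered query.
    (Hypothetical helper for illustration only.)
    """
    lines = [task_description, ""]
    for source, target in examples:
        lines.append(f"English: {source}")
        lines.append(f"French: {target}")
        lines.append("")
    lines.append(f"English: {query}")
    lines.append("French:")  # the base model's completion supplies the answer
    return "\n".join(lines)

prompt = completion_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("dog", "chien")],
    "cat",
)
print(prompt)
```

Nothing about the model changes here; only the prompt does. The base model is still doing pure document completion, but the document has been constructed so that completing it performs the task.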

However, for a number of reasons, base models can be difficult to use in an application setting. For one thing, because a base model has been trained on arbitrary documents from the internet, it is equally capable of mimicking both the light side and the dark side of the internet. ...

