Chapter 1. Tactics: The Emerging LLM Stack
In this chapter, we share best practices for the core components of the emerging LLM stack: prompting tips to improve quality and reliability, evaluation strategies to assess output, retrieval-augmented generation ideas to improve grounding, and more. We also explore how to design human-in-the-loop (HITL) workflows. While the technology is still rapidly developing, we hope these lessons, the by-product of countless experiments we’ve collectively run, will stand the test of time and help you build and ship robust LLM applications.
Prompting
We recommend starting with prompting when developing new applications. It’s easy to both underestimate and overestimate its importance. It’s underestimated because the right prompting techniques, when used correctly, can get us very far. It’s overestimated because even prompt-based applications require significant engineering around the prompt to work well.
Focus on Getting the Most Out of Fundamental Prompting Techniques
A few prompting techniques have consistently helped improve performance across various models and tasks: n-shot prompts plus in-context learning, chain-of-thought, and providing relevant resources.
The idea of in-context learning via n-shot prompts is to provide the LLM with a few examples that demonstrate the task and align outputs to our expectations. A few tips:
- If n is too low, the model may over-anchor on those specific examples, hurting its ability to generalize. As a rule ...
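To make the idea concrete, here is a minimal sketch of an n-shot prompt for a sentiment-labeling task, written against the OpenAI Python client. The task, the demonstration pairs, and the model name are illustrative assumptions, not examples from this chapter; the structure, a system instruction followed by a handful of user/assistant demonstration turns and then the real query, is the part that matters.

```python
# A minimal sketch of n-shot (in-context) prompting.
# The task, examples, and model name below are placeholders, not from the text.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Demonstrations that show the task and the expected output format.
FEW_SHOT_EXAMPLES = [
    ("The checkout flow kept timing out.", "negative"),
    ("Support resolved my issue within minutes.", "positive"),
    ("The docs are thorough but a bit dry.", "neutral"),
]

def build_messages(query: str) -> list[dict]:
    """Assemble a system instruction, n demonstration pairs, then the real query."""
    messages = [{
        "role": "system",
        "content": "Classify the sentiment of each review as positive, negative, or neutral.",
    }]
    for review, label in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": review})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": query})
    return messages

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=build_messages("Shipping took three weeks and the box arrived damaged."),
    temperature=0,
)
print(response.choices[0].message.content)
```

Formatting the demonstrations as alternating user/assistant turns is one common choice; packing them into a single prompt string works too. Either way, the demonstrations should be representative of the real inputs and should pin down the output format you expect back.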