Chapter 9. Context-Aware Reasoning Applications Using RAG and Agents

In this chapter, you will bring together everything you’ve learned so far to build context-aware reasoning applications. To do this, you will explore retrieval-augmented generation (RAG) and agents. You will also learn about LangChain, an orchestration framework, and the ReAct and PAL reasoning frameworks, which together make RAG and agent workflows much easier to implement and maintain. Both RAG and agents are often key components of a generative AI application.

With RAG, you augment the prompt context with relevant external information to address the knowledge limitations of LLMs and improve the relevance of the model’s generated output. RAG has grown in popularity because it mitigates challenges such as knowledge cutoffs and hallucinations by incorporating dynamic data sources into the prompt context, without the need to continually fine-tune the model as new data arrives in your system.

RAG can be integrated with off-the-shelf foundation models or with fine-tuned and human-aligned models specific to your generative use case and domain.
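
To make the retrieve-augment-generate flow concrete, here is a minimal sketch in Python. The embed_fn, vector_store, and llm objects are hypothetical placeholders for whichever embedding model, vector store, and foundation model you choose; frameworks such as LangChain provide ready-made abstractions for each of these components.

# A minimal RAG sketch. embed_fn, vector_store, and llm are hypothetical
# placeholders for your embedding model, vector store, and foundation model.
def answer_with_rag(question, embed_fn, vector_store, llm, top_k=3):
    # 1. Retrieve: embed the question and look up the most relevant documents.
    query_embedding = embed_fn(question)
    documents = vector_store.search(query_embedding, top_k=top_k)

    # 2. Augment: inject the retrieved passages into the prompt context.
    context = "\n\n".join(doc.text for doc in documents)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

    # 3. Generate: the model grounds its completion in the retrieved context.
    return llm.generate(prompt)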

Note

RAG and fine-tuning can be used together. They are not mutually exclusive.

Next is some general guidance to consider when deciding which technique to apply. If access to external or dynamic data is required, then RAG-based architectures can enable this without continuous fine-tuning, which would become cost prohibitive. Also, RAG-based techniques do not require much ...
