Building AI Intensive Python Applications

Book description

Master retrieval-augmented generation architecture and fine-tune your AI stack, while discovering real-world use cases and best practices for creating powerful AI apps.

Key Features

  • Get to grips with the fundamentals of LLMs, vector databases, and Python frameworks
  • Implement effective retrieval-augmented generation strategies with MongoDB Atlas
  • Optimize AI models for performance and accuracy with model compression and deployment optimization
  • Purchase of the print or Kindle book includes a free PDF eBook

Book Description

The era of generative AI is upon us, and this book serves as a roadmap to harness its full potential. With its help, you’ll learn the core components of the AI stack: large language models (LLMs), vector databases, and Python frameworks, and see how these technologies work together to create intelligent applications.

The chapters cover best practices for data preparation, model selection, and fine-tuning, and teach advanced techniques such as retrieval-augmented generation (RAG) to overcome common challenges such as hallucinations and data leakage. You’ll get a solid understanding of vector databases, implement effective vector search strategies, refine models for accuracy, and optimize performance. You’ll also learn to identify and address AI failures so that your applications stay reliable, and to evaluate and improve the output of LLMs to enhance its relevance.

By the end of this book, you’ll be well equipped to build sophisticated AI applications that deliver real-world value.

What you will learn

  • Understand the architecture and components of the generative AI stack
  • Explore the role of vector databases in enhancing AI applications
  • Master Python frameworks for AI development
  • Implement Vector Search in AI applications
  • Find out how to effectively evaluate LLM output
  • Overcome common failures and challenges in AI development

Who this book is for

This book is for software engineers and developers looking to build intelligent applications using generative AI. While the book is suitable for beginners, a basic understanding of Python programming is required to make the most of it.

Table of contents

  1. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Conventions used
    6. Get in touch
    7. Download a free PDF copy of this book
  2. Chapter 1: Getting Started with Generative AI
    1. Technical requirements
    2. Defining the terminology
    3. The generative AI stack
      1. Python and GenAI
      2. OpenAI API
      3. MongoDB with Vector Search
    4. Important features of generative AI
      1. Why use generative AI?
      2. The ethics and risks of GenAI
    5. Summary
  3. Chapter 2: Building Blocks of Intelligent Applications
    1. Technical requirements
    2. Defining intelligent applications
      1. The building blocks of intelligent applications
    3. LLMs – reasoning engines for intelligent apps
      1. Use cases for LLM reasoning engines
      2. Diverse capabilities of LLMs
      3. Multi-modal language models
      4. A paradigm shift in AI development
    4. Embedding models and vector databases – semantic long-term memory
      1. Embedding models
      2. Vector databases
      3. Model hosting
    5. Your (soon-to-be) intelligent app
      1. Sample application – RAG chatbot
      2. Implications of intelligent applications for software engineering
    6. Summary
  4. Part 1: Foundations of AI: LLMs, Embedding Models, Vector Databases, and Application Design
  5. Chapter 3: Large Language Models
    1. Technical requirements
    2. Probabilistic framework
      1. n-gram language models
    3. Machine learning for language modelling
      1. Artificial neural networks
      2. Training an artificial neural network
    4. ANNs for natural language processing
      1. Tokenization
      2. Embedding
      3. Predicting probability distributions
    5. Dealing with sequential data
      1. Recurrent neural networks
      2. Transformer architecture
    6. LLMs in practice
      1. The evolving field of LLMs
      2. Prompting, fine-tuning, and RAG
    7. Summary
  6. Chapter 4: Embedding Models
    1. Technical requirements
    2. What is an embedding model?
      1. How do embedding models differ from LLMs?
      2. When to use embedding models versus LLMs
      3. Types of embedding models
    3. Choosing embedding models
      1. Task requirements
      2. Dataset characteristics
      3. Computational resources
      4. Vector representations
      5. Embedding model leaderboards
      6. Embedding models overview
      7. Do you always need an embedding model?
      8. Executing code from LangChain
    4. Best practices
    5. Summary
  7. Chapter 5: Vector Databases
    1. Technical requirements
    2. What is a vector embedding?
      1. Vector similarity
      2. Exact versus approximate search
      3. Measuring search
    3. Graph connectivity
      1. Navigable small worlds
      2. How to search a navigable small world
      3. Hierarchical navigable small worlds
    4. The need for vector databases
      1. How vector search enhances AI models
    5. Case studies and real-world applications
      1. Okta – natural language access request (semantic search)
      2. One AI – language-based AI (RAG over business data)
      3. Novo Nordisk – automatic clinical study generation (advanced RAG/RPA)
    6. Vector search best practices
      1. Data modeling
      2. Deployment
    7. Summary
  8. Chapter 6: AI/ML Application Design
    1. Technical requirements
    2. Data modeling
      1. Enriching data with embeddings
      2. Considering search use cases
    3. Data storage
      1. Determining the type of database cluster
      2. Determining IOPS
      3. Determining RAM
      4. Final cluster configuration
      5. Performance and availability versus cost
    4. Data flow
      1. Handling static data sources
      2. Storing operational data enriched with vector embeddings
    5. Freshness and retention
      1. Real-time updates
      2. Data lifecycle
      3. Adopting new embedding models
    6. Security and RBAC
    7. Best practices for AI/ML application design
    8. Summary
  9. Part 2: Building Your Python Application: Frameworks, Libraries, APIs, and Vector Search
  10. Chapter 7: Useful Frameworks, Libraries, and APIs
    1. Technical requirements
    2. Python for AI/ML
    3. AI/ML frameworks
      1. LangChain
      2. LangChain semantic search with score
      3. Semantic search with pre-filtering
      4. Implementing a basic RAG solution with LangChain
      5. LangChain prompt templates and chains
    4. Key Python libraries
      1. pandas
      2. PyMongoArrow
      3. PyTorch
    5. AI/ML APIs
      1. OpenAI API
      2. Hugging Face
    6. Summary
  11. Chapter 8: Implementing Vector Search in AI Applications
    1. Technical requirements
    2. Information retrieval with MongoDB Atlas Vector Search
      1. Vector search tutorial in Python
      2. Vector Search tutorial with LangChain
    3. Building RAG architecture systems
      1. Chunking or document-splitting strategies
      2. Simple RAG
      3. Advanced RAG
    4. Summary
  12. Part 3: Optimizing AI Applications: Scaling, Fine-Tuning, Troubleshooting, Monitoring, and Analytics
  13. Chapter 9: LLM Output Evaluation
    1. Technical requirements
    2. What is LLM evaluation?
      1. Component and end-to-end evaluations
    3. Model benchmarking
      1. Evaluation datasets
      2. Defining a baseline
      3. User feedback
      4. Synthetic data
    4. Evaluation metrics
      1. Assertion-based metrics
      2. Statistical metrics
      3. LLM-as-a-judge evaluations
      4. RAG metrics
      5. Human review
      6. Evaluations as guardrails
    5. Summary
  14. Chapter 10: Refining the Semantic Data Model to Improve Accuracy
    1. Technical requirements
    2. Embeddings
      1. Experimenting with different embedding models
      2. Fine-tuning embedding models
    3. Embedding metadata
      1. Formatting metadata
      2. Including static metadata
      3. Extracting metadata programmatically
      4. Generating metadata with LLMs
      5. Including metadata with query embedding and ingested content embeddings
    4. Optimizing retrieval-augmented generation
      1. Query mutation
      2. Extracting query metadata for pre-filtering
      3. Formatting ingested data
      4. Advanced retrieval systems
    5. Summary
  15. Chapter 11: Common Failures of Generative AI
    1. Technical requirements
    2. Hallucinations
      1. Causes of hallucinations
      2. Implications of hallucinations
    3. Sycophancy
      1. Causes of sycophancy
      2. Implications of sycophancy
    4. Data leakage
      1. Causes of data leakage
      2. Implications of data leakage
    5. Cost
      1. Types of costs
      2. Tokens
    6. Performance issues in generative AI applications
      1. Computational load
      2. Model serving strategies
      3. High I/O operations
    7. Summary
  16. Chapter 12: Correcting and Optimizing Your Generative AI Application
    1. Technical requirements
    2. Baselining
      1. Training and evaluation datasets
      2. Few-shot prompting
      3. Retrieval and reranking
      4. Late interaction strategies
      5. Query rewriting
    3. Testing and red teaming
      1. Testing
      2. Red teaming
    4. Information post-processing
    5. Other remedies
    6. Summary
  17. Appendix: Further Reading
  18. Index
    1. Why subscribe?
  19. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Download a free PDF copy of this book

Product information

  • Title: Building AI Intensive Python Applications
  • Author(s): Rachelle Palmer, Ben Perlmutter, Ashwin Gangadhar, Nicholas Larew, Sigfrido Narváez, Thomas Rueckstiess, Henry Weller, Richmond Alake, Shubham Ranjan
  • Release date: September 2024
  • Publisher(s): Packt Publishing
  • ISBN: 9781836207252