Semantic Search with LLMs

Published by Pearson

Content level: Intermediate to advanced

Use LLMs and vector databases for context-aware searches with natural language queries

  • Get practical experience in semantic search algorithms
  • Go beyond theoretical knowledge to apply actionable skills to your projects
  • Learn to integrate state-of-the-art NLP frameworks, such as Hugging Face Transformers and vector search databases, for semantic search in real-world applications
  • Master complex concepts like quantization through this course designed to make advanced topics accessible and straightforward

Semantic search goes beyond traditional keyword-based queries to understand the context and intent behind a search, providing more relevant and accurate results. This course arms you with the tools and techniques to implement semantic search in diverse applications, from recommendation systems to query-answering systems. As the volume of digital data continues to grow, mastering semantic search will not only make your search algorithms more effective but also provide substantial benefits in data analytics and user engagement.

This course extends the utility of LLMs by giving them access to a vector database of text and metadata. A vector database stores text as numerical vectors (embeddings) produced by an embedding model, so that a search can compare meaning and context rather than exact keywords. The curriculum starts with the basics of transitioning from traditional keyword-based search algorithms to semantic search methods. It then moves into practical exercises in state-of-the-art techniques for optimizing speed, cost, and accuracy. In these hands-on modules, we’ll work through real vector search challenges so that you can apply what you’ve learned to your own applications.
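
As a rough illustration of searching in a numerical space, the minimal sketch below ranks a few stored vectors by cosine similarity to a query vector using NumPy. The document strings, the 4-dimensional vectors, and the function name cosine_top_k are hypothetical placeholders rather than course material. In practice the vectors would come from an embedding model, and a vector database would replace the brute-force comparison with an index.

```python
import numpy as np

# Hypothetical toy data: hand-written 4-dimensional "embeddings".
# A real system would obtain these from an embedding model.
documents = ["reset my password", "update billing address", "forgot login credentials"]
doc_vectors = np.array([
    [0.9, 0.1, 0.0, 0.2],
    [0.1, 0.8, 0.3, 0.0],
    [0.8, 0.2, 0.1, 0.3],
])


def cosine_top_k(query_vector, vectors, k=2):
    """Return the indices and scores of the k vectors closest to the query by cosine similarity."""
    q = query_vector / np.linalg.norm(query_vector)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q                    # cosine similarity of each row with the query
    top = np.argsort(-scores)[:k]     # highest scores first
    return top, scores[top]


# Hypothetical query embedding, chosen to sit near the two account-access documents.
query_vector = np.array([0.85, 0.15, 0.05, 0.25])
indices, scores = cosine_top_k(query_vector, doc_vectors, k=2)
for i, s in zip(indices, scores):
    print(f"{documents[i]!r}: {s:.3f}")
```

The two account-access documents rank above the billing document even though they share no keywords with each other, which is the behavior that semantic search aims for.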

What you’ll learn and how you can apply it

By the end of the live online course, you’ll understand:

  • Techniques for generating vectors from text
  • Best practices for using semantic search algorithms
  • Practical applications of vector search to solve real-world problems

And you’ll be able to:

  • Explain how vector search algorithms such as HNSW work
  • Utilize vector databases such as Qdrant, Elasticsearch, and LanceDB
  • Apply semantic search to real-world search challenges

This live event is for you because...

  • You work with LLMs but want to expand their capabilities by letting them access a database of text information
  • You want to upgrade a keyword-based search to semantic search but you’re not sure how
  • You’re curious about the state-of-the-art techniques to employ semantic search while optimizing speed, cost, and accuracy

Prerequisites

  • A working understanding of the foundational principles of deep learning and the basics of LLMs
  • Some prior experience using PyTorch or NumPy
  • Experience with Python

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Segment 1: Introduction to Large Language Models (45 minutes)

  • Overview of pre-trained language models and their applications
  • Explanation of key libraries
  • History of embedding creation and intro to context-aware embeddings
  • Overview of popular vector search databases
  • Q&A
  • Break

Segment 2: How Semantic Search Works (45 minutes)

  • Locality Sensitive Hashing (LSH)
  • Hierarchical Navigable Small World (HNSW)
  • Asymmetric search
  • Q&A
  • Break

Segment 3: Optimize Semantic Search Accuracy, Speed, and Cost (60 minutes)

  • Mixed precision
  • Quantization
  • Reranking retrieved results with a cross-encoder
  • Generative AI
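
To give a sense of the quantization topic, here is a minimal sketch of symmetric scalar quantization, one common way to shrink float32 vectors to int8. It is an assumption about the general technique, not the method taught in the course. The function names quantize_int8 and dequantize are hypothetical, and the vectors are random stand-ins for real embeddings.

```python
import numpy as np


def quantize_int8(vectors):
    """Symmetric scalar quantization: map float values onto int8 using one global scale.

    Assumes the input contains at least one non-zero value, so the scale is not zero.
    """
    scale = np.abs(vectors).max() / 127.0
    quantized = np.round(vectors / scale).astype(np.int8)
    return quantized, scale


def dequantize(quantized, scale):
    """Approximately recover the original float32 values."""
    return quantized.astype(np.float32) * scale


# Random stand-ins for real embeddings (hypothetical data).
rng = np.random.default_rng(0)
vectors = rng.normal(size=(1000, 64)).astype(np.float32)

quantized, scale = quantize_int8(vectors)
restored = dequantize(quantized, scale)
max_error = np.abs(vectors - restored).max()

print(f"float32 bytes: {vectors.nbytes}, int8 bytes: {quantized.nbytes}")
print(f"largest rounding error: {max_error:.5f} (scale: {scale:.5f})")
```

Storage drops by a factor of four, and each value moves by at most about half of the scale. Whether that loss of precision is acceptable for search accuracy depends on the data and needs to be measured.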

Segment 4: Summary and Q&A (10 minutes)

  • Summary of key takeaways from the class
  • Open Q&A session

Your Instructor

  • Shaan Khosla

    Shaan Khosla is a Senior Data Scientist at Nebula, where he researches, designs, and develops NLP models. He previously worked at Bank of America on an internal machine learning consulting team, where he used LLMs to build proof-of-concept systems for various lines of business. Shaan holds a BSBA in Computer Science and Finance from the University of Miami and is currently completing a master’s degree in Data Science at NYU. He has published multiple peer-reviewed papers applying LLMs, topic modeling, and recommendation systems to the fields of biochemistry and healthcare.