Using LXML, XPath, and CSS Selectors

So far, we have learned about web-development technologies, data-finding techniques, and how to access web content using the Python programming language.

Web-based content is organized into parts, or elements, defined by the document's markup. Analyzing these elements for patterns is a major part of scraping effectively. Elements can be located with XPath expressions and CSS selectors, which the scraping logic then uses to extract the required content. We will use lxml to process elements inside markup documents, and browser-based development tools to read content and identify elements.

In this chapter, we will learn the following:

  • Introduction to XPath and CSS selectors
  • Using browser ...
