Errata

Web Scraping with Python

Errata for Web Scraping with Python, Third Edition


The errata list records errors, and their corrections, that were found after the product was released. If an error was corrected in a later version or reprint, the date of the correction is shown in the column titled "Date corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.


Version Location Description Submitted By Date submitted Date corrected
ePub
Page Regular Expressions and BeautifulSoup
re.compile('..\/img\/gifts/img.*.jpg')

The re.compile pattern should be entered as a raw string.
As written, it produces the following warnings:
>>>
<>:7: SyntaxWarning: invalid escape sequence '\.'
<>:7: SyntaxWarning: invalid escape sequence '\.'
<<<
Adding the lowercase r prefix, as in the following, eliminates the warnings:
>>>
re.compile(r'\.\.\/img\/gifts/img.*\.jpg')
<<<

In addition, the forward slash has no special meaning in a Python regular expression, so \/ can be written simply as /.

I am using Python 3.12.1.
Thank you for your consideration.
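
The effect of the r prefix can be checked with a short sketch. It is illustrative only: the sample src values and the helper name literal_warnings are invented here and are not from the book, and the warning category depends on the Python version (SyntaxWarning in 3.12, DeprecationWarning in earlier 3.x releases).

```python
import re
import warnings

# Raw-string form of the pattern, as given in the submitted correction.
pattern = re.compile(r'\.\.\/img\/gifts/img.*\.jpg')

# The same pattern without the escaped slashes; "/" has no special
# meaning in a Python regular expression, so it matches the same text.
simplified = re.compile(r'\.\./img/gifts/img.*\.jpg')

# Hypothetical src attribute values, invented for illustration.
samples = [
    '../img/gifts/img1.jpg',
    '../img/gifts/logo.jpg',
    '../img/gifts/img3_jpg',
]

matches = [s for s in samples if pattern.search(s)]
simplified_matches = [s for s in samples if simplified.search(s)]
print(matches)


def literal_warnings(source):
    """Compile a string literal given as source text and return the
    names of any warning categories raised while compiling it."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        compile(source, '<sketch>', 'eval')
    return [w.category.__name__ for w in caught]


# Source text of the literal without and with the r prefix.
non_raw_source = r"'\.\.\/img\/gifts/img.*\.jpg'"
raw_source = r"r'\.\.\/img\/gifts/img.*\.jpg'"

print(literal_warnings(non_raw_source))  # at least one warning expected
print(literal_warnings(raw_source))      # no warnings expected
```

Only the first sample contains both the img prefix after gifts/ and a literal .jpg, so it is the only one either pattern matches.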

Note from the Author or Editor:
On page 67 of the PDF/print book, change the 7th line from:

{'src':re.compile('..\/img\/gifts/img.*.jpg')})

to
{'src':re.compile(r'..\/img\/gifts/img.*.jpg')})

Chris Clark  Jul 21, 2024  Nov 22, 2024
PDF
Page 43
middle paragraph

The following GitHub link is not available:

github.com/REMitchell/python-scraping/blob/master/Chapter01_BeginningToScrape.ipynb

Note from the Author or Editor:
Link should be github.com/REMitchell/python-scraping/blob/master/Chapter04_FirstWebScraper.ipynb

Sophia  Oct 18, 2024  Nov 22, 2024
PDF
Page 131
3rd paragraph

Should "(where id equals 2)" be "WHERE id = 2"?

Note from the Author or Editor:
Yes, the code string inside the parentheses should change as described.
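
As a minimal sketch of what the clause WHERE id = 2 selects, the following uses Python's built-in sqlite3 module in place of the book's database setup. The table name pages, its columns, and the rows are assumptions made up for this illustration.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(':memory:')

# Hypothetical table and rows, invented for illustration.
conn.execute('CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)')
conn.executemany(
    'INSERT INTO pages (id, title) VALUES (?, ?)',
    [(1, 'First page'), (2, 'Second page'), (3, 'Third page')],
)

# WHERE id = 2 keeps only the row whose id column equals 2.
rows = conn.execute('SELECT id, title FROM pages WHERE id = 2').fetchall()
print(rows)

conn.close()
```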

Sophia  Oct 18, 2024  Nov 22, 2024
Printed, PDF
Page 246
Code Part

A hash sign "#" must be placed before "Wait for preview reader to load" so that the line is a comment. I also wonder why it is formatted like Python code.

Mohammed Kamal Alsyd  Sep 29, 2024  Nov 22, 2024
PDF
Page 251
step 3.

Should "Add a new file" be "Add files"?

Sophia  Oct 18, 2024  Nov 22, 2024
PDF
Page 253
middle paragraph

The phrase "involves many steps" is duplicated.

Sophia  Oct 18, 2024  Nov 22, 2024
PDF
Page 294
2nd paragraph

Should "The Python Processing" be "multiprocessing"? (Keep code font.)

Note from the Author or Editor:
Change to "The Python processing and multiprocessing modules create..."

Set processing and multiprocessing in code font.

Sophia  Oct 18, 2024  Nov 22, 2024
PDF
Page 316
final paragraph

In "setting the source back to universal and using the Web Scraper API credentials", should source and universal be in code font?

Note from the Author or Editor:
The word universal should be in code font, but source should not be.

Sophia  Oct 18, 2024  Nov 22, 2024