Errata

Errata for Web Scraping with Python

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version	Location	Description	Submitted by	Date Submitted
Printed	Page 178 paragraph begins... FreeGeoIP	URL is http://freegeoip.net but should be http://freegeoip.app	Kevin Brown	May 01, 2021
Printed	Page 88 Code example, line 8 from top of page	The code is printed as follows: try: for row in rows: csvRow = [] for cell in row.findAll(['td', 'th']): csvRow.append(cell.get_text()) writer.writerow(csvRow) # this is the line with the error As printed, writer.writerow(csvRow) is indented inside the inner for loop, so the code writes csvRow to the CSV file once for every cell found in the document. This results in a single row being written to the CSV file 11 times, each time with one additional cell of information appended to the row. The code should be formatted as follows: try: for row in rows: csvRow = [] for cell in row.findAll(['td', 'th']): csvRow.append(cell.get_text()) writer.writerow(csvRow) # this is the corrected line In this version the 6th line has been dedented ("untabbed") so that it runs only once the inner for loop has concluded. Not a big issue by any means, but it did cost me about 5 minutes of debugging when using that example as a basis for my own scraper!	Anonymous	Apr 12, 2021
Other Digital Version	1839 middle part	Hi :), I think Product and Article should inherit from "Webpage" instead of Website, since they are initialized using the data from "Webpage": class Webpage: def __init__(self, name, url, titleTag): self.name = name self.url = url self.titleTag = titleTag class Product(Website): def __init__(self, name, url, titleTag, productNumberTag, priceTag): Website.__init__(self, name, url, TitleTag) self.productNumberTag = productNumberTag self.priceTag = priceTag class Article(Website): def __init__(self, name, url, titleTag, bodyTag, dateTag): Website.__init__(self, name, url, titleTag) self.bodyTag = bodyTag self.dateTag = dateTag Note: I bought the Kindle version.	Daniel de Jesús Rosas Pérez	Feb 13, 2021
ePub	Page 1514 websites list	Two of the CSS selectors no longer apply. Also, the first URL redirected me to a different page, so I rewrote my CSS selector for the new page. I bought the Kindle version of the book, so I cannot tell exactly which page I am on; I am at position 1514. To search by content, this is the passage where I found the issue: websites = [] for ... While this new method might not seem remarkably simpler than writing a new Python function for each new website at first glance, imagine what happens when you go from a system with 4 website sources to a system with 20 or 200 sources. Each list of strings is relatively easy to write. It doesn’t take up much space. It can be loaded from	Daniel de Jesús Rosas Pérez	Feb 11, 2021
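
The page 88 report in the table turns entirely on indentation, which a single table row cannot show. The following is a minimal sketch of the corrected loop, not the book's code: to stay dependency-free it assumes each row is a plain list of strings standing in for the BeautifulSoup cells, and the names write_rows and the sample data are hypothetical. The only point it illustrates is that writer.writerow(csvRow) belongs at the indentation level of the outer loop.

```python
import csv
import io

# Hypothetical stand-in data: each row is a list of cell strings. In the
# reported example the cells come from row.findAll(['td', 'th']) and
# cell.get_text() on BeautifulSoup tags.
rows = [
    ["Name", "Year"],
    ["Alpha", "2001"],
    ["Beta", "2002"],
]


def write_rows(rows, out):
    """Write one CSV line per table row to the file-like object `out`."""
    writer = csv.writer(out)
    for row in rows:
        csvRow = []
        for cell in row:
            csvRow.append(cell)
        # Dedented to the outer loop: this runs once per row, after every
        # cell has been appended, rather than once per cell.
        writer.writerow(csvRow)


buffer = io.StringIO()
write_rows(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

With writerow indented one level deeper, the same input would instead produce a growing partial row after every cell, which matches the behavior the submitter describes.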
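
The inheritance report at position 1839 can be sketched the same way. This follows the submitter's suggestion rather than any confirmed correction from the author: Product and Article subclass Webpage, the class whose constructor they actually call. It also passes titleTag in lowercase, since TitleTag as quoted in the report is not a defined name. The sample names, URLs, and tag values are hypothetical.

```python
class Webpage:
    """Common attributes shared by every kind of scraped page."""

    def __init__(self, name, url, titleTag):
        self.name = name
        self.url = url
        self.titleTag = titleTag


class Product(Webpage):
    """A product page: adds selectors for product number and price."""

    def __init__(self, name, url, titleTag, productNumberTag, priceTag):
        # Initialize the shared fields through the actual parent class.
        Webpage.__init__(self, name, url, titleTag)
        self.productNumberTag = productNumberTag
        self.priceTag = priceTag


class Article(Webpage):
    """An article page: adds selectors for body text and date."""

    def __init__(self, name, url, titleTag, bodyTag, dateTag):
        Webpage.__init__(self, name, url, titleTag)
        self.bodyTag = bodyTag
        self.dateTag = dateTag


# Hypothetical usage with placeholder values.
product = Product("Example Shop", "http://example.com/item", "h1",
                  "span.sku", "span.price")
article = Article("Example News", "http://example.com/story", "h1",
                  "div.body", "time")
print(product.titleTag, article.dateTag)
```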