Errata for Blueprints for Text Analytics Using Python

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version Location Description Submitted by Date submitted
ePub Page Chapter 8
Entire chapter

Only the first 5 paragraphs of the chapter are available, and it is not possible to navigate to other sections of the chapter.

Anonymous  Oct 17, 2023 
Printed Page 7
Last paragraph

On page 7, in the section "Plotting Value Distributions", the authors show a boxplot with outliers on the right-hand side and proceed to claim that "The distribution is clearly left-skewed".

The distribution is actually right-skewed, which is confirmed by the histogram on page 8.

Walter Ullon  Mar 11, 2022 
Printed Page 7
third line from bottom

The text reads, "The distribution is obviously left-skewed."
However, the distribution is in fact right-skewed (the long tail is on the right, so the mean lies to the right of the median).

Armin R. Mikler  Apr 25, 2023 
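
The direction of skew discussed in the two entries above can be checked numerically. The following is a minimal sketch using only the standard library and a small made-up sample (the values are hypothetical and are not the book's data). A long right tail usually pulls the mean above the median, although this is a rule of thumb rather than a definition of skewness.

```python
from statistics import mean, median

# Hypothetical sample with one large value on the right (not the book's data).
lengths = [1, 2, 2, 3, 3, 3, 4, 20]

sample_mean = mean(lengths)      # 38 / 8 = 4.75
sample_median = median(lengths)  # average of the two middle values: 3.0

# Rule of thumb: a mean above the median suggests a right (positive) skew.
skew_direction = "right" if sample_mean > sample_median else "left or none"
print(sample_mean, sample_median, skew_direction)
```

On a boxplot, the same sample would show its outlier on the right-hand side, matching the description in the first entry.
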
Printed Page 11
4th paragraph

Hi,

I am using your book "Blueprints for Text Analytics Using Python" for programming (which is really great) and I have a question regarding chapter 1, "Blueprint: Building a Simple Text Preprocessing Pipeline".

When I run the following code, I get an error:

[Attached image: PastedGraphic-1.png]

The error is: "bad escape \p at position 6".

This error is new. I ran the code a few times in the past without any problem. When I ran it again (without any changes), it no longer worked. I use a Kaggle notebook for programming and I didn't change the environment; I still use my original environment (2021-07-07).

I would be very glad if you have any advice!

Thanks in advance and kind regards

Christine

Christine Bach  Sep 08, 2021 
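
The attached code is not reproduced in this entry, so the cause can only be guessed. One plausible explanation, stated here as an assumption, is that the pattern uses the Unicode property escape `\p{L}`, which Python's built-in `re` module rejects while the third-party `regex` package accepts it. The sketch below uses an assumed pattern (not taken from the screenshot) to show the failure with the standard library, together with an approximate standard-library fallback.

```python
import re  # standard library; does not support \p{...} property escapes

# Assumed pattern (the screenshot is unavailable): a token with at least one letter.
PROPERTY_PATTERN = r'[\w-]*\p{L}[\w-]*'


def compile_error(pattern):
    """Return the error message from re.compile, or None if the pattern compiles."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return str(exc)
    return None


# Standard-library approximation: [^\W\d_] matches word characters that are
# neither decimal digits nor the underscore. This is close to \p{L} but not identical.
FALLBACK_PATTERN = r'[\w-]*[^\W\d_][\w-]*'


def tokenize(text):
    """Split text into tokens that contain at least one letter-like character."""
    return re.findall(FALLBACK_PATTERN, text)


print(compile_error(PROPERTY_PATTERN))
print(tokenize("Let's defeat SARS-CoV-2 together in 2020!"))
```

If the `regex` package is installed in the notebook environment, importing it in place of `re` (for example, `import regex as re`) should let a pattern containing `\p{L}` compile unchanged.
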
Printed Page 114, Blueprint: Extracting Noun Phrases
Code in top panel

There's an error in the code. The call
spans = textacy.extract.matches(doc, patterns=patterns)
raises an error.

It should be:
spans = textacy.extract.matches.token_matches(doc, patterns=patterns)

Arunima Choudhury  Aug 01, 2022 
Printed Page 255
Figure 9-2

Figure 9-2 which shows one iteration of the PageRank algorithm, has incorrect entries in the output vector.

The text shows: [0.5, 0, 0, 1.5, 0.5, 1.5], when it should be [0.5, 0, 0, 1, 1, 1.5]. It appears that the dot product was miscalculated.

Walter Ullon  May 26, 2022 
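
The figure is not reproduced in this entry, so the reported vectors cannot be re-derived here. A check of this kind comes down to a single matrix-vector product. The sketch below illustrates that step on a hypothetical three-node graph (not the graph from Figure 9-2), using a simplified update without a damping factor.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector: one dot product per row."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Hypothetical graph: A -> B, A -> C, B -> C, C -> A.
# Column j distributes the score of node j evenly over its out-links.
transition = [
    [0.0, 0.0, 1.0],  # into A: all of C
    [0.5, 0.0, 0.0],  # into B: half of A
    [0.5, 1.0, 0.0],  # into C: half of A, all of B
]
scores = [1.0, 1.0, 1.0]

updated = matvec(transition, scores)
print(updated)

# With a column-stochastic matrix the total score is preserved,
# which gives a quick sanity check on a hand-computed iteration.
print(sum(updated) == sum(scores))
```
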
Printed Page 272
3rd paragraph

The analogy v(woman) + [v(king) - v(man)] is explained in the text as "what is to king like woman is to man".

I believe the analogy above should actually read: "what is to woman like king is to man" (for which the answer is "queen"). This is the canonical example for word2vec embeddings as shown in various sources, including the one cited by the authors.

Walter Ullon  May 26, 2022
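
The vector arithmetic behind the analogy can be illustrated with toy data. The sketch below uses hypothetical hand-made 2-D vectors (learned embeddings have many more dimensions and are not this clean) and looks up the word closest to v(woman) + [v(king) - v(man)] by cosine similarity.

```python
import math

# Hypothetical 2-D vectors: (royalty, gender). Not learned embeddings.
vectors = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "boy": (-0.2, 1.0),
    "girl": (-0.2, -1.0),
}


def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


# v(woman) + [v(king) - v(man)]
target = tuple(
    w + k - m
    for w, k, m in zip(vectors["woman"], vectors["king"], vectors["man"])
)

# Exclude the three query words, as analogy lookups commonly do.
candidates = {w: v for w, v in vectors.items() if w not in {"woman", "king", "man"}}
answer = max(candidates, key=lambda w: cosine(candidates[w], target))
print(target, answer)
```

With these toy vectors the nearest remaining word is "queen", which is consistent with reading the expression as "what is to woman like king is to man".
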