Errata for Essential Math for Data Science


The errata list is a list of errors and their corrections that were found after the product was released. If the error was corrected in a later version or reprint, the date of the correction will be displayed in the column titled "Date Corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.


Version Location Description Submitted By Date submitted Date corrected
Printed
Page page 81
equation of PDF

In the equation of PDF, the square root of (2*pi) should be in the denominator.

Note from the Author or Editor:
Yes, this needs to be fixed. Correct formula:
www.gstatic.com/education/formulas2/553212783/en/normal_distribution.svg
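
For reference, the corrected density written out (the same form appears in the ePub erratum below):

f(x;\mu,\sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}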

Siqi Li  Jul 06, 2022  Jun 13, 2024
Page Chapter 2
Question 5

In question 5 "You flipped a coin 19 times and got heads 15 times and tails 4 times." In appendix B question 5 "from scipy.stats import beta

heads = 8
tails = 2

p = 1.0 - beta.cdf(.5, heads, tails)

print(p) # 0.98046875". heads should be 15 and tails 4

Note from the Author or Editor:
Change the question to "You flipped a coin 10 times and got heads 8 times and tails 2 times.

Do you think this coin has any good probability of being fair? Why or why not?"

Mark Oliver  Jul 08, 2022  Jun 13, 2024
Printed
Page Chapter 1, p38, sidebar
3rd paragraph

Riemann Sums are spelled incorrectly as Reimann Sums

Rolf Würdemann  Nov 30, 2022  Jun 13, 2024
ePub
Page Chapter 3 The probability density function
1st paragraph

I looked up the formula and it should be (including some HTML for the formatting):

<math display="block">
f(x;\mu,\sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} e^{ -\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2 }.
</math>

This formula shows that the 2-Pi should be in the denominator.

Linda Pescatore  Jan 01, 2023  Jun 13, 2024
ePub
Page Confidence intervals
Second paragraph and Figure 3.15

This section is about finding the 95% confidence level for the sample mean of 64.408 being in line with the population mean. The mean should be the center of the plot of the normal distribution, in Figure 3.15. However, the figure shows the mean as 18. I don't know where this number came from.

Note from the Author or Editor:
Yes, this needs to be fixed. Please use this image:
imgur.com/a/4JPvv1k

Linda Pescatore  Jan 02, 2023  Jun 13, 2024
Other Digital Version
Chapter 2 - Conditional Probability and Bayes's Theorem
Equation 8

When applying the Bayes Theorem for the coffee/cancer problem, the P(Coffee) should be in the denominator for P(Cancer|Coffee). It is currently multiplying the numerator: P(Coffee|Cancer)*P(Coffee). The corrected version of the numerator should be P(Coffee|Cancer)*P(Cancer).

Note from the Author or Editor:
Yes, this needs to be fixed. Math latex is:
P(\text{Cancer|Coffee}) = \frac{P(\text{Coffee|Cancer})*P(\text{Cancer})}{P(\text{Coffee})}
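
A quick Python check of the corrected form, using illustrative placeholder probabilities rather than the book's exact figures:

# illustrative values, not necessarily the book's
p_coffee_given_cancer = 0.85
p_cancer = 0.005
p_coffee = 0.65

# corrected numerator: P(Coffee|Cancer) * P(Cancer), divided by P(Coffee)
p_cancer_given_coffee = (p_coffee_given_cancer * p_cancer) / p_coffee
print(p_cancer_given_coffee)  # ~0.0065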

Tales Ishida  Apr 18, 2023  Jun 13, 2024
PDF
Page Chapter 5, page# 181
1st paragraph

The square root is missing from the "Standard error of estimate" formula.

Note from the Author or Editor:
Yes, this needs to be fixed. Math latex is:
S_e = \sqrt{\frac{\sum{(y-\hat{y})^2}}{n-2}}
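
A minimal NumPy sketch of the corrected formula, using made-up residuals (not the book's data):

import numpy as np

# made-up data, not from the book
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_hat = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

n = len(y)
# standard error of estimate: sqrt(sum((y - y_hat)^2) / (n - 2))
s_e = np.sqrt(np.sum((y - y_hat) ** 2) / (n - 2))
print(s_e)  # ~0.191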

Muhammad Umar Amanat  May 04, 2023  Jun 13, 2024
Printed
Page Page # 31
Example: 1 - 24

A line is missing in the book, due to which the error "x is not defined" appears. I added the line
x = symbols('x') and it worked.

I am talking about
Book: Essential Math for Data Science by Thomas Nield
Page # 31
Example: 1 - 24

Note from the Author or Editor:
Yes, this needs to be fixed. Code is:

from sympy import *

x = symbols('x')

z = (x**2 + 1)**3 - 2
dz_dx = diff(z, x)
print(dz_dx)

# 6*x*(x**2 + 1)**2

Nanda  May 19, 2023  Jun 13, 2024
Printed
Page p96-102
on going through the examples

Both in PDF and ePub.
This is more of a serious mathematical error/omission in hypothesis testing.
p98 should be "H0: population mean =18 ".
More importantly, in all the following Python codes and the examples, the standard deviation should have been divided by the square root of the sample size,
e.g p98 Example 3-19 should be corrected to
std_dev=1.5/sqrt(40).
The same applies to Examples 3-20,3-21,3-22.
The probabilities will be different and the conclusions of the tests as well.
Otherwise, really love the book and the examples provided.
thank you.

Note from the Author or Editor:
Initially this felt a little pedantic but I understand where they are coming from, considering I said "Past studies have shown that the mean recovery time for a cold is 18 days, with a standard deviation of 1.5 days, and follows a normal distribution." I think we might be able to fix this by saying "The population of people with a cold has a mean recovery time of 18 days, with a standard deviation of 1.5 days, and follows a normal distribution."

I think that will make examples work from that point forward.

stackoverflow.com/questions/48554161/why-divide-sample-standard-deviation-by-sqrtsample-size-when-calculating-z-sco

www.youtube.com/watch?v=JQc3yx0-Q9E
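
The adjustment the submitter describes, as a small sketch (sigma = 1.5 and n = 40 are taken from the erratum; nothing else is from the book):

from math import sqrt

sigma = 1.5   # population standard deviation from the example
n = 40        # sample size from the example

# standard error of the mean: sigma / sqrt(n)
std_dev = sigma / sqrt(n)
print(std_dev)  # ~0.237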

larisa Seward  Dec 03, 2023  Jun 13, 2024
Printed
Page 3
Last paragraph

The order of operations paragraph omits the critical, but often forgotten, "from left to right." PEMDAS, while helpful, is only correct if one applies multiplication and division from left to right, and addition and subtraction from left to right. Omitting this renders the mnemonic unstable. The example on the following page says "The ordering of these two is swappable," which is technically incorrect; i.e., if you divide 25 by 5, you get 5, and if you multiply that by 2, you get 10, and if you subtract 4, you get the wrong answer. PEMDAS only works when multiplication and division are executed from left to right.

Note from the Author or Editor:
Sure.

Follow the sentence "As a brief refresher, recall that you evaluate components in parentheses, followed by exponents, then multiplication, division, addition, and subtraction." with another sentence: "After that order, operations are then performed left-to-right." We can also remove the sentence "The ordering of these two is swappable since division is also multiplication (using fractions)."

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 9, 10
p9: Headline of example; p10: description of graph

In the headline of Example 1-8 and also in the description of the graph presented in Figure 1-3, the function x^2 + 1 is denoted as an "exponential" function. To my knowledge, an exponential function has the form b^x, e.g., exp(x), which can easily be generalized, while x^2 + 1 is a polynomial (function), in this case a polynomial of second degree, also called a quadratic function.

Note from the Author or Editor:
Might be a little pedantic, but let's just say "function involving an exponent" rather than "exponential function."

Rolf Würdemann  Nov 30, 2022  Jun 13, 2024
Printed
Page 33
Integrals, 2nd paragraph

The Riemann Sum is spelled incorrectly as Reimann Sum.

Christoph Jätz  Oct 02, 2022  Jun 13, 2024
ePub
Page 41
3rd paragraph

In reference to using probability with data and statistics, the last sentence of the third paragraph says "We will cover that in Chapter 4 on statistics and hypothesis testing." But hypothesis testing seems to appear in Chapter 3, around page 96.

Note from the Author or Editor:
Correct, it is addressed in Chapter 3 not Chapter 4.

Austin Smith  Oct 04, 2022  Jun 13, 2024
Printed
Page 43
Second paragraph

The sentence "Conversely, you can turn an odds into a probability"... and then an example is shown. The example is turning a probability into odds. Hence, the sentence should be re-written as, "Conversely, you can turn a probability into an odds."

Note from the Author or Editor:
Change exactly as advised.

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 53
The aside with bird

The sentence: "Turn to Appendix A to learn how to build the binomial distribution from scratch without scikit-learn." Should read: "Turn to Appendix A to learn how to build the binomial distribution from scratch without SciPy."

Note from the Author or Editor:
Change exactly as advised.

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 53
First paragraph

"We iterate each number of successes x" should read "We iterate each number of successes k."

Note from the Author or Editor:
Change exactly as advised.

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 69
Sidebar

The Straight Dope wasn't a publication of its own. It was a syndicated newspaper column started by the Chicago Reader.

(Imagine there is a link here to the Wikipedia page about it.)

Note from the Author or Editor:
Confirmed. Maybe change to "But then a column in the Chicago Reader, The Straight Dope, posed an important question."

Andy Lester  Nov 14, 2022  Jun 13, 2024
PDF
Page 78
2nd paragraph

Printed: The standard deviation for a sample and mean are specified by s and σ, respectively.

Should be: The standard deviation for a sample and population are specified by s and σ, respectively.

Note from the Author or Editor:
Change exactly as advised.

Stefan Vanli  Aug 24, 2023  Jun 13, 2024
Printed
Page 91
4

I might be wrong about it. From what I understood, the Central Limit Theorem says that the standard deviation of sample means (the sampling standard deviation) is equal to the population standard deviation divided by the square root of the sample size.
The book says "sample standard deviation" instead of "sampling standard deviation".

Note from the Author or Editor:
Haha, man so much semantics. Let's just change that "sample standard deviation" formula to this latex:

\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

Yaroslav Skoryk  Jul 22, 2023  Jun 13, 2024
PDF
Page 95
Inside function code of "def confidence_interval(p, sample_mean, sample_std, n)"

Inside the confidence interval function, lower_ci should be subtracted from sample_mean, but it is added to sample_mean, which is wrong.

The corrected statement is:
return sample_mean - lower_ci, sample_mean + upper_ci

Note from the Author or Editor:
Change exactly as advised.
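
For context, a sketch of the corrected function; the body here is reconstructed around the book's signature and may differ from the printed code:

from math import sqrt
from scipy.stats import norm

def confidence_interval(p, sample_mean, sample_std, n):
    # critical z-value for confidence level p (e.g., 1.96 for p = 0.95)
    z = norm.ppf(p / 2.0 + 0.5)
    # margin of error uses the standard error of the mean
    lower_ci = z * (sample_std / sqrt(n))
    upper_ci = z * (sample_std / sqrt(n))
    # corrected: subtract the lower margin, add the upper margin
    return sample_mean - lower_ci, sample_mean + upper_ci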

Muhammad Umar Amanat  Apr 04, 2023  Jun 13, 2024
Printed
Page 101
Last paragraph

The sentence "Since 16 is 4 days below the mean, we will also capture the area above 20, which is 4 days above the mean." The sentence should read "Since 16 is 2 days below the mean, we will also capture the area above 20, which is 2 days above the mean."

Note from the Author or Editor:
Change exactly as advised.

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 101
bottom paragraph

That paragraph says 16 is 4 days below the mean, but the mean is 18. 16 is 2 days below 18. It makes the same mistake in the other direction, saying 20 is 4 days above the mean. It's two.

Note from the Author or Editor:
Change exactly as advised.

Eric Osborne  Oct 03, 2023  Jun 13, 2024
PDF
Page 110
3rd paragraph

The y value of the vector should be 260000, not 2600000, as the valuation figure used in the example is $260,000.

Note from the Author or Editor:
Change exactly as advised.

Kaushalya Samarasekera  Sep 10, 2022  Jun 13, 2024
Printed
Page 113
Figure 4-3

The graph of the three-dimensional vector could be improved in my opinion, as i, j, k in the image do not correspond to lengths 4, 1, 2. Also, an actual 3D graph would be nicer, since we are talking about 3 dimensions.

Note from the Author or Editor:
I'll revisit this and make a new graphic if we have time

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 114
Figure 4-4

In Figure 4.4, the numerical representations for both vectors include a negative x-value, when in fact the arrows represent positive x-values. The subsequent representations of the same vectors show positive x-values.

Note from the Author or Editor:
Corrected image here:

https://i.imgur.com/sWa5lZ6.png

Anonymous  Nov 28, 2022  Jun 13, 2024
Printed
Page 117
Figure 4-8

The book states 0.5v = [3, 1.5], when it should state 0.5v = [1.5, 0.5].

Note from the Author or Editor:
Corrected image here:

https://i.imgur.com/quLr0jo.png

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 124
First paragraph

In the first paragraph, "Shear" is not described, while all other transforms are described.

Note from the Author or Editor:
At end of first paragraph on page 124, add "A shear is easier to describe visually, but it displaces each point in a fixed direction proportionally to its distance from a given line parallel to that direction."

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 127
Figure 4-17

The i_hat and j_hat values should be the other way around in the right-hand example for it to work as a visualisation of Example 4-9.

Note from the Author or Editor:
Corrected image here:

https://i.imgur.com/njmVoze.png

Anonymous  Mar 08, 2023  Jun 13, 2024
Printed
Page 129
formula in 3rd paragraph under Matrix Multiplication section

In the 2x2 matrix multiplication, the term dy should be dg.

Note from the Author or Editor:
Change the "ce + dy" to "ce + dg"
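
For reference, the standard 2x2 product with the corrected term:

\left[\begin{matrix}a & b\\c & d\end{matrix}\right]\left[\begin{matrix}e & f\\g & h\end{matrix}\right] = \left[\begin{matrix}ae + bg & af + bh\\ce + dg & cf + dh\end{matrix}\right]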

Kirk Damron  Jun 30, 2022  Jun 13, 2024
PDF
Page 130
2nd code snippet

The variable should be named 'sheared' instead of 'sheered'.

Note from the Author or Editor:
Any instances of "sheer" should be "shear"

Kaushalya Samarasekera  Sep 11, 2022  Jun 13, 2024
Printed
Page 135
top graph

values for i_hat and j_hat are swapped. i_hat should be [3,-1.5] and j_hat should be [2, -1].

Note from the Author or Editor:
"Change code in 4-16 to:

from numpy.linalg import det
from numpy import array

i_hat = array([3, -1.5])
j_hat = array([-2, 1])

basis = array([i_hat, j_hat]).transpose()

determinant = det(basis)

print(determinant) # prints 0.0

Eric Osborne  Oct 03, 2023  Jun 13, 2024
Printed
Page 137, 139, 140
Inverse matrix (A^-1)

The inverse matrix needs to have -4/3 in the right center, not 4/3.
This also applies to the repeated depictions of this matrix on pages 139 and 140.

Note from the Author or Editor:
That 4/3 in the two instances on page 137 needs to be -4/3.

Math latex for the first matrix should be:
\left[\begin{matrix}- \frac{1}{2} & 0 & \frac{1}{3}\\\frac{11}{2} & -2 & - \frac{4}{3}\\-2 & 1 & \frac{1}{3}\end{matrix}\right]

That also needs to be carried over into the second equation:
\left[\begin{matrix}- \frac{1}{2} & 0 & \frac{1}{3}\\\frac{11}{2} & -2 & - \frac{4}{3}\\-2 & 1 & \frac{1}{3}\end{matrix}\right]\left[\begin{matrix}4 & 2 & 4\\5 & 3 & 7\\9 & 3 & 6\end{matrix}\right]= \left[\begin{matrix}1 & 0 & 0\\0 & 1 & 0\\0 & 0 & 1\end{matrix}\right]
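
A quick NumPy check (a sketch, not from the book) that the corrected inverse with the -4/3 entry reproduces the identity:

import numpy as np

A = np.array([[4, 2, 4],
              [5, 3, 7],
              [9, 3, 6]])

A_inv = np.array([[-1/2,  0,  1/3],
                  [11/2, -2, -4/3],
                  [-2,    1,  1/3]])

# should print the 3x3 identity matrix
print(np.round(A_inv @ A, 8))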

Rpf  Mar 08, 2023  Jun 13, 2024
Printed
Page 151
Heading at bottom

The heading reads, "Basic Linear Regression with SciPy," when it should read "Basic Linear Regression with Scikit Learn."

Note from the Author or Editor:
Make it "Basic Linear Regression with scikit-learn"

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 160
Top of page

The top of page 160 should include the heading "Matrix Decomposition," as this page parallels the elaboration of techniques initially listed in the 3rd paragraph of page 157: Closed Form, Matrix Inversion, Matrix Decomposition, Gradient Descent.

Note from the Author or Editor:
Let's add a heading to the top of page 160 "Matrix Decomposition" if possible

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 183
formula of the margin of error

'x_0 + X_mean' part should be 'x_0 - X_mean'

Note from the Author or Editor:
change + to -

Siqi Li  Jul 19, 2022  Jun 13, 2024
Printed
Page 198
The minor heading

The heading "Using Scipy" should be "Using Scikit Learn."

Note from the Author or Editor:
Make it "Using scikit-learn"

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 198
Example 6-3

When we turn off the penalty for the logistic regression, we get: FutureWarning: `penalty='none'` has been deprecated in 1.2 and will be removed in 1.4. To keep the past behaviour, set `penalty=None`.

Note from the Author or Editor:
Change code Example 6-3 to:

import pandas as pd
from sklearn.linear_model import LogisticRegression

# Load the data
df = pd.read_csv('https://bit.ly/33ebs2R', delimiter=",")

# Extract input variables (all rows, all columns but last column)
X = df.values[:, :-1]

# Extract output column (all rows, last column)
Y = df.values[:, -1]

# Perform logistic regression
# Turn off penalty
model = LogisticRegression(penalty=None)
model.fit(X, Y)

# print beta1
print(model.coef_.flatten()) # 0.69267212

# print beta0
print(model.intercept_.flatten()) # -3.17576395

Maya  Jul 25, 2023  Jun 13, 2024
Printed
Page 199
Entire page

There are three references to SciPy when these references should be Scikit Learn, as they are different libraries.

Note from the Author or Editor:
Change Scipy to scikit-learn across all three instances

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 200
Bird aside text

The three instances of "Scipy" should be changed to "Scikit Learn."

Note from the Author or Editor:
Change Scipy to scikit-learn across all three instances

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 201
formula of joint likelihood

The second multiplier is missing '1 - '.

Note from the Author or Editor:
That 1.0 - needs to be put in the second multiplier just like on page 202

Siqi Li  Jul 22, 2022  Jun 13, 2024
Printed
Page 213
formula of log likelihood

log is missing in the formula

Note from the Author or Editor:
A log() needs to be wrapped around each of the two multiplied expressions.
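
For reference, a standard form of the log likelihood the note describes (generic notation, not necessarily the book's):

\sum_{i}\left( y_i \log\left(f(x_i)\right) + (1 - y_i)\log\left(1 - f(x_i)\right) \right)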

Siqi Li  Aug 02, 2022  Jun 13, 2024
Printed
Page 214
Example 6-13

The loglikelihood in the R^2 formula should be changed from -0.5596 to -14.341

Note from the Author or Editor:
Replace in the R^2 formula as advised

Also put a newline before that R^2 answer

Maya  Jul 25, 2023  Jun 13, 2024
Printed
Page 217
First formula

The parentheses around the subtraction of log likelihood fit and log likelihood is missing in the formula of the Chi Square value.
Should be: chi_2 = 2 * ( (log likelihood fit) - (log likelihood) )

Note from the Author or Editor:
Wrap parentheses around the whole expression following the "2"
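
A small sketch of the corrected expression in code; only -14.341 comes from the page 214 erratum above, and the other value and the single degree of freedom are illustrative:

from scipy.stats import chi2

log_likelihood_fit = -14.341   # from the page 214 erratum
log_likelihood = -20.0         # illustrative value, not the book's

# parentheses wrap the whole difference before multiplying by 2
chi2_value = 2 * (log_likelihood_fit - log_likelihood)
p_value = chi2.sf(chi2_value, df=1)
print(chi2_value, p_value)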

Julian K.  Mar 13, 2023  Jun 13, 2024
Printed
Page 221
Figure 6-18

Negative predicted value should be TN/(TN+FN) instead of TN/(TP+FN)

Note from the Author or Editor:
This was a typo, so yes this should be corrected

Siqi Li  Aug 02, 2022  Jun 13, 2024
Printed
Page 221
figure 6-18

Author uses both sensitivity and recall in the figure, but fails to point out that they are the same thing.

Note from the Author or Editor:
In the figure, change the Sensitivity label to Sensitivity/Recall

RP  Mar 23, 2023  Jun 13, 2024
Printed
Page 222
End of code / Before Major heading

Author provides code to demonstrate a confusion matrix, but does not include the output of the final print statement (which should print out a confusion matrix), which is surprising, since all previous examples do show the output of the print statements.

Note from the Author or Editor:
In the code, add at the very bottom in a new line this snippet:
"""
[[6 3]
 [4 5]]
"""

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 225
section of Class Imbalance

In the third paragraph, the author mentioned that using the 'stratify' option in scikit-learn can duplicate samples in the minority class until it is equally represented in the dataset. However, that is not the function of a stratified split. Using 'stratify' will retain the same class distribution in the train set and the test set as in the original dataset, not duplicate the samples.

Note from the Author or Editor:
Yup, I misunderstood this parameter as I don't use it often. Change "You can do this in scikit-learn as shown in Example 6-19 when doing your train-test splits. Pass the stratify option with the column containing the class values, and it will attempt to equally represent the data of each class." to "Another technique is shown in Example 6-19 when doing your train-test splits. Pass the stratify option with the column containing the class values, and it will keep the class distribution consistent between the train-test split."
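
A minimal illustration of what the stratify option actually does, with made-up imbalanced labels (nothing here is the book's data):

import numpy as np
from sklearn.model_selection import train_test_split

# made-up imbalanced data: 16 negatives, 4 positives
X = np.arange(20).reshape(-1, 1)
Y = np.array([0] * 16 + [1] * 4)

# stratify=Y keeps the 80/20 class split in both train and test sets;
# it does not duplicate minority-class samples
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.25, stratify=Y, random_state=7)

print(Y_train.mean(), Y_test.mean())  # both 0.2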

Siqi Li  Aug 02, 2022  Jun 13, 2024
Printed
Page 225
section of Class Imbalance

In the second paragraph, the author mentioned that ROC/AUC can be used when classes are imbalanced, which might be misleading. ROC curves should be used when there are roughly equal numbers of observations for each class. The area under the precision-recall curve (PR-AUC) is more suitable for highly imbalanced data than ROC-AUC.

Note from the Author or Editor:
Let's delete "and ROC/AUC curves" from the sentence: "First, you can do obvious things like collect more data or try different models as well as use confusion matrices and ROC/AUC curves."

Siqi Li  Aug 02, 2022  Jun 13, 2024
Printed
Page 236
Figure 7-8

The figure shows a logistic function as the activation function of the output layer, while the neural network is supposed to solve a classification problem with 10 classes (digits 0-9).

Note from the Author or Editor:
Replace that "s-shaped" curve chart on the top-right with this figure: imgur.com/3LuEjJy
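
The corrected figure isn't reproduced here, but the usual choice for a 10-class output layer is a softmax activation; a minimal sketch (softmax is an assumption on this editor's part, not confirmed by the note):

import numpy as np

# softmax turns 10 class scores into probabilities that sum to 1
def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

# illustrative logits for the 10 digit classes
logits = np.array([1.2, 0.3, -0.5, 2.0, 0.0, -1.1, 0.7, 0.1, -0.3, 0.9])
probs = softmax(logits)
print(probs.sum())  # ~1.0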

Yaroslav Skoryk  Aug 07, 2023  Jun 13, 2024
Printed, PDF
Page 242
3rd paragraph

In the 2nd line of the 3rd para it must be "dark (0)" instead of "dark (1)".

Note from the Author or Editor:
Change as advised

frank langenau  May 18, 2023  Jun 13, 2024
Printed
Page 244
Second major paragraph

The sentence "Let's focus on finding the relationship on a weight from the output layer W_2 and the cost function C." should perhaps read something like: "Let's focus on finding the relationship on a weight (W_2) from the second output layer and the cost function C."

Note from the Author or Editor:
Let's change it to: "Let's focus on finding the relationship on a weight (W_2) from the output layer and the cost function C."

Daniel Caron  Aug 07, 2022  Jun 13, 2024
Printed
Page 245
example 7-9

it should say W2, not W1

Note from the Author or Editor:
In that formula above the code example, that dZ2/dW1 should indeed be dZ2/dW2.

Anonymous  Mar 29, 2023  Jun 13, 2024
Printed
Page 252
First paragraph

The sentence "The activation argument specifies the hidden layer," should read something like, "The activation argument specifies which activation function to apply to the nodes contained in the hidden layers."

Note from the Author or Editor:
Change as advised

Daniel Caron  Aug 07, 2022  Jun 13, 2024
PDF
Page 310
Appendix B, Chapter 2 Solutions, Exercise 2

The union probability answer that is given in the book is:

(1 - 0.3) + 0.4 - (.03 x 0.4) = 0.98

There are two serious mistakes here:

1) In the second part of the calculation, the subtraction of the joint probability, the probability of rain is 30%, or 0.3, but the author is multiplying against .03, or 3%. The answer with the calculation provided ends up being 1.088, which is above 1.0, so it's incorrect.
2) Even if you correct for the percentage, the calculation is still incorrect. The calculation for the Union probability is P(A OR B) = P(A) + P(B) - P(A AND B). In this exercise, A stands for (NOT RAIN), which the author calculates correctly for the P(A) part, but in the joint probability he reverts to using P(RAIN) instead of P(NOT RAIN).
The correct calculation should be: (1 - 0.3) + 0.4 - ((1 - 0.3) * 0.4) = 0.82

Note from the Author or Editor:
Change as advised. That line should be:
(1 - 0.3) + 0.4 - ((1 - 0.3) * 0.4) = 0.82

I don't know what I did there. It must have been late.
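
A one-line Python check of the corrected answer (A = "no rain", so P(A) = 0.7, and P(B) = 0.4):

p_a = 1 - 0.3
p_b = 0.4
print(p_a + p_b - (p_a * p_b))  # ~0.82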

Fotis Koutoulakis  Sep 24, 2022  Jun 13, 2024