Errata

Errata for Machine Learning and Security

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version	Location	Description	Submitted by	Date Submitted
Other Digital Version	Location 2309 of the Kindle version, in the ARIMA section below Figure 3-2	In the Kindle version, there is a link to PyFlux. This link no longer seems to be valid and goes to pyflux.com. On my first attempt, it landed me on an “update your flash” page that was likely malware, though subsequent attempts go to a boilerplate landing page.	Daniel	Jun 11, 2020
	Chapter 1, where a true positive is defined	In the Safari Books Online version, the text states that a true positive for spam prediction is: "True positive: predicted spam + actual ham". The text should be: "True positive: predicted spam + actual spam".	Asa Freedman	Jan 28, 2019
	1 Labeling spam or ham code at `import os`	The example in the Safari Books Online edition leaves out the section of code where spam_words and ham_words are compared against the X_test set. The next paragraph then discusses a confusion matrix for this nonexistent data. The whole code is included in the GitHub repository, though, which may or may not be useful to someone typing along with the book.	Asa Freedman	Jan 28, 2019
Printed	Page 160 1st non-code paragraph, first sentence	I apologize up front for my pedantry on this, but the book is about computer security, and computer security people often care about pedantry. The first sentence on page 160 says: "We indeed find some references to the Unix su (super user) privilege escalation command..." Yet su does not stand for "super user"; it stands for either "switch user" or "substitute user", depending on the flavor of Unix one is on. That sentence would read better as: "We indeed find some references to the Unix su (substitute user) privilege escalation command..."	Michal Grochmal	Sep 10, 2018
Printed	Page 31 Code examples	The book uses the following sklearn.metrics functions: accuracy_score and confusion_matrix. After the 3rd paragraph, it says that the accuracy of the model created with LogisticRegression is measured with: accuracy_score(y_pred, y_test) But according to the sklearn docs, the signature is: sklearn.metrics.accuracy_score(y_true, y_pred, normalize=True, sample_weight=None) where y_true is the correct labels (in this case, y_test) and y_pred is the values predicted by the model (in this case, y_pred). The correct usage is therefore: accuracy_score(y_test, y_pred) Even though it does not affect the score (0.99992273816), it is important not to confuse readers and to encourage proper use of the API. Regards, - Miguel	Miguel Diaz	Apr 04, 2018
Printed	Page 26 3rd and 4th paragraph	In the 3rd paragraph, the author talks about classifying "malicious" and "legitimate" traffic using a threshold of 6000 requests over a period of 5 minutes (over 6000 is malicious, and lower than that is legitimate). But the 4th paragraph gives the number "20" as that threshold, when it should be 6000. I think it's important to correct this for the sake of new readers. Regards, - Miguel	Miguel Díaz	Apr 04, 2018
Printed	Page 20 First line (Code example)	The first line of page 20 contains an example of Python code that is incorrect. Original: if len(stems) &lt; 2: continue Fix: if len(stems) < 2: continue I assume this is due to the book's printing software misinterpreting the code, since &lt; is the HTML encoding of <. I think this is important for people who try the code directly from the book instead of downloading it from the original source [1]. [1] https://github.com/oreilly-mlsec/book-resources/blob/master/chapter1/spam-fighting-lsh.ipynb Regards, - Miguel	Miguel Diaz	Apr 02, 2018
	Chapter 3, Decision Forests 1st paragraph	The text reads: "The two most common types of forests used in practice are decision forests and gradient-boosted decision trees". It should read: "The two most common types of forests used in practice are random forests and gradient-boosted decision trees".	Henry	Feb 25, 2018
Printed	Page 21 4th paragraph	The text states: "(the argument random_state=123 is passed in for the sake of result reproducibility)" but the actual code uses "random_state=2".	Mike Eriksson	Feb 23, 2018
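
The Chapter 1 and Page 31 entries above both turn on which argument holds the actual labels and which holds the predictions. The following is a minimal sketch in plain Python, not the book's code and not scikit-learn itself: the helper functions and the label lists are made up for illustration. It shows that a true positive means predicted spam + actual spam, and that swapping the two arguments leaves accuracy unchanged (as the Page 31 report notes) but changes an asymmetric metric such as precision.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for binary labels; `positive` marks the spam class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # true positive: predicted spam + actual spam
        elif predicted == positive:
            fp += 1  # false positive: predicted spam + actual ham
        elif actual == positive:
            fn += 1  # false negative: predicted ham + actual spam
        else:
            tn += 1  # true negative: predicted ham + actual ham
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of samples where the prediction matches the actual label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def precision(y_true, y_pred):
    """Fraction of predicted-spam samples that are actually spam."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp)


# Hypothetical labels for illustration only: 1 = spam, 0 = ham.
y_test = [1, 1, 1, 0, 0, 0, 0, 0]  # actual labels
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]  # labels predicted by some model

# Accuracy only counts matches, so the argument order does not matter.
print(accuracy(y_test, y_pred), accuracy(y_pred, y_test))

# Precision depends on which list is treated as the ground truth.
print(precision(y_test, y_pred), precision(y_pred, y_test))
```

Because only symmetric metrics survive a swap, passing the actual labels first, as in accuracy_score(y_test, y_pred), keeps the call consistent with other metrics that take the same two arguments.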