The errata list is a list of errors and their corrections that were found after the product was released. If the error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date corrected".
The following errata were submitted by our customers and approved as valid errors by the author or editor.
Version |
Location |
Description |
Submitted By |
Date submitted |
Date corrected |
Printed |
Page xvii
3rd paragraph |
For some reason install.packages("nutshell") is not working. It fails to install the package. Please let me know if I am doing something wrong.
Note from the Author or Editor: CRAN deleted nutshell temporarily. It should be available again.
|
Anonymous |
Nov 27, 2010 |
|
PDF |
Page xvii
line 13 |
... available by from ... -> ... available from ...
|
Poni |
May 11, 2010 |
Jan 01, 2011 |
PDF |
Page 24
right at the top, the first example |
The example refers to "a list containing a number and string", but it looks to me like the said list contains a string and a string.
Note from the Author or Editor: That's correct; it should read "a list containing two strings"
|
gabkdlly |
Sep 08, 2011 |
|
Printed |
Page 26
Final paragraph |
This is about order of exposition. You use the 'cars' data set on p26, but where it comes from remains mysterious until you explain at bottom of p.30 that 'cars' is included within the R base package.
Note from the Author or Editor: Great catch; we must have switched around sections in editing.
The explanation for the "cars" data set from page 30 should be moved to 26.
|
Ewan Klein |
Apr 06, 2011 |
|
PDF |
Page 35
DK |
data(consumption) is required before using it
Original:
If you'd like, you can try plotting the relationship using the default settings:
> library(lattice)
> dotplot(Amount~Year|Food, consumption)
Should be:
If you'd like, you can try plotting the relationship using the default settings:
> data(consumption)
> dotplot(Amount~Year|Food, consumption)
Note from the Author or Editor: Correct. Full version should be:
> library(lattice)
> data(consumption)
> dotplot(Amount~Year|Food, consumption)
|
Attila Babo |
Aug 14, 2010 |
|
PDF |
Page 35
second paragraph |
The last word in the paragraph should be: packages (with an "s").
Note from the Author or Editor: Yes, the text should read "for directions on loading packages" not "for directions on loading package"
|
gabkdlly |
Sep 09, 2011 |
|
PDF |
Page 39
last paragraph, first sentence |
"open isource" -> "open source"
Note from the Author or Editor: Oops; the typo is in the first sentence of the last paragraph on page 39 (printed page).
|
gabkdlly |
Sep 09, 2011 |
|
Printed |
Page 44
Item list in middle of page under 'Depends' item |
Text specifies 'earth' package; example uses 'nnet' package.
Note from the Author or Editor: Good catch; the text should read "and the nnet package"
|
Matt Kizerian |
Sep 07, 2011 |
|
Printed |
Page 44
1st paragraph |
third line in 1st paragraph says
the most imporant
> the most important
Note from the Author or Editor: Thanks for correcting this typo
|
Francisco Murphy |
Aug 17, 2012 |
|
Printed |
Page 45
1st para |
in line 6
wrong: % R CMD CHECK nutshell
correct: % R CMD check nutshell
Note from the Author or Editor: The current help file indicates that "check" should be lowercase. This errata is correct; the command should read
% R CMD check nutshell
|
Anonymous |
Jun 04, 2010 |
Jan 01, 2011 |
PDF |
Page 49
1st paragraph, second sentence |
The end of the sentence and thus the meaning of the sentence is missing.
"It's usually a good idea to start by using the check command to make sure that."
Note from the Author or Editor: Ugh, we lost something in editing.
Please change "It's usually a good idea to start by using the check command to make sure that." to "To make sure that the package complies with CRAN rules and builds correctly, use the check command."
|
rmsharp |
Oct 04, 2010 |
|
PDF |
Page 77
2/3 down just above R Code Style Standards header |
The code example reads:
> shopping.list[[c("dairy", "milk")]]
[1] "1 gallon"
However, the result is as follows:
> shopping.list[[c("dairy", "milk")]]
Error in shopping.list[[c("dairy", "milk")]] : no such index at level 1
The following provides the listed result, but I do not think it is what was intended.
> shopping.list[[1]][[1]]
[1] "1 gallon"
Note from the Author or Editor: There is a mistake here, but a different one than the one noted by the reporter.
On the top of page 82, shopping.list is incorrectly defined as:
# WRONG
> shopping.list <- list (dairy, fruit)
This should be defined as
# RIGHT
> shopping.list <- list (dairy=dairy, fruit=fruit)
This will then allow the rest of the example on the page to work correctly.
|
R. Mark Sharp |
Jul 30, 2010 |
Jan 01, 2011 |
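Illustration for the page 77 entry above (a sketch, not text from the book): recursive indexing with a character vector looks elements up by name at each level, so the top-level list needs names. The contents of dairy and fruit below are placeholder values chosen for this example and may differ from the book's data.

```r
# Placeholder sublists; the actual items in the book may differ.
dairy <- list(milk = "1 gallon", butter = "1 pound")
fruit <- list(apples = 6, oranges = 3)

# Unnamed top level: there is no element called "dairy" to look up.
unnamed <- list(dairy, fruit)
names(unnamed)

# Named top level: [[c("dairy", "milk")]] walks into dairy, then milk.
shopping.list <- list(dairy = dairy, fruit = fruit)
shopping.list[[c("dairy", "milk")]]
```

With the unnamed version, only positional access such as unnamed[[1]][[1]] reaches the same element.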
Printed |
Page 84
last paragraph |
"Vectors are used to represent multidimensional data of a single type."
Probably "Arrays..." was intended rather than "Vectors...".
Note from the Author or Editor: Good catch. Please change the text to read "Arrays are used to represent multidimensional data of a single type."
|
Cedric Duprez |
Dec 10, 2010 |
|
Printed |
Page 85
1st graf, 2nd line |
"...matrices don't have an explicit class attribute."
Probably "...arrays..." was intended rather than "...matrices...".
Note from the Author or Editor: At the top of page 91 (in the PDF), within the Arrays section, I state
"Liked matrices, and unlike most other classes, matrices don't have an explicit class attribute"
this should be changed to
"Liked matrices, and unlike most other classes, arrays don't have an explicit class attribute"
|
Anonymous |
Jun 10, 2010 |
Jan 01, 2011 |
PDF |
Page 86
middle of page |
In Category: Vectors; Object type: raw; Example CharToRaw("Hello") should be charToRaw("Hello")
Note from the Author or Editor: good correction
we must have capitalized this incorrectly... please change to charToRaw
|
rmsharp |
Oct 06, 2010 |
|
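Illustration for the page 86 entry above (a sketch, not text from the book): R names are case-sensitive, so only the lowercase-initial spelling resolves to the base function.

```r
# charToRaw converts a string to its raw bytes; rawToChar reverses it.
bytes <- charToRaw("Hello")
bytes
rawToChar(bytes)

# The capitalized spelling is not defined in base R.
exists("CharToRaw")
```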
PDF |
Page 88
Immediately after first paragraph of text |
Example has
> # a vector of four numbers
> v <- c(.295, .300, .250, .287, 215)
> v
[1] 0.295 0.300 0.350 0.287 0.215
However, this example has five numbers. I believe the illustration would fit with the rest of the page if the example was as follows.
> # a vector of four numbers
> v <- c(.295, .300, .250, .287)
> v
[1] 0.295 0.300 0.350 0.287
Note from the Author or Editor: Duh... the comment should read
# a vector of 5 numbers
|
rmsharp |
Oct 06, 2010 |
|
|
105-106
last example |
This error was reported via Safari books online. This is exactly what the reader provided:
Chapter 8.5.1
last example forgot to define the doNothing function
Note from the Author or Editor: Oops, left out the function. Here's the sample function:
doNothing <- function(x) {
message("This function does nothing.")
}
|
Anonymous |
Jul 13, 2010 |
Jan 01, 2011 |
PDF |
Page 109-110
last line on 109 and code on 110 |
There are semicolons at the end of most lines of code within this example. On page 83 in summarizing the Google-Styleguide the following directive was written: "Omit semicolons at the end of lines when they are optional."
Since they are optional in the code on pages 109 and 110, they should be removed.
Note from the Author or Editor: Yes, I think this is a valid point. It's not a serious mistake, but it would improve the book.
I should have stuck with the style guide and omitted the semicolons. (Sorry, old habit from other languages.)
|
R. Mark Sharp |
Jul 31, 2010 |
|
PDF |
Page 118
5th line from the bottom |
The text says "This family of functions is a good alternative to control structures.". However, there is no discussion or example of how they can be used as control structures. Is this topic covered elsewhere?
Note from the Author or Editor: "Control structures" mean language features like conditional statements, loops, and go-to statements.
Suppose that you had a vector of numerical values, and wanted to calculate the square of each element. You could do this using a loop:
> v <- 1:20
> w <- NULL
> for (i in 1:length(v)) {w[i] <- v[i]^2}
> w
[1] 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 324 361 400
However, you can also write this using an "apply" statement like this:
> v <- 1:20
> w <- sapply(v, function(e) {e^2})
> w
[1] 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 324 361 400
|
R. Mark Sharp |
Jul 31, 2010 |
|
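Illustration for the page 118 entry above (a sketch, not text from the book): for this particular calculation there is a third option the note does not mention, R's vectorized arithmetic, which needs neither a loop nor an apply call. The variable names below are chosen for this example.

```r
v <- 1:20

# Explicit loop, preallocating the result vector.
w.loop <- numeric(length(v))
for (i in seq_along(v)) {
  w.loop[i] <- v[i]^2
}

# Apply-family version.
w.sapply <- sapply(v, function(e) e^2)

# Vectorized arithmetic: ^ operates on every element at once.
w.vector <- v^2

all.equal(w.loop, w.sapply)
all.equal(w.loop, w.vector)
```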
Printed |
Page 120
code snippet two thirds of the way down |
Variable "my.TimeSeries" is referred to as "my.TimesSeries", with an extra 's'.
This mistake is repeated in the middle of p121.
Note from the Author or Editor: That is correct. We should change my.TimesSeries to my.TimeSeries wherever it occurs.
|
Steven J |
Jul 31, 2010 |
|
PDF |
Page 122
Last sentence in second paragraph in the Changes to Other Environments section |
"... then R will assign var to value in the global environment."
I think this should be "... then R will assign value to var in the global environment."
Note from the Author or Editor: Yes, this is a good catch. Change to "R will assign the value to var in the global environment."
|
rmsharp |
Oct 06, 2010 |
|
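Illustration for the page 122 entry above (a sketch, assuming the sentence describes the `<<-` operator): `<<-` searches the enclosing environments for an existing variable and updates it there; only when no enclosing environment defines the variable does the assignment land in the global environment. The function and variable names below are hypothetical.

```r
make.counter <- function() {
  count <- 0
  function() {
    # Updates count in the enclosing environment, not a local copy.
    count <<- count + 1
    count
  }
}

counter <- make.counter()
counter()
counter()
```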
Printed |
Page 122
First code example |
attr(, "package") is wrong and probably was meant to say attr(period, "package")
Note from the Author or Editor: Good catch. Somehow we lost the word "period" in editing
|
Lars Francke |
Jun 21, 2011 |
|
Printed |
Page 127
Top section of the page |
In the example at the top of the page, the example slot function calls are incorrect. They should be
slot(birthdate, "month") <- "june"
and further down,
slot(birthdate, "month", check=FALSE) <- "june"
Note from the Author or Editor: Confirmed. We reversed "birthdate" and "month."
|
Colin Gillespie |
Apr 02, 2011 |
|
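Illustration for the page 127 entry above (a sketch, not text from the book): in both slot() and its replacement form, the object comes first and the slot name second. The class definition below is hypothetical and may differ from the one in the book.

```r
library(methods)

# Hypothetical class for illustration only.
setClass("Birthdate", representation(month = "character", day = "numeric"))
birthdate <- new("Birthdate", month = "may", day = 5)

# Object first, then the slot name.
slot(birthdate, "month") <- "june"
slot(birthdate, "month")

# check = FALSE skips the validity check on the assigned value.
slot(birthdate, "month", check = FALSE) <- "july"
birthdate@month
```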
PDF |
Page 131
top paragraph |
In section "Old-School OOP in R: S3", "... you have to know how S3 classes they are implemented." should be corrected to "... you have to know how S3 classes are implemented." by removing "they" in the sentence.
Note from the Author or Editor: Good catch! I am always embarrassed by grammatical errors like this.
|
Anonymous |
Mar 02, 2010 |
Jan 01, 2011 |
PDF |
Page 132
top paragraph |
The top line "S3 classes lack a lot of the structure of S3 objects."
should be changed to ...
"S3 classes lack a lot of the structure of S4 objects."
Note from the Author or Editor: Change to "S3 classes lack the structure of S4 objects."
|
Joseph75010 |
May 27, 2011 |
|
Printed |
Page 133
First full paragraph |
A parenthetical begins, "A closely related function, NextMethod. NextMethod, is used in a method called..."
The first "NextMethod" and its period should be removed.
Note from the Author or Editor: I see this on page 141 of the PDF. Yes, this is a valid correction. Thanks!
|
Steven J |
Jul 31, 2010 |
Jan 01, 2011 |
PDF |
Page 134
Middle of page |
The paragraph* that introduces "initialize" is a bit unclear and would benefit from a little expansion.
*"If you want to call a function after a new object is created, you may want to define an initialize method for the new object. ..."
Although clarified at the top of page 135, it is not clear here that the author is talking about running a function inside the class as a part of the construction process. I think an ordered list of events that occur during instantiation of a class object would be helpful.
Note from the Author or Editor: Good suggestion; I didn't describe that clearly.
Change "If you want to call a function after a new object is created, you may want to define an initialize method for the new object." to
"Whenever you create a new object, R will execute the initialize method of the class if the method is available. Programmers usually use the initialize method to calculate values and assign them to slots."
|
rmsharp |
Oct 07, 2010 |
|
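Illustration for the page 134 entry above (a sketch, not text from the book): an initialize method that lets the default initializer fill the slots passed to new() and then computes a derived slot. The Circle class and its slots are hypothetical names chosen for this example.

```r
library(methods)

# Hypothetical class: area is derived from radius at construction time.
setClass("Circle", representation(radius = "numeric", area = "numeric"))

setMethod("initialize", "Circle", function(.Object, ...) {
  # Let the default initializer assign slots from the arguments to new().
  .Object <- callNextMethod(.Object, ...)
  # Then calculate a value and assign it to another slot.
  if (length(.Object@radius) == 1) {
    .Object@area <- pi * .Object@radius^2
  }
  .Object
})

circle <- new("Circle", radius = 2)
circle@area
```

The order of events here is: new() creates the object from the class prototype, calls initialize, and returns whatever initialize returns.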
PDF |
Page 135
Third to last paragraph; second paragraph in the Working with Objects section |
The function name for slotNames(o) is split across two lines as in slot
Names(o).
Note from the Author or Editor: Also on printed page 127
Please change this text so that slotNames(o) is not split across two lines.
|
rmsharp |
Oct 07, 2010 |
|
PDF |
Page 144
Bottom Section: Use Environments for Lookup Tables |
I often have large data sets and would like to use your suggestion of using the environment and an S4 class to implement the interface. However, I do not know how to do either. Please provide examples or a place to find examples.
Note from the Author or Editor: Sorry, didn't include a lot of information in the book. I wrote this article to explore how to do this:
http://broadcast.oreilly.com/2010/03/lookup-performance-in-r.html
|
rmsharp |
Oct 08, 2010 |
|
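Illustration for the page 144 entry above (a sketch covering only the environment part; it does not reproduce the linked article and shows no S4 wrapper): an environment can act as a keyed lookup table. The keys and values below are made up for this example.

```r
# Hypothetical keys and values.
prices <- new.env(hash = TRUE)
assign("AAPL", 250.5, envir = prices)
assign("IBM", 128.2, envir = prices)
prices[["GE"]] <- 16.4   # [[<- also works on environments

# Look up a key, test for a missing key, and list all keys.
get("IBM", envir = prices)
exists("MSFT", envir = prices, inherits = FALSE)
sort(ls(prices))
```

Passing inherits = FALSE keeps exists() from finding a same-named object in an enclosing environment.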
Printed |
Page 151
before last paragraph |
The load command only works for me if I included the .RData extension in the filename, ie.
load("~/top.5.salaries.RData")
Leaving off the extension I get this error
Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
cannot open compressed file 'C:/Users/xxx/Documents/My R Files/R_in_a_Nutshell/top.5.salaries', probable reason 'No such file or directory'
Note from the Author or Editor: There are two errors on this page, actually
Please change
> save(top.5.salaries,
+ file="C:/Documents and Settings/me/My Documents/top.5.salaries.rda")
to read
> save(top.5.salaries,
+ file="C:/Documents and Settings/me/My Documents/top.5.salaries.Rdata")
Also please change
> load("~/top.5.salaries")
to read
> load("~/top.5.salaries.RData")
|
Anonymous |
Sep 22, 2010 |
|
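Illustration for the page 151 entry above (a sketch, not text from the book): load() needs the same full filename, extension included, that was given to save(). The data frame contents and the temporary path below are placeholders.

```r
# Placeholder data; the book's data frame has different contents.
top.5.salaries <- data.frame(name.last = c("Smith", "Jones"),
                             salary = c(1000000, 900000))

# Build one path and reuse it for both save and load.
path <- file.path(tempdir(), "top.5.salaries.RData")
save(top.5.salaries, file = path)

rm(top.5.salaries)
loaded <- load(path)   # load returns the names of the restored objects
loaded
```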
PDF |
Page 153
Content of top.5.salaries.csv |
The header of the example csv file is missing a comma:
name.last,name.first,team position,salary
It should be:
name.last,name.first,team,position,salary
Note from the Author or Editor: Good catch. There should be a comma where indicated.
|
Anonymous |
Sep 22, 2011 |
|
PDF |
Page 155
code snippet, mid-page |
the paste function call appears to be missing the sep="" argument
Note from the Author or Editor: Good catch. (I'm not sure if we deleted the sep="" during editing, or if the defaults for paste changed.) Here's the correct code:
> sp500 <- read.csv(paste("http://ichart.finance.yahoo.com/table.csv?",
+ "s=%5EGSPC&a=03&b=1&c=1999&d=03&e=1&f=2009&g=m&ignore=.csv",sep=""))
|
Anonymous |
Mar 06, 2010 |
Jan 01, 2011 |
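Illustration for the page 155 entry above (a sketch, not text from the book): paste() separates its arguments with a single space by default, which would corrupt a URL assembled from pieces. The query string below is shortened for this example, and nothing is downloaded.

```r
base <- "http://ichart.finance.yahoo.com/table.csv?"
query <- "s=%5EGSPC&a=03"   # shortened query string, for illustration only

with.space <- paste(base, query)             # default sep = " "
no.space   <- paste(base, query, sep = "")   # pieces joined directly
same       <- paste0(base, query)            # shorthand for sep = ""

grepl(" ", with.space)
identical(no.space, same)
```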
PDF |
Page 166
Bullet point 6 |
Path for the bb.db is wrong:
/Library/Frameworks/R.framework/Resources/library/nutshell/bb.db
Should be:
/Library/Frameworks/R.framework/Resources/library/nutshell/extdata/bb.db
Note from the Author or Editor: This is correct.
I had to move the path to comply with updated CRAN rules for packages.
|
Anonymous |
Sep 22, 2011 |
|
PDF |
Page 183
dow30.tickers snippet |
I had to rearrange the syntax a little (the "to" dates before the "from" dates in the get.quotes syntax on page 182). However, my real comment is that as of March 2010, "GM" is not a valid symbol string. I substituted "MTLQQ.PK"
So:
# define parts of the URL
(base <- "http://ichart.finance.yahoo.com/table.csv?")
(symbol <- paste("s=", ticker[1], sep=""))
# months are numbered from 00 to 11, so format the month correctly
(from.month <- paste("&g=d&a=",formatC(as.integer(format(from,"%m"))-1,width=2,flag="0"),sep=""))
(from.day <- paste("&b=", format(from,"%d"), sep=""))
(from.year <- paste("&c=", format(from,"%Y"), sep=""))
(to.month <- paste("&d=",formatC(as.integer(format(to,"%m"))-1,width=2,flag="0"),sep=""))
(to.day <- paste("&e=", format(to,"%d"), sep=""))
(to.year <- paste("&f=", format(to,"%Y"), sep=""))
(inter <- paste("&g=", interval, sep=""))
(last <- "&ignore=.csv")
# put together the URL
(url <- paste(base, symbol, to.month, to.day, to.year, from.month, from.day, from.year, last, sep=""))
# get the file
tmp <- read.csv(url);
and
dow.tickers <- c("MMM", "AA", "AXP", "T", "BAC", "BA", "CAT", "CVX", "C",
"KO", "DD", "XOM", "GE", "MTLQQ.PK", "HPQ", "HD", "INTC", "IBM",
"JNJ", "JPM", "KFT", "MCD", "MRK", "MSFT", "PFE", "PG",
"UTX", "VZ", "WMT", "DIS")
Really, this ends up being a commentary on the effect of "sea changes" even on fairly stable data sources and datasets.
Note from the Author or Editor: I'm not sure that the order matters for the "from" and "to" parts, but it is true that the stocks in the DJIA did change. (See http://www.djaverages.com/?view=industrial&page=overview for an authoritative list). The current list of tickers is:
dow.tickers <- c("MMM", "AA", "AXP", "T", "BAC", "BA", "CAT", "CVX",
"CSCO", "KO", "DD", "XOM", "GE", "HPQ", "HD", "INTC", "IBM",
"JNJ", "JPM", "KFT", "MCD", "MRK", "MSFT", "PFE", "PG", "TRV",
"UTX", "VZ", "WMT", "DIS")
|
Anonymous |
Mar 06, 2010 |
Jan 01, 2011 |
PDF |
Page 192
Definition of function get.quotes |
# months are numbered from 00 to 11, so format the month correctly
from.month <- paste("&a=",
formatC(as.integer(format(from,"%m"))-1,width=2,flag="0"),
sep="");
from.day <- paste("&b=", format(from,"%d"), sep="");
from.year <- paste("&c=", format(from,"%Y"), sep="");
---Here the argument should be "&d=", not "&a".
to.month <- paste("&a=",
formatC(as.integer(format(to,"%m"))-1,width=2,flag="0"),
sep="");
---
to.day <- paste("&e=", format(to,"%d"), sep="");
to.year <- paste("&f=", format(to,"%Y"), sep="");
inter <- paste("&g=", interval, sep="");
last <- "&ignore=.csv";
Note from the Author or Editor: Good catch.
Here is the full (corrected) code for clarity:
get.quotes <- function(ticker, from=(Sys.Date()-365), to=(Sys.Date()), interval="d") {
# define parts of the URL
base <- "http://ichart.finance.yahoo.com/table.csv?";
symbol <- paste("s=", ticker, sep="");
# months are numbered from 00 to 11, so format the month correctly
from.month <- paste("&a=",
formatC(as.integer(format(from,"%m"))-1,width=2,flag="0"),
sep="");
from.day <- paste("&b=", format(from,"%d"), sep="");
from.year <- paste("&c=", format(from,"%Y"), sep="");
to.month <- paste("&d=",
formatC(as.integer(format(to,"%m"))-1,width=2,flag="0"),
sep="");
to.day <- paste("&e=", format(to,"%d"), sep="");
to.year <- paste("&f=", format(to,"%Y"), sep="");
inter <- paste("&g=", interval, sep="");
last <- "&ignore=.csv";
# put together the url
url <- paste(base, symbol, from.month, from.day, from.year,
to.month, to.day, to.year, inter, last, sep="");
# get the file
tmp <- read.csv(url);
cbind(symbol=ticker,tmp);
}
|
Anonymous |
Dec 08, 2010 |
|
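Illustration for the page 192 entry above (a sketch, not text from the book): building only the URL makes it easy to check that the "from" date maps to a, b, c and the "to" date maps to d, e, f. The helper name quote.url is hypothetical, nothing is downloaded, and the Yahoo endpoint may no longer respond.

```r
# Hypothetical helper that only builds the URL string.
quote.url <- function(ticker, from, to, interval = "d") {
  # Months are numbered 00 to 11 in the query string.
  month0 <- function(d) {
    formatC(as.integer(format(d, "%m")) - 1, width = 2, flag = "0")
  }
  paste0("http://ichart.finance.yahoo.com/table.csv?",
         "s=", ticker,
         "&a=", month0(from), "&b=", format(from, "%d"), "&c=", format(from, "%Y"),
         "&d=", month0(to), "&e=", format(to, "%d"), "&f=", format(to, "%Y"),
         "&g=", interval, "&ignore=.csv")
}

quote.url("IBM", as.Date("2010-01-15"), as.Date("2010-12-08"))
```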
PDF |
Page 196
Last Paragraph |
Where it says "...create a matrix with four rows of five elements" it should say "create a matrix with five rows of four elements", since that is what the example uses.
Note from the Author or Editor: should be page 186 of the book
Yes, the numbers are reversed
|
SamuelM |
Aug 05, 2010 |
|
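Illustration for the page 196 entry above (a sketch; the book's exact call may differ): nrow counts rows and ncol counts the elements in each row, and dim() reports them in that order.

```r
# Five rows of four elements each.
m <- matrix(1:20, nrow = 5, ncol = 4)
dim(m)
length(m[1, ])
```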
Printed |
Page 201
1st graf |
"When you call t on a vector, the vector is treated as a single row of a matrix. So, the value returned by t will be a matrix with a single column."
This is backwards. A vector is a single column:
> v
[1] 1 2 3 4 5 6 7 8 9 10
> dim(as.matrix(v))
[1] 10 1
And after t'ing, becomes a single row:
> t(v)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 2 3 4 5 6 7 8 9 10
> dim(as.matrix(t(v)))
[1] 1 10
Note from the Author or Editor: Yes, the reporter is correct. The text should read:
"When you call t on a vector, the vector is treated as a single column of a matrix. So the value returned by t will be a matrix with a single row."
|
Anonymous |
Jun 10, 2010 |
Jan 01, 2011 |
Printed |
Page 202
2nd paragraph |
Unless I'm missing something, I think the end of the paragraph should read
The left side of the formula represents the vector to be unstacked (in this case, Close). The right side indicates the groups to create (in this case, symbol).
Note from the Author or Editor: Correct; please change the sentence "The left side of the formula represents the vector to be unstacked (in this case, symbol). The right side indicates the groups to create (in this case Close)." to "The left side of the formula represents the vector to be unstacked (in this case, Close). The right side indicates the groups to create (in this case, symbol)."
|
Anonymous |
Sep 23, 2010 |
|
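Illustration for the page 202 entry above (a sketch, not text from the book): in the formula given to unstack(), the left side names the values to spread out and the right side names the grouping variable. The data frame below is made up; only the column names Close and symbol follow the entry.

```r
# Made-up quotes for two tickers, three observations each.
quotes <- data.frame(
  symbol = rep(c("AA", "BA"), each = 3),
  Close  = c(10.1, 10.4, 10.2, 60.3, 61.0, 59.8)
)

# Close is unstacked into one column per symbol.
wide <- unstack(quotes, Close ~ symbol)
wide
```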
Printed |
Page 226
1st code example |
This code sample seems to have multiple typos, unless I'm missing something. The first argument to transform is batting.w.names.2008, this seems like it should be batting.2008. In the same expression, "bat=" seems like it should be "bats=" as the next command requires bats to be a factor. Finally, AVG doesn't exist--I assume it's H/AB.
Corrected code:
data(batting.2008)
batting.w.names.2008 <- transform(batting.2008, bats=as.factor(bats), throws=as.factor(throws))
batting.w.names.2008$AVG <- batting.w.names.2008$H/batting.w.names.2008$AB
cdplot(bats~G_batting, data=batting.w.names.2008, subset=(batting.w.names.2008$AB>100))
Note from the Author or Editor: I think this looks correct. I believe that batting.w.names.2008 is defined elsewhere in the book, but I can't find it in 2 minutes (so it's not fair to expect readers to find it). Here's a corrected example:
> batting.w.names.2008 <- transform(batting.2008,
+ AVG=H/AB, bats=as.factor(bats), throws=as.factor(throws))
> cdplot(bats~AVG,data=batting.w.names.2008,
+ subset=(batting.w.names.2008$AB>100))
|
Anonymous |
Mar 19, 2010 |
|
PDF |
Page 232
The bottom of the page |
Dim of the data is 562 253. Still the book says "select all 253 rows in reverse order" while the author selects columns first, and then rows.
> # load the data:
> library(nutshell)
> data(yosemite)
> # check dimensions of data
> dim(yosemite)
[1] 562 253
> # select all 253 rows in reverse order
> yosemite.flipped <- yosemite[,seq(from=253,to=1)]
Note from the Author or Editor: Oops, those are columns. Corrected in the latest draft.
|
Anonymous |
Jul 18, 2012 |
|
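Illustration for the page 232 entry above (a sketch on a small made-up matrix rather than the yosemite data): an index after the comma selects columns, so a descending sequence there reverses the column order and leaves the rows alone.

```r
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns

# Reverse the columns; the empty row index keeps every row.
flipped <- m[, seq(from = ncol(m), to = 1)]
flipped
dim(flipped)
```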
PDF |
Page 237
Bottom code example |
I believe "acres.harvested" should be replaced with "domestic.catch.2006".
Note from the Author or Editor: Oops, changed the example while writing, but left in the old code. The example should have read
> pie(domestic.catch.2006, init.angle=100, cex=.6)
|
Anonymous |
Oct 18, 2010 |
|
Printed |
Page 257
Bottom--abline examples |
In the fourth comment, "y-intercept" is misspelled as "y-inctercept."
Note from the Author or Editor: Good find
Please change "y-inctercept" to "y-intercept"
|
Jeffrey Tyzzer |
Jan 31, 2011 |
|
Printed |
Page 265
Above Figure 15-1 |
Figure 15-2, Simple scatter plot with conditioning variable is ostensibly generated with the following statement:
> package(lattice)
> xyplot(y~x|z, data=d)
This code should instead read:
> library(lattice)
> xyplot(y~x|z, data=d)
Note from the Author or Editor: This is correct. Thanks for catching this!
|
Anonymous |
Mar 22, 2010 |
Jan 01, 2011 |
Printed |
Page 274
Final code block |
The block begins with the comment:
# code to create file: remove from final version of book
It goes on to include code to generate Figure 15-7 for the book, rather than just the example.
Note from the Author or Editor: The report is correct. This text should have been cut from the book. It's not wrong, just distracting and not helpful.
|
Brendan Hickey |
Jun 29, 2010 |
Jan 01, 2011 |
Printed |
Page 302
First text paragraph |
To demonstrate the levelplot function, we are asked to use the latitude and longitude variables. However as far as I can tell, although these variables are described on p291, they have been replaced in the included data set (from the nutshell package) with the single variable neighborhood. Hence we are unable to work through the examples on p302 and 303.
Note from the Author or Editor: Good catch.
I will fix this by uploading an updated data set to CRAN.
|
Kendra Vant |
Mar 07, 2011 |
|
Printed |
Page 325
3rd paragraph |
stem(subset(field.goals,PLTYPE=="FG no")$YARDS) should be stem(subset(field.goals,play.type=="FG no")$yards) in order to work with the data set
Note from the Author or Editor: Good catch; changing to the correct code in the second edition.
|
patrick |
Oct 12, 2011 |
|
Printed |
Page 334
2nd paragraph, code |
I typed the example code for bootstrapping, and R was unable to find the data set home.sale.prices.june2008.
Could you clarify where this data can be obtained?
Note from the Author or Editor: The nutshell package was removed from CRAN. I am updating and correcting the package now.
|
Kristin Mull |
Aug 04, 2011 |
|
Printed |
Page 336
Bottom of first code block |
The last four lines of the code block should surely read
> # what is the probability that the value is within
> # 1.96 standard deviations of the mean?
> pnorm(1.96,lower.tail=TRUE) - pnorm(-1.96,lower.tail=TRUE)
[1] 0.9500042
A sanity check tells us that the probability that a value falls WITHIN 1.96 standard deviations of the mean should be a big number not a small one!
Note from the Author or Editor: Good catch.
The code above is correct
|
Kendra Vant |
Mar 10, 2011 |
|
PDF |
Page 344
5th paragraph ("Here's an explanation of the output...") |
The text reads: "The p-value means that the probability that the mean value from an actual sample was higher than 10.812 (or lower than 7.288) was 0.4684. This means that it is very likely that the true mean time to failure was 10."
To the best of my understanding there is a mix of a few typos and a conceptual mistake here:
1) ".. was higher than 10.812" must read "... was higher than 10.182" instead (i.e. probability that the mean of the sample randomly drawn from normal with mu=9 would be higher than the mean of the sample at hand, which is 10.182).
2) "...(or lower than 7.288)" must read "...(or lower than 7.818)". Indeed, the observed average 10.182 is 1.182 *higher* than the postulated mu=9; since we are running two-sided t-test, we also account for samples with mean lower than 9 by the same amount, i.e. lower than 9-1.182=7.818
3) The last sentence: "... likely that the true mean time to failure was 10" was probably intended to read "... likely that the true mean time to failure was 9" since mu=9 was the null hypothesis we were testing against.
4) But there is actually a conceptual mistake here ( 3) above). We observed a sample with mean 10.182. How can this make mu=9 likely??? In the context of hypothesis testing and t-test in particular, null is *postulated*. The testing is not symmetric, in a sense that if we observe a highly unusual value (e.g. something with mean 28 instead of postulated mu=9), then we can *reject* the null. But if we observe something consistent with the null (as it is the case with the example at hand: p-value ~0.4 tells us that we could easily get the sample with mean 10.182 or even greater from normal with mu=9), this does not make the null (more) likely. We just can not reject it. For all we know, null could be still wrong, but our sample is just too small. Maybe if we got more data we would be still able to reject the null. So the last sentence should read, I believe, something like: "This means that the observed sample does not give us any basis to reject our null assumption that the mean time to failure is 9".
credit: discovered by my student
Note from the Author or Editor: Here are the changes to make to correct the text:
(a) Change the sentence "The p-value means that the probability that the mean value from an actual sample was higher than 10.812 (or lower than 7.288) was 0.4684" to read "The p-value means that the probability that the mean value from an actual sample was higher than 10.182 (or lower than 7.818) was 0.4684"
(b) Remove the sentence "This means that it is very likely that the true mean time to failure was 10."
(c) In the last paragraph, after the sentence "Next, the t.test function shows the 95% confidence intervals for this test, and, finally, it gives the actual mean." add the line "As a statistician would say, this evidence does not imply that the true mean was not equal to 9."
|
Andrey Sivachenko |
Jun 05, 2011 |
|
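Illustration for the page 344 entry above (a sketch of the arithmetic only, using the two numbers quoted in the report): for a two-sided test, the bounds sit symmetrically around the hypothesized mean, at the same distance as the observed sample mean.

```r
mu <- 9                # hypothesized mean from the example
sample.mean <- 10.182  # observed sample mean quoted in the report

# Distance of the observed mean from mu, mirrored on both sides.
offset <- sample.mean - mu
bounds <- c(lower = mu - offset, upper = mu + offset)
bounds
```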
Printed |
Page 344
second para |
Here and through the text, the author equates the "null" hypothesis with the "alternative" hypothesis. These are not equivalents, but opposites.
Note from the Author or Editor: I checked the text for places where I confused "null hypothesis" and "alternative hypothesis." Here's the list of corrections:
- On page 344, change the sentence "One common question is to ask if the mean of the experimental data is different from what it would be if the hypothesis was not true (which is called the null hypothesis or the alternative hypothesis)." to "One common question is to ask if the mean of the experimental data is close to what the experimenter expected; this is called the null hypothesis. Alternately, the experimenter may calculate the probability that an alternative hypothesis was true."
- On page 362, change the sentence "The alternative (or null) hypothesis is that the two variables are not independent." to "The alternative hypothesis is that the two variables are not independent."
These are serious technical mistakes, but luckily they're isolated to a couple instances.
|
Anonymous |
Feb 13, 2010 |
Mar 01, 2010 |
Printed |
Page 345
throughout |
Text in the first graf, "Suppose we thought... tires of type H should last... 8 hours until failure" doesn't jibe with the code example, wherein "mu=9", or text in the last graf, "true mean is not equal to 9."
Note from the Author or Editor: Good catch.
The text should be changed from:
"Suppose that we thought, a priori, that tires of type H should last for approximately 8 hours until failure."
to
"Suppose that we thought, a priori, that tires of type H should last for approximately 9 hours until failure."
|
Anonymous |
Jun 29, 2010 |
Jan 01, 2011 |
Printed |
Page 355
355 |
On the top of page 355, I think there is a misinterpretation of the p-value from the Shapiro-Wilk normality test.
A p-value < 0.05 indicates that the data is likely not normally distributed. Therefore the sentence at the top should say:
"As you can see from the p-value, its quite likely that this data is *not* normally distributed."
Note from the Author or Editor: The test is covered on pages 376-377. This errata is correct; a p value < .05 indicates that the data is not normally distributed. Therefore, the text should read,
"As you can see from the p-value, it is quite likely that this data is not normally distributed."
|
Ethan Cerami |
Apr 05, 2010 |
Jan 01, 2011 |
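Illustration for the page 355 entry above (a sketch on simulated data, not the book's data set): the null hypothesis of the Shapiro-Wilk test is that the sample is normally distributed, so a small p-value is evidence against normality.

```r
set.seed(1)

# Exponential draws are strongly skewed, so clearly not normal.
skewed <- rexp(100)
result <- shapiro.test(skewed)
result$p.value

# A p-value below 0.05 leads us to reject normality.
result$p.value < 0.05
```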
Printed |
Page 356
1st paragraph |
For the Kolmogorov-Smirnov test, a significant p-value suggests that the distributions are different, not the same (as stated in the text).
"the two vectors likely came from the same distribution" should be "the two vectors likely did not come from the same distribution".
Note from the Author or Editor: Please change the text from
"According to the p-value, the two vectors likely came from the same distribution (at a 95% confidence level)."
to
"The p-value of this test was 0.0168, which is much less than .05. So, this test implies that it was not likely that these two samples came from the same distribution."
|
Andrew Walsh |
Apr 06, 2011 |
|
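Illustration for the page 356 entry above (a sketch on simulated data, not the book's vectors): the null hypothesis of the two-sample Kolmogorov-Smirnov test is that both samples come from the same distribution, so a small p-value suggests they do not.

```r
set.seed(1)

# Two normal samples whose means differ by one standard deviation.
x <- rnorm(200)
y <- rnorm(200, mean = 1)

ks <- ks.test(x, y)
ks$p.value

# A p-value below 0.05 suggests different underlying distributions.
ks$p.value < 0.05
```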
Printed |
Page 357
heading |
Use of the term "distribution-free tests" is archaic. These are non-parametric tests, which term is used once later in the text. You will just confuse readers.
Note from the Author or Editor: I think this error might be valid. Peter Dalgaard uses the term "distribution-free tests" in the book "Introductory Statistics with R." The term "non-parametric" may be preferable. It is used in
- a section header on page 357
- in the sentence "A good alternative to the tests described in 'Normal Distribution-Based Tests' on page 344 are distribution-free tests."
- in the sentence "The Wilcoxon test is the distribution-free equivalent to the t-test:"
- in the line "The Kruskal-Wallis rank sum test is a distribution-free equivalent to ANOVA analysis:"
- In the header "Distribution-Free Tabular Data Tests" on page 368
- In the line "The Friedman rank sum test is a distribution-free relative of the chi-squared test. In R, this is implemented through the friedman.test function:" on page 368
In each case, I believe that we should replace "distribution-free" with "non-parametric"
|
Anonymous |
Feb 13, 2010 |
Mar 01, 2010 |
Printed |
Page 364
at the bottom of the page, section 'Tabular Data Tests' |
Instead of 5325 there should be 5326 in the numerator.
Note from the Author or Editor: This is correct; the code should read:
> 5326 / (5326 + 6067)
[1] 0.46748
|
Rok Krsmanc |
Aug 03, 2011 |
|
PDF |
Page 368
first paragraph |
The statement
"The Friedman rank sum test is a distribution-free relative of the chi-squared test."
is not quite correct. It should say
"The Friedman rank sum test is a non-parametric counterpart to two way ANOVA tests."
|
Joseph Adler |
May 07, 2010 |
Jan 01, 2011 |
Printed |
Page 378-379
throughout |
Apparent editing or cut/paste problem -- the 3 grafs on page 377 beginning "To get the list of coefficients for the model..." is repeated verbatim on pages 378-379.
Note from the Author or Editor: This is correct. The text on the bottom of page 378 (printed book) starting with "To get the list of coefficients for a model object..." and ending with "Alternatively, you can use the alias coefficients to access the coef function" on the top of page 379 should all be removed.
|
Anonymous |
Jun 29, 2010 |
Jan 01, 2011 |
Printed |
Page 381
middle/ bottom |
List of plot outputs in bullet-point format on bottom 1/3 of page isn't in the same order as "caption=..." list in the code sample above.
Note from the Author or Editor: The author is correct. The bullet list would be more clear if it matched the order shown in the function definition.
|
Anonymous |
Jun 29, 2010 |
Jan 01, 2011 |
Printed |
Page 385
in the middle of the page, section 'Assumptions of Least Squares Regression' |
Function name 'ncv.test' is deprecated. Use 'ncvTest' instead.
Note from the Author or Editor: Yes, it appears that the library was changed. This was correct at the time of writing.
|
Rok Krsmanc |
Aug 03, 2011 |
|
PDF |
Page 385
3rd and 4th paragraphs |
All of these
> hpi.lm <- lm(Index~Year,data=shiller.index)
> hpi.rlm <- rlm(Index~Year,data=shiller.index)
> hpi.lqs <- lqs(Index~Year,data=shiller.index)
Should be
> hpi.lm <- lm(Real.Home.Price.Index~Year,data=schiller.index)
> hpi.rlm <- rlm(Real.Home.Price.Index~Year,data=schiller.index)
> hpi.lqs <- lqs(Real.Home.Price.Index~Year,data=schiller.index)
And then
> plot(hpi,pch=19,cex=0.3)
Should be
> plot(schiller.index,pch=19,cex=0.3)
Note from the Author or Editor: Good catch; I cleaned up the sample data while writing the book, but didn't fix all the examples. Thanks for sending this in!
|
Maksim Djackov |
Aug 16, 2011 |
|
PDF |
Page 387, 402-404
middle, bottom |
We misspelled Robert Shiller's name. (In the book, it is spelled as "Schiller," but should be "Shiller.")
Errors are also found on page 402, 403, and 404.
|
Joseph Adler |
Mar 22, 2010 |
Jan 01, 2011 |
PDF |
Page 590
5th entry from top |
The definition for the "timestamp" function is incorrect; it should be "Writes a timestamp (or other message) into the history and echos it to the console."
|
Joseph Adler |
Mar 13, 2010 |
Jan 01, 2011 |