Errata for R in a Nutshell

The errata list is a list of errors and their corrections that were found after the product was released. If an error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date Corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.

Color key: Serious technical mistake, Minor technical mistake, Language or formatting error, Typo, Question, Note, Update

Version Location Description Submitted By Date submitted Date corrected
Printed
Page xvii
3rd paragraph

For some reason install.packages("nutshell") is not working. It fails to install the package. Please let me know if I am doing something wrong.

Note from the Author or Editor:
CRAN deleted nutshell temporarily. It should be available again.

Anonymous  Nov 27, 2010 
PDF
Page xvii
line 13

... available by from ... -> ... available from ...

Poni  May 11, 2010  Jan 01, 2011
PDF
Page 24
right at the top, the first example

The example refers to "a list containing a number and string", but it looks to me like the said list contains a string and a string.

Note from the Author or Editor:
That's correct; it should read "a list containing two strings"

gabkdlly  Sep 08, 2011 
Printed
Page 26
Final paragraph

This is about order of exposition. You use the 'cars' data set on p26, but where it comes from remains mysterious until you explain at bottom of p.30 that 'cars' is included within the R base package.

Note from the Author or Editor:
Great catch; we must have switched around sections in editing.

The explanation for the "cars" data set from page 30 should be moved to 26.

Ewan Klein  Apr 06, 2011 
PDF
Page 35
DK

data(consumption) is required before using it

Original:

If you'd like, you can try plotting the relationship using the default settings:
> library(lattice)
> dotplot(Amount~Year|Food, consumption)

Should be:

If you'd like, you can try plotting the relationship using the default settings:
> data(consumption)
> dotplot(Amount~Year|Food, consumption)

Note from the Author or Editor:
Correct. Full version should be:

> library(lattice)
> data(consumption)
> dotplot(Amount~Year|Food, consumption)

Attila Babo  Aug 14, 2010 
PDF
Page 35
second paragraph

The last word in the paragraph should be: packages (with an "s").

Note from the Author or Editor:
Yes, the text should read "for directions on loading packages" not "for directions on loading package"

gabkdlly  Sep 09, 2011 
PDF
Page 39
last paragraph, first sentence

"open isource" -> "open source"

Note from the Author or Editor:
Oops; the typo is in the first sentence of the last paragraph on page 39 (printed page).

gabkdlly  Sep 09, 2011 
Printed
Page 44
Item list in middle of page under 'Depends' item

Text specifies 'earth' package; example uses 'nnet' package.

Note from the Author or Editor:
Good catch; the text should read "and the nnet package"

Matt Kizerian  Sep 07, 2011 
Printed
Page 44
1st paragraph

third line in 1st paragraph says
the most imporant
> the most important

Note from the Author or Editor:
Thanks for correcting this typo

Francisco Murphy  Aug 17, 2012 
Printed
Page 45
1st para

in line 6

wrong: % R CMD CHECK nutshell

correct: % R CMD check nutshell

Note from the Author or Editor:
The current help file indicates that "check" should be lowercase. This errata is correct; the command should read

% R CMD check nutshell

Anonymous  Jun 04, 2010  Jan 01, 2011
PDF
Page 49
1st paragraph, second sentence

The end of the sentence and thus the meaning of the sentence is missing.
"It's usually a good idea to start by using the check command to make sure that."

Note from the Author or Editor:
Ugh, we lost something in editing.

Please change "It's usually a good idea to start by using the check command to make sure that." to "To make sure that the package complies with CRAN rules and builds correctly, use the check command."

rmsharp  Oct 04, 2010 
PDF
Page 77
2/3 down just above R Code Style Standards header

The code example reads:
> shopping.list[[c("dairy", "milk")]]
[1] "1 gallon"

However, the result is as follows:

> shopping.list[[c("dairy", "milk")]]
Error in shopping.list[[c("dairy", "milk")]] : no such index at level 1

The following provides the listed result, but I do not think it is what was intended.

> shopping.list[[1]][[1]]
[1] "1 gallon"

Note from the Author or Editor:
There is a mistake here, but a different one than the one noted by the reporter.

On the top of page 82, shopping.list is incorrectly defined as:

# WRONG
> shopping.list <- list (dairy, fruit)

This should be defined as

# RIGHT
> shopping.list <- list (dairy=dairy, fruit=fruit)

This will then allow the rest of the example on the page to work correctly.
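
A minimal sketch of why the names matter. The contents of dairy and fruit below are assumed for illustration and may not match the book's values: recursive indexing by name only works when the elements of the outer list are themselves named.

```r
# Illustrative contents (assumed for this sketch)
dairy <- list(milk = "1 gallon", butter = "1 pound", eggs = 12)
fruit <- list(apples = 6, oranges = 3, bananas = 10)

# Unnamed outer list: elements can only be reached by position
unnamed.list <- list(dairy, fruit)
by.position <- unnamed.list[[c(1, 1)]]

# Recursive lookup by name fails because the outer list has no names
by.name.unnamed <- tryCatch(unnamed.list[[c("dairy", "milk")]],
                            error = function(e) e)

# Named outer list: recursive lookup by name works
shopping.list <- list(dairy = dairy, fruit = fruit)
by.name <- shopping.list[[c("dairy", "milk")]]
```

Here by.name.unnamed holds the error condition that the reporter saw, while by.position and by.name both retrieve the first element of dairy.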

R. Mark Sharp  Jul 30, 2010  Jan 01, 2011
Printed
Page 84
last paragraph

"Vectors are used to represent multidimensional data of a single type."

Probably "Arrays..." was intended rather than "Vectors...".

Note from the Author or Editor:
Good catch. Please change the text to read "Arrays are used to represent multidimensional data of a single type."

Cedric Duprez  Dec 10, 2010 
Printed
Page 85
1st graf, 2nd line

"...matrices don't have an explicit class attribute."

Probably "...arrays..." was intended rather than "...matrices...".

Note from the Author or Editor:
At the top of page 91 (in the PDF), within the Arrays section, I state

"Like matrices, and unlike most other classes, matrices don't have an explicit class attribute"

this should be changed to

"Like matrices, and unlike most other classes, arrays don't have an explicit class attribute"
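
A short check of the corrected statement, as a sketch with made-up objects: neither a matrix nor an array carries a stored class attribute; class() reports an implicit class derived from the dim attribute.

```r
a <- array(1:24, dim = c(2, 3, 4))  # three-dimensional array
m <- matrix(1:6, nrow = 2)          # two-dimensional matrix

# The only stored attribute is "dim"; there is no explicit "class" attribute
array.attributes <- names(attributes(a))
array.has.class.attr <- !is.null(attr(a, "class"))
matrix.has.class.attr <- !is.null(attr(m, "class"))

# class() still returns an implicit class for dispatch
array.class <- class(a)
```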

Anonymous  Jun 10, 2010  Jan 01, 2011
PDF
Page 86
middle of page

In Category: Vectors; Object type: raw; Example CharToRaw("Hello") should be charToRaw("Hello")

Note from the Author or Editor:
good correction

we must have capitalized this incorrectly... please change to charToRaw

rmsharp  Oct 06, 2010 
PDF
Page 88
Immediately after first paragraph of text

Example has
> # a vector of four numbers
> v <- c(.295, .300, .250, .287, 215)
> v
[1] 0.295 0.300 0.350 0.287 0.215

However, this example has five numbers. I believe the illustration would fit with the rest of the page if the example was as follows.
> # a vector of four numbers
> v <- c(.295, .300, .250, .287)
> v
[1] 0.295 0.300 0.350 0.287

Note from the Author or Editor:
Duh... the comment should read

# a vector of 5 numbers

rmsharp  Oct 06, 2010 
Page 105-106
last example

This error was reported via Safari books online. This is exactly what the reader provided:

Chapter 8.5.1
last example forgot to define the doNothing function

Note from the Author or Editor:
Oops, left out the function. Here's the sample function:

doNothing <- function(x) {
  message("This function does nothing.")
}

Anonymous  Jul 13, 2010  Jan 01, 2011
PDF
Page 109-110
last line on 109 and code on 110

There are semicolons at the end of most lines of code within this example. On page 83, in summarizing the Google style guide, the following directive was written: "Omit semicolons at the end of lines when they are optional."

Since they are optional in the code on pages 109 and 110, they should be removed.

Note from the Author or Editor:
Yes, I think this is a valid point. It's not a serious mistake, but it would improve the book.

I should have stuck with the style guide and omitted the semicolons. (Sorry, old habit from other languages.)

R. Mark Sharp  Jul 31, 2010 
PDF
Page 118
5th line from the bottom

The text says "This family of functions is a good alternative to control structures.". However, there is no discussion or example of how they can be used as control structures. Is this topic covered elsewhere?

Note from the Author or Editor:
"Control structures" mean language features like conditional statements, loops, and go-to statements.

Suppose that you had a vector of numerical values, and wanted to calculate the square of each element. You could do this using a loop:

> v <- 1:20
> w <- NULL
> for (i in 1:length(v)) {w[i] <- v[i]^2}
> w
[1] 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 324 361 400

However, you can also write this using an "apply" statement like this:

> v <- 1:20
> w <- sapply(v, function(e) {e^2})
> w
[1] 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 324 361 400

R. Mark Sharp  Jul 31, 2010 
Printed
Page 120
code snippet two thirds of the way down

Variable "my.TimeSeries" is referred to as "my.TimesSeries", with an extra 's'.

This mistake is repeated in the middle of p121.

Note from the Author or Editor:
That is correct. We should change my.TimesSeries to my.TimeSeries wherever it occurs.

Steven J  Jul 31, 2010 
PDF
Page 122
Last sentence in second paragraph in the Changes to Other Environments section

"... then R will assign var to value in the global environment."

I think this should be "... then R will assign value to var in the global environment."

Note from the Author or Editor:
Yes, this is a good catch. Change to "R will assign the value to var in the global environment."
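
A small sketch of the behavior described, assuming the passage refers to the <<- operator. The function and variable names are hypothetical: when no enclosing function environment defines the symbol, <<- creates it in the global environment.

```r
set.total <- function(x) {
  # 'running.total.demo' is not defined in any enclosing function,
  # so <<- assigns the value to it in the global environment
  running.total.demo <<- x
  invisible(NULL)
}

set.total(42)

# The variable now exists in the global environment
found <- exists("running.total.demo", envir = globalenv(), inherits = FALSE)
value <- get("running.total.demo", envir = globalenv())
```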

rmsharp  Oct 06, 2010 
Printed
Page 122
First code example

attr(, "package") is wrong and probably was meant to say attr(period, "package")

Note from the Author or Editor:
Good catch. Somehow we lost the word "period" in editing

Lars Francke  Jun 21, 2011 
Printed
Page 127
Top section of the page

In the example at the top of the page, the example slot function calls are incorrect. They should be

slot(birthdate, "month") <- "june"

and further down,

slot(birthdate, "month", check=FALSE) <- "june"

Note from the Author or Editor:
Confirmed. We reversed "birthdate" and "month."
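
For reference, a self-contained sketch of the corrected argument order. The class definition here is hypothetical and only stands in for whatever class birthdate has in the book: the object comes first, then the slot name.

```r
# Hypothetical class, defined only so the example runs on its own
setClass("DateOfBirth",
         representation(month = "character", day = "numeric"))
birthdate <- new("DateOfBirth", month = "may", day = 1)

# Object first, slot name second
slot(birthdate, "month") <- "june"
month.after.first <- birthdate@month

# Same call, skipping the validity check on the new value
slot(birthdate, "month", check = FALSE) <- "july"
month.after.second <- birthdate@month
```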

Colin Gillespie  Apr 02, 2011 
PDF
Page 131
top paragraph

In section "Old-School OOP in R: S3", "... you have to know how S3 classes they are implemented." should be corrected to "... you have to know how S3 classes are implemented." by removing "they" in the sentence.

Note from the Author or Editor:
Good catch! I am always embarrassed by grammatical errors like this.

Anonymous  Mar 02, 2010  Jan 01, 2011
PDF
Page 132
top paragraph

The top line "S3 classes lack a lot of the structure of S3 objects."

should be changed to ...

"S3 classes lack a lot of the structure of S4 objects."

Note from the Author or Editor:
Change to "S3 classes lack the structure of S4 objects."

Joseph75010  May 27, 2011 
Printed
Page 133
First full paragraph

A parenthetical begins, "A closely related function, NextMethod. NextMethod, is used in a method called..."

The first "NextMethod" and its period should be removed.

Note from the Author or Editor:
I see this on page 141 of the PDF. Yes, this is a valid correction. Thanks!

Steven J  Jul 31, 2010  Jan 01, 2011
PDF
Page 134
Middle of page

The paragraph* that introduces "initialize" is a bit unclear and would benefit from a little expansion.

*"If you want to call a function after a new object is created, you may want to define an initialize method for the new object. ..."

Although clarified at the top of page 135, it is not clear here that the author is talking about running a function inside the class as a part of the construction process. I think an ordered list of events that occur during instantiation of a class object would be helpful.

Note from the Author or Editor:
Good suggestion; I didn't describe that clearly.

Change "If you want to call a function after a new object is created, you may want to define an initialize method for the new object." to

"Whenever you create a new object, R will execute the initialize method of the class if the method is available. Programmers usually use the initialize method to calculate values and assign them to slots."
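
As an illustration of the revised wording, here is a minimal sketch with a hypothetical Circle class that is not taken from the book: new() runs the initialize method, which fills the slots from the supplied arguments and then computes a derived slot.

```r
# Hypothetical class with one supplied slot and one derived slot
setClass("Circle",
         representation(radius = "numeric", area = "numeric"))

setMethod("initialize", "Circle", function(.Object, ...) {
  # Let the default method assign the slots passed to new()
  .Object <- callNextMethod(.Object, ...)
  # Then calculate a value and assign it to another slot
  if (length(.Object@radius) == 1) {
    .Object@area <- pi * .Object@radius^2
  }
  .Object
})

# initialize runs automatically as part of new()
circle <- new("Circle", radius = 2)
circle.area <- circle@area
```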

rmsharp  Oct 07, 2010 
PDF
Page 135
Third to last paragraph; second paragraph in the Working with Objects section

The function name for slotNames(o) is split across two lines as in slot
Names(o).

Note from the Author or Editor:
Also on printed page 127

Please change this text so that slotNames(o) is not split across two lines.

rmsharp  Oct 07, 2010 
PDF
Page 144
Bottom Section: Use Environments for Lookup Tables

I often have large data sets and would like to use your suggestion of using the environment and an S4 class to implement the interface. However, I do not know how to do either. Please provide examples or a place to find examples.

Note from the Author or Editor:
Sorry, didn't include a lot of information in the book. I wrote this article to explore how to do this:

http://broadcast.oreilly.com/2010/03/lookup-performance-in-r.html
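
A minimal sketch of the environment half of the idea. This is not the code from the article above, and the keys and values are made up: an environment created with hash=TRUE can serve as a lookup table keyed by name.

```r
# Create a hashed environment to act as the lookup table
lookup <- new.env(hash = TRUE)

# Store values under string keys (illustrative keys and values)
keys <- c("MMM", "AA", "AXP")
for (i in seq_along(keys)) {
  assign(keys[i], i, envir = lookup)
}

# Retrieve one value, test for a missing key, and fetch several at once
aa.value <- get("AA", envir = lookup)
has.gm <- exists("GM", envir = lookup, inherits = FALSE)
several <- mget(c("MMM", "AXP"), envir = lookup)
```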

rmsharp  Oct 08, 2010 
Printed
Page 151
before last paragraph

The load command only works for me if I include the .RData extension in the filename, i.e.

load("~/top.5.salaries.RData")

Leaving off the extension I get this error

Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection
In addition: Warning message:
In readChar(con, 5L, useBytes = TRUE) :
cannot open compressed file 'C:/Users/xxx/Documents/My R Files/R_in_a_Nutshell/top.5.salaries', probable reason 'No such file or directory'

Note from the Author or Editor:
There are two errors on this page, actually

Please change

> save(top.5.salaries,
+ file="C:/Documents and Settings/me/My Documents/top.5.salaries.rda")

to read

> save(top.5.salaries,
+ file="C:/Documents and Settings/me/My Documents/top.5.salaries.RData")

Also please change

> load("~/top.5.salaries")

to read

> load("~/top.5.salaries.RData")
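
The round trip can be sketched with a temporary directory instead of a home directory. The data values below are placeholders, not the book's data: load must be given the same file name, extension included, that was passed to save.

```r
# Placeholder data (made-up values, for illustration only)
top.5.salaries <- data.frame(
  name.last = c("Smith", "Jones"),
  salary = c(1000000, 2000000)
)

# Save with an explicit .RData extension
rdata.path <- file.path(tempdir(), "top.5.salaries.RData")
save(top.5.salaries, file = rdata.path)

# load() needs the full file name; it returns the names of the restored objects
restored <- new.env()
loaded.names <- load(rdata.path, envir = restored)

# The same path without the extension does not exist
exists.without.ext <- file.exists(file.path(tempdir(), "top.5.salaries"))
```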

Anonymous  Sep 22, 2010 
PDF
Page 153
Content of top.5.salaries.csv

The header of the example csv file is missing a comma:
name.last,name.first,team position,salary

It should be:
name.last,name.first,team,position,salary

Note from the Author or Editor:
Good catch. There should be a comma where indicated.
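
A sketch of the effect of the missing comma, using a made-up data row: with the corrected header read.csv sees five named columns, while the four-field header makes it treat the first column as row names.

```r
# Corrected header with five fields (the data row is a placeholder)
good.csv <- "name.last,name.first,team,position,salary
Doe,Jane,XYZ,Pitcher,1000000"
good <- read.csv(text = good.csv)

# Header as printed, one field short of the data rows
bad.csv <- "name.last,name.first,team position,salary
Doe,Jane,XYZ,Pitcher,1000000"
bad <- read.csv(text = bad.csv)

good.names <- names(good)
bad.ncol <- ncol(bad)           # one column fewer
bad.rownames <- rownames(bad)   # first field became the row name
```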

Anonymous  Sep 22, 2011 
PDF
Page 155
code snippet, mid-page

the paste function call appears to be missing the sep="" argument

Note from the Author or Editor:
Good catch. (I'm not sure if we deleted the sep="" during editing, or if the defaults for paste changed.) Here's the correct code:

> sp500 <- read.csv(paste("http://ichart.finance.yahoo.com/table.csv?",
+ "s=%5EGSPC&a=03&b=1&c=1999&d=03&e=1&f=2009&g=m&ignore=.csv",sep=""))
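
Why sep="" matters here, shown with a placeholder URL rather than the real one: paste joins its arguments with a single space by default, which would corrupt the address.

```r
base.url <- "http://example.com/table.csv?"   # placeholder address
query <- "s=ABC&g=m"                          # placeholder query string

# Default separator is a space, which breaks the URL
with.default <- paste(base.url, query)

# An empty separator (or paste0) joins the pieces directly
with.empty.sep <- paste(base.url, query, sep = "")
with.paste0 <- paste0(base.url, query)
```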

Anonymous  Mar 06, 2010  Jan 01, 2011
PDF
Page 166
Bullet point 6

Path for the bb.db is wrong:
/Library/Frameworks/R.framework/Resources/library/nutshell/bb.db

Should be:
/Library/Frameworks/R.framework/Resources/library/nutshell/extdata/bb.db

Note from the Author or Editor:
This is correct.

I had to move the path to comply with updated CRAN rules for packages.

Anonymous  Sep 22, 2011 
PDF
Page 183
dow30.tickers snippet

I had to rearrange the syntax a little (the "to" dates before the "from" dates in the get.quotes syntax on page 182). However, my real comment is that as of March 2010, "GM" is not a valid symbol string. I substituted "MTLQQ.PK"

So:
# define parts of the URL
(base <- "http://ichart.finance.yahoo.com/table.csv?")
(symbol <- paste("s=", ticker[1], sep=""))
# months are numbered from 00 to 11, so format the month correctly
(from.month <- paste("&g=d&a=",formatC(as.integer(format(from,"%m"))-1,width=2,flag="0"),sep=""))
(from.day <- paste("&b=", format(from,"%d"), sep=""))
(from.year <- paste("&c=", format(from,"%Y"), sep=""))
(to.month <- paste("&d=",formatC(as.integer(format(to,"%m"))-1,width=2,flag="0"),sep=""))
(to.day <- paste("&e=", format(to,"%d"), sep=""))
(to.year <- paste("&f=", format(to,"%Y"), sep=""))
(inter <- paste("&g=", interval, sep=""))
(last <- "&ignore=.csv")
# put together the URL
(url <- paste(base, symbol, to.month, to.day, to.year, from.month, from.day, from.year, last, sep=""))
# get the file
tmp <- read.csv(url);

and

dow.tickers <- c("MMM", "AA", "AXP", "T", "BAC", "BA", "CAT", "CVX", "C",
"KO", "DD", "XOM", "GE", "MTLQQ.PK", "HPQ", "HD", "INTC", "IBM",
"JNJ", "JPM", "KFT", "MCD", "MRK", "MSFT", "PFE", "PG",
"UTX", "VZ", "WMT", "DIS")

Really, this ends up being a commentary on the effect of "sea changes" even on fairly stable data sources and datasets.

Note from the Author or Editor:
I'm not sure that the order matters for the "from" and "to" parts, but it is true that the stocks in the DJIA did change. (See http://www.djaverages.com/?view=industrial&page=overview for an authoritative list). The current list of tickers is:

dow.tickers <- c("MMM", "AA", "AXP", "T", "BAC", "BA", "CAT", "CVX",
"CSCO", "KO", "DD", "XOM", "GE", "HPQ", "HD", "INTC", "IBM",
"JNJ", "JPM", "KFT", "MCD", "MRK", "MSFT", "PFE", "PG", "TRV",
"UTX", "VZ", "WMT", "DIS")

Anonymous  Mar 06, 2010  Jan 01, 2011
PDF
Page 192
Definition of function get.quotes

# months are numbered from 00 to 11, so format the month correctly
from.month <- paste("&a=",
formatC(as.integer(format(from,"%m"))-1,width=2,flag="0"),
sep="");
from.day <- paste("&b=", format(from,"%d"), sep="");
from.year <- paste("&c=", format(from,"%Y"), sep="");

---Here the argument should be "&d=", not "&a=".
to.month <- paste("&a=",
formatC(as.integer(format(to,"%m"))-1,width=2,flag="0"),
sep="");
---
to.day <- paste("&e=", format(to,"%d"), sep="");
to.year <- paste("&f=", format(to,"%Y"), sep="");
inter <- paste("&g=", interval, sep="");
last <- "&ignore=.csv";

Note from the Author or Editor:
Good catch.

Here is the full (corrected) code for clarity:

get.quotes <- function(ticker, from=(Sys.Date()-365), to=(Sys.Date()), interval="d") {

  # define parts of the URL
  base <- "http://ichart.finance.yahoo.com/table.csv?"
  symbol <- paste("s=", ticker, sep="")

  # months are numbered from 00 to 11, so format the month correctly
  from.month <- paste("&a=",
    formatC(as.integer(format(from,"%m"))-1,width=2,flag="0"),
    sep="")
  from.day <- paste("&b=", format(from,"%d"), sep="")
  from.year <- paste("&c=", format(from,"%Y"), sep="")
  to.month <- paste("&d=",
    formatC(as.integer(format(to,"%m"))-1,width=2,flag="0"),
    sep="")
  to.day <- paste("&e=", format(to,"%d"), sep="")
  to.year <- paste("&f=", format(to,"%Y"), sep="")
  inter <- paste("&g=", interval, sep="")
  last <- "&ignore=.csv"

  # put together the url
  url <- paste(base, symbol, from.month, from.day, from.year,
    to.month, to.day, to.year, inter, last, sep="")

  # get the file
  tmp <- read.csv(url)
  cbind(symbol=ticker,tmp)
}
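
The month handling can be checked without making a network request. The sketch below uses a hypothetical helper, month.param, and example dates chosen for illustration; it shows the zero-based, zero-padded month values and the distinct "&a=" and "&d=" prefixes.

```r
# Hypothetical helper: zero-based month, padded to two digits
month.param <- function(prefix, d) {
  paste(prefix,
        formatC(as.integer(format(d, "%m")) - 1, width = 2, flag = "0"),
        sep = "")
}

from <- as.Date("1999-04-01")   # example dates, chosen for illustration
to <- as.Date("2009-12-01")

from.month <- month.param("&a=", from)   # April becomes 03
to.month <- month.param("&d=", to)       # December becomes 11
```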

Anonymous  Dec 08, 2010 
PDF
Page 196
Last Paragraph

Where it says "...create a matrix with four rows of five elements" it should say "create a matrix with five rows of four elements", since that is what the example uses.

Note from the Author or Editor:
should be page 186 of the book

Yes, the numbers are reversed

SamuelM  Aug 05, 2010 
Printed
Page 201
1st graf

"When you call t on a vector, the vector is treated as a single row of a matrix. So, the value returned by t will be a matrix with a single column."

This is backwards. A vector is a single column:
> v
[1] 1 2 3 4 5 6 7 8 9 10
> dim(as.matrix(v))
[1] 10 1

And after t'ing, becomes a single row:
> t(v)
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 1 2 3 4 5 6 7 8 9 10
> dim(as.matrix(t(v)))
[1] 1 10

Note from the Author or Editor:
Yes, the reporter is correct. The text should read:

"When you call t on a vector, the vector is treated as a single column of a matrix. So the value returned by t will be a matrix with a single row."

Anonymous  Jun 10, 2010  Jan 01, 2011
Printed
Page 202
2nd paragraph

Unless I'm missing something, I think the end of the paragraph should read

The left side of the formula represents the vector to be unstacked (in this case, Close). The right side indicates the groups to create (in this case, symbol).

Note from the Author or Editor:
Correct; please change the sentence "The left side of the formula represents the vector to be unstacked (in this case, symbol). The right side indicates the groups to create (in this case Close)." to "The left side of the formula represents the vector to be unstacked (in this case, Close). The right side indicates the groups to create (in this case, symbol)."
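
A small sketch of the formula's two sides, using a made-up data frame rather than the book's stock data: in unstack, the left side names the vector to be unstacked and the right side names the grouping variable.

```r
# Made-up long-format data: one row per (symbol, Close) pair
quotes <- data.frame(
  symbol = rep(c("AAA", "BBB"), each = 3),
  Close = c(10, 11, 12, 20, 21, 22)
)

# Left side: values to unstack; right side: groups that become columns
wide <- unstack(quotes, form = Close ~ symbol)
wide.names <- names(wide)
```

Each distinct value of symbol becomes a column holding that group's Close values.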

Anonymous  Sep 23, 2010 
Printed
Page 226
1st code example

This code sample seems to have multiple typos, unless I'm missing something. The first argument to transform is batting.w.names.2008, this seems like it should be batting.2008. In the same expression, "bat=" seems like it should be "bats=" as the next command requires bats to be a factor. Finally, AVG doesn't exist--I assume it's H/AB.

Corrected code:

data(batting.2008)
batting.w.names.2008 <- transform(batting.2008, bats=as.factor(bats), throws=as.factor(throws))
batting.w.names.2008$AVG <- batting.w.names.2008$H/batting.w.names.2008$AB
cdplot(bats~G_batting, data=batting.w.names.2008, subset=(batting.w.names.2008$AB>100))

Note from the Author or Editor:
I think this looks correct. I believe that batting.w.names.2008 is defined elsewhere in the book, but I can't find it in 2 minutes (so it's not fair to expect readers to find it). Here's a corrected example:

> batting.w.names.2008 <- transform(batting.2008,
+ AVG=H/AB, bats=as.factor(bats), throws=as.factor(throws))
> cdplot(bats~AVG,data=batting.w.names.2008,
+ subset=(batting.w.names.2008$AB>100))

Anonymous  Mar 19, 2010 
PDF
Page 232
The bottom of the page

Dim of the data is 562 253. Still the book says "select all 253 rows in reverse order" while the author selects columns first, and then rows.

> # load the data:
> library(nutshell)
> data(yosemite)
> # check dimensions of data
> dim(yosemite)
[1] 562 253
> # select all 253 rows in reverse order
> yosemite.flipped <- yosemite[,seq(from=253,to=1)]

Note from the Author or Editor:
Oops, those are columns. Corrected in the latest draft.
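
The row and column distinction can be seen on a tiny matrix, which stands in for the yosemite data in this sketch: an index after the comma selects columns, and an index before it selects rows.

```r
m <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns

# Index after the comma: all columns in reverse order
flipped.cols <- m[, seq(from = ncol(m), to = 1)]

# Index before the comma: all rows in reverse order
flipped.rows <- m[seq(from = nrow(m), to = 1), ]
```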

Anonymous  Jul 18, 2012 
PDF
Page 237
Bottom code example

I believe "acres.harvested" should be replaced with "domestic.catch.2006".

Note from the Author or Editor:
Oops, changed the example while writing, but left in the old code. The example should have read

> pie(domestic.catch.2006, init.angle=100, cex=.6)

Anonymous  Oct 18, 2010 
Printed
Page 257
Bottom--abline examples

In the fourth comment, "y-intercept" is misspelled as "y-inctercept."

Note from the Author or Editor:
Good find

Please change "y-inctercept" to "y-intercept"

Jeffrey Tyzzer  Jan 31, 2011 
Printed
Page 265
Above Figure 15-1

Figure 15-2, Simple scatter plot with conditioning variable is ostensibly generated with the following statement:

> package(lattice)
> xyplot(y~x|z, data=d)

This code should instead read:

> library(lattice)
> xyplot(y~x|z, data=d)

Note from the Author or Editor:
This is correct. Thanks for catching this!

Anonymous  Mar 22, 2010  Jan 01, 2011
Printed
Page 274
Final code block

The block begins with the comment:
# code to create file: remove from final version of book

It goes on to include code to generate Figure 15-7 for the book, rather than just the example.

Note from the Author or Editor:
The report is correct. This text should have been cut from the book. It's not wrong, just distracting and not helpful.

Brendan Hickey  Jun 29, 2010  Jan 01, 2011
Printed
Page 302
First text paragraph

To demonstrate the levelplot function, we are asked to use the latitude and longitude variables. However as far as I can tell, although these variables are described on p291, they have been replaced in the included data set (from the nutshell package) with the single variable neighborhood. Hence we are unable to work through the examples on p302 and 303.

Note from the Author or Editor:
Good catch.

I will fix this by uploading an updated data set to CRAN.

Kendra Vant  Mar 07, 2011 
Printed
Page 325
3rd paragraph

stem(subset(field.goals,PLTYPE=="FG no")$YARDS) should be stem(subset(field.goals,play.type=="FG no")$yards) in order to work with the data set

Note from the Author or Editor:
Good catch; changing to the correct code in the second edition.

patrick  Oct 12, 2011 
Printed
Page 334
2nd paragraph, code

I typed the example code for bootstrapping, and R was unable to find the data set home.sale.prices.june2008.

Could you clarify where this data can be obtained?

Note from the Author or Editor:
The nutshell package was removed from CRAN. I am updating and correcting the package now.

Kristin Mull  Aug 04, 2011 
Printed
Page 336
Bottom of first code block

The last four lines of the code block should surely read

> # what is the probability that the value is within
> # 1.96 standard deviations of the mean?
> pnorm(1.96,lower.tail=TRUE) - pnorm(-1.96,lower.tail=TRUE)
[1] 0.9500042

A sanity check tells us that the probability that a value falls WITHIN 1.96 standard deviations of the mean should be a big number not a small one!

Note from the Author or Editor:
Good catch.

The code above is correct
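
As a cross-check, the same probability can be computed two ways. This is a sketch using only the standard normal distribution: the area within 1.96 standard deviations equals one minus the two tails.

```r
# Area between -1.96 and 1.96 under the standard normal curve
within <- pnorm(1.96, lower.tail = TRUE) - pnorm(-1.96, lower.tail = TRUE)

# Equivalent: one minus the two (equal) tail areas
one.minus.tails <- 1 - 2 * pnorm(-1.96)
```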

Kendra Vant  Mar 10, 2011 
PDF
Page 344
5th paragraph ("Here's an explanation of the output...")

The text reads: "The p-value means that the probability that the mean value from an actual sample was higher than 10.812 (or lower than 7.288) was 0.4684. This means that it is very likely that the true mean time to failure was 10."

To the best of my understanding there is a mix of a few typos and a conceptual mistake here:

1) ".. was higher than 10.812" must read "... was higher than 10.182" instead (i.e. probability that the mean of the sample randomly drawn from normal with mu=9 would be higher than the mean of the sample at hand, which is 10.182).

2) "...(or lower than 7.288)" must read "...(or lower than 7.818)". Indeed, the observed average 10.182 is 1.182 *higher* than the postulated mu=9; since we are running two-sided t-test, we also account for samples with mean lower than 9 by the same amount, i.e. lower than 9-1.182=7.818

3) The last sentence: "... likely that the true mean time to failure was 10" was probably intended to read "... likely that the true mean time to failure was 9" since mu=9 was the null hypothesis we were testing against.

4) But there is actually a conceptual mistake here ( 3) above). We observed a sample with mean 10.182. How can this make mu=9 likely??? In the context of hypothesis testing and t-test in particular, null is *postulated*. The testing is not symmetric, in a sense that if we observe a highly unusual value (e.g. something with mean 28 instead of postulated mu=9), then we can *reject* the null. But if we observe something consistent with the null (as it is the case with the example at hand: p-value ~0.4 tells us that we could easily get the sample with mean 10.182 or even greater from normal with mu=9), this does not make the null (more) likely. We just can not reject it. For all we know, null could be still wrong, but our sample is just too small. Maybe if we got more data we would be still able to reject the null. So the last sentence should read, I believe, something like: "This means that the observed sample does not give us any basis to reject our null assumption that the mean time to failure is 9".

credit: discovered by my student

Note from the Author or Editor:
Here are the changes to make to correct the text:

(a) Change the sentence "The p-value means that the probability that the mean value from an actual sample was higher than 10.812 (or lower than 7.288) was 0.4684" to read "The p-value means that the probability that the mean value from an actual sample was higher than 10.182 (or lower than 7.818) was 0.4684".

(b) Remove the sentence "This means that it is very likely that the true mean time to failure was 10."

(c) In the last paragraph, after the sentence "Next, the t.test function shows the 95% confidence intervals for this test, and, finally, it gives the actual mean." add the line "As a statistician would say, this evidence does not imply that the true mean was not equal to 9."
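
To make the two-sided reading of the p-value concrete, here is a sketch with made-up failure times, not the book's tire data. It recomputes the p-value reported by t.test from the t statistic, counting both tails.

```r
# Made-up times to failure (illustrative values only)
times.to.failure <- c(6.2, 14.5, 9.8, 12.1, 7.4, 11.9, 13.3, 5.9, 10.6, 10.1)
mu0 <- 9   # mean under the null hypothesis

result <- t.test(times.to.failure, mu = mu0)

# Recompute by hand: t statistic, then the probability in both tails
n <- length(times.to.failure)
t.stat <- (mean(times.to.failure) - mu0) / (sd(times.to.failure) / sqrt(n))
p.manual <- 2 * pt(-abs(t.stat), df = n - 1)

# A large p-value means the null is not rejected; it does not prove the null
reject.null <- result$p.value < 0.05
```

With these made-up values the p-value is well above 0.05, so the sample gives no basis for rejecting a mean of 9, which is a weaker statement than saying the mean is 9.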

Andrey Sivachenko  Jun 05, 2011 
Printed
Page 344
second para

Here and through the text, the author equates the "null" hypothesis with the "alternative" hypothesis. These are not equivalents, but opposites.

Note from the Author or Editor:
I checked the text for places where I confused "null hypothesis" and "alternative hypothesis." Here's the list of corrections:

- On page 344, change the sentence "One common question is to ask if the mean of the experimental data is different from what it would be if the hypothesis was not true (which is called the null hypothesis or the alternative hypothesis)." to "One common question is to ask if the mean of the experimental data is close to what the experimenter expected; this is called the null hypothesis. Alternately, the experimenter may calculate the probability that an alternative hypothesis was true."
- On page 362, change the sentence "The alternative (or null) hypothesis is that the two variables are not independent." to "The alternative hypothesis is that the two variables are not independent."

These are serious technical mistakes, but luckily they're isolated to a couple instances.

Anonymous  Feb 13, 2010  Mar 01, 2010
Printed
Page 345
throughout

Text in the first graf, "Suppose we thought... tires of type H should last... 8 hours until failure" doesn't jibe with the code example, wherein "mu=9", or text in the last graf, "true mean is not equal to 9."

Note from the Author or Editor:
Good catch.

The text should be changed from:

"Suppose that we thought, a priori, that tires of type H should last for approximately 8 hours until failure."

to


"Suppose that we thought, a priori, that tires of type H should last for approximately 9 hours until failure."

Anonymous  Jun 29, 2010  Jan 01, 2011
Printed
Page 355
355

On the top of page 355, I think there is a misinterpretation of the p-value from the Shapiro-Wilk normality test.

A p-value < 0.05 indicates that the data is likely not normally distributed. Therefore the sentence at the top should say:

"As you can see from the p-value, it's quite likely that this data is *not* normally distributed."

Note from the Author or Editor:
The test is covered on pages 376-377. This errata is correct; a p value < .05 indicates that the data is not normally distributed. Therefore, the text should read,

"As you can see from the p-value, it is quite likely that this data is not normally distributed."
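
A sketch of the interpretation on simulated data. The sample below is generated for illustration and is unrelated to the book's example: a strongly skewed sample produces a very small Shapiro-Wilk p-value, which is evidence against normality.

```r
set.seed(42)

# Simulated, clearly non-normal (exponential) sample
skewed <- rexp(200)

result <- shapiro.test(skewed)

# The null hypothesis is that the data are normally distributed,
# so a p-value below 0.05 suggests the data are NOT normal
looks.non.normal <- result$p.value < 0.05
```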

Ethan Cerami  Apr 05, 2010  Jan 01, 2011
Printed
Page 356
1st paragraph

For the Kolmogorov-Smirnov test, a significant p-value suggests that the distributions are different, not the same (as stated in the text).

"the two vectors likely came from the same distribution" should be "the two vectors likely did not come from the same distribution".

Note from the Author or Editor:
Please change the text from

"According to the p-value, the two vectors likely came from the same distribution (at a 95% confidence level)."

to

"The p-value of this test was 0.0168, which is much less than .05. So, this test implies that it was not likely that these two samples came from the same distribution."
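
The same reading applies to a two-sample Kolmogorov-Smirnov test. The sketch below uses simulated samples for illustration, not the book's vectors: samples drawn from clearly different distributions yield a small p-value.

```r
set.seed(1)

# Two simulated samples whose means differ by three standard deviations
x <- rnorm(100, mean = 0)
y <- rnorm(100, mean = 3)

result <- ks.test(x, y)

# The null hypothesis is that both samples come from the same distribution,
# so a small p-value suggests that they do not
likely.different <- result$p.value < 0.05
```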

Andrew Walsh  Apr 06, 2011 
Printed
Page 357
heading

Use of the term "distribution-free tests" is archaic. These are non-parametric tests, which term is used once later in the text. You will just confuse readers.

Note from the Author or Editor:
I think this error might be valid. Peter Dalgaard uses the term "distribution-free tests" in the book "Introductory Statistics with R." The term "non-parametric" may be preferable. The term "distribution-free" is used in:

- a section header on page 357
- in the sentence "A good alternative to the tests described in "Normal Distribution-Based Tests" on page 344 are distribution-free tests."
- in the sentence "The Wilcoxon test is the distribution-free equivalent to the t-test:"
- in the line "The Kruskal-Wallis rank sum test is a distribution-free equivalent to ANOVA analysis:"
- In the header "Distribution-Free Tabular Data Tests" on page 368
- In the line "The Friedman rank sum test is a distribution-free relative of the chi-squared test. In R, this is implemented through the friedman.test function:" on page 368

In each case, I believe that we should replace "distribution-free" with "non-parametric"

Anonymous  Feb 13, 2010  Mar 01, 2010
Printed
Page 364
at the bottom of the page, section 'Tabular Data Tests'

Instead of 5325 there should be 5326 in the numerator.

Note from the Author or Editor:
This is correct; the code should read:

> 5326 / (5326 + 6067)
[1] 0.46748

Rok Krsmanc  Aug 03, 2011 
PDF
Page 368
first paragraph

The statement

"The Friedman rank sum test is a distribution-free relative of the chi-squared test."

is not quite correct. It should say

"The Friedman rank sum test is a non-parametric counterpart to two way ANOVA tests."

Joseph Adler  May 07, 2010  Jan 01, 2011
Printed
Page 378-379
throughout

Apparent editing or cut/paste problem -- the 3 grafs on page 377 beginning "To get the list of coefficients for the model..." is repeated verbatim on pages 378-379.

Note from the Author or Editor:
This is correct. The text on the bottom of page 378 (printed book) starting with "To get the list of coefficients for a model object..." and ending with "Alternatively, you can use the alias coefficients to access the coef function" on the top of page 379 should all be removed.

Anonymous  Jun 29, 2010  Jan 01, 2011
Printed
Page 381
middle/ bottom

List of plot outputs in bullet-point format on bottom 1/3 of page isn't in the same order as "caption=..." list in the code sample above.

Note from the Author or Editor:
The author is correct. The bullet list would be more clear if it matched the order shown in the function definition.

Anonymous  Jun 29, 2010  Jan 01, 2011
Printed
Page 385
in the middle of the page, section 'Assumptions of Least Squares Regression'

Function name 'ncv.test' is deprecated. Use 'ncvTest' instead.

Note from the Author or Editor:
Yes, it appears that the library was changed. This was correct at the time of writing.

Rok Krsmanc  Aug 03, 2011 
PDF
Page 385
3rd and 4th paragraphs

All of these

> hpi.lm <- lm(Index~Year,data=shiller.index)
> hpi.rlm <- rlm(Index~Year,data=shiller.index)
> hpi.lqs <- lqs(Index~Year,data=shiller.index)

Should be

> hpi.lm <- lm(Real.Home.Price.Index~Year,data=schiller.index)
> hpi.rlm <- rlm(Real.Home.Price.Index~Year,data=schiller.index)
> hpi.lqs <- lqs(Real.Home.Price.Index~Year,data=schiller.index)

And then

> plot(hpi,pch=19,cex=0.3)

Should be

> plot(schiller.index,pch=19,cex=0.3)

Note from the Author or Editor:
Good catch; I cleaned up the sample data while writing the book, but didn't fix all the examples. Thanks for sending this in!

Maksim Djackov  Aug 16, 2011 
PDF
Page 387, 402-404
middle, bottom

We misspelled Robert Shiller's name. (In the book, it is spelled as "Schiller," but should be "Shiller.")

Errors are also found on page 402, 403, and 404.

Joseph Adler  Mar 22, 2010  Jan 01, 2011
PDF
Page 590
5th entry from top

The definition for the "timestamp" function is incorrect; it should be "Writes a timestamp (or other message) into the history and echos it to the console."

Joseph Adler  Mar 13, 2010  Jan 01, 2011