Errata for Natural Language Processing with Python

The errata list is a list of errors and their corrections that were found after the product was released. If the error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date Corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.

Version Location Description Submitted By Date submitted Date corrected
Printed
Page 1
.

The Natural Language Toolkit has been updated for Python 3.0, and the online version of the book has been updated. Please install NLTK 3 and consult the updated book (http://www.nltk.org/book) and user discussion forum (https://groups.google.com/group/nltk-users/) before reporting errata.

Anonymous  Oct 15, 2014 
Printed
Page 9
16 lines down


lexical diversity() s/b lexical_diversity() -- with underscore instead of space

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 18
Axis label of plot

The source "fdist1.plot(50,cumulative=True)" gives a Y axis in counts with the label "Cumulative Counts", rather than a Y axis in percentages with the label "Cumulative Percentage". This was observed using Python 2.6.2 and the nltk and plotting packages downloaded on 9/13/09.

Note from the Author or Editor:
This has been addressed in second printing.

Bob Doherty  Sep 14, 2009  Jan 01, 2010
Printed
Page 18
Figure 1-4


Fig 1-4 or the code that creates it needs to be fixed (currently NLTK does counts, not percentages)

Note from the Author or Editor:
Already resolved on website and in second printing (December)

Anonymous  Dec 16, 2009  Jan 01, 2010
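
The two Page 18 reports above turn on the difference between cumulative counts and cumulative percentages. The following is a minimal sketch of that difference only. It does not use NLTK; `collections.Counter` stands in for the frequency distribution, and the token list is a made-up example rather than text from the book.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical sample text, used only for illustration.
tokens = "the cat sat on the mat the end".split()

# Frequencies sorted from most to least common, as a cumulative plot would order them.
counts = Counter(tokens)
ordered = [count for _, count in counts.most_common()]

# Cumulative counts: a running total of raw frequencies.
cumulative_counts = list(accumulate(ordered))

# Cumulative percentages: the same running total scaled by the number of tokens.
total = sum(ordered)
cumulative_percentages = [100.0 * c / total for c in cumulative_counts]

print(cumulative_counts)
print(cumulative_percentages)
```

A counts axis ends at the total number of tokens, while a percentage axis ends at 100, so the two labels are not interchangeable.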
PDF
Page 29
Generating Language Output 2nd paragraph

"ils if the thieves are sold, and elle if the paintings are sold."

This is not exactly wrong, still:
"sold" should be replaced by "found" as "found" is used in the subsequent example and "selling" thieves is well...a little odd ;)

Note from the Author or Editor:
Agreed. In chapter 1, the sentence:

if the thieves are sold, ... if the paintings are sold.

Should be changed to:

if the thieves are found, ... if the paintings are found.

Maximilian Scherr  Jan 15, 2011 
Page 46
first code block

The argument tuple passed to nltk.ConditionalFreqDist should be

(target,fileid[:4])

not

(target,file[:4])

Note from the Author or Editor:
Please see:
http://code.google.com/p/nltk/issues/detail?id=417

Already fixed in online version

Andrew C Young  Jul 08, 2009 
Printed
Page 46
Figure 2.1


More contrast (supplied image was color)

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed, PDF, Other Digital Version
Page 83
n/a

In the section on processing rss feeds, a line reads:

>>> nltk.word_tokenize(nltk.html_clean(content))

However, html_clean should be clean_html

Steven Bird  Nov 17, 2010 
Printed
Page 88
"Your turn" code example, bottom of page

"for line in b: print b" should have been "for line in b: print line".

Note from the Author or Editor:
Please see:
http://code.google.com/p/nltk/issues/detail?id=418

Fixed in online version. Will be fixed in next printing.

Robin Munn  Jul 08, 2009  Jan 01, 2010
PDF
Page 92
Table 3-2

s.titlecase() A titlecased version of the string s

=>

s.title() A titlecased version of the string s

Note from the Author or Editor:
Please see:
http://code.google.com/p/nltk/issues/detail?id=419

Fixed in online version. Will be corrected in next printing.

Anonymous  Jun 24, 2009  Jan 01, 2010
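
As a quick check of the Table 3-2 correction above, the sketch below confirms that Python strings provide `title()` and have no `titlecase()` method. It is written for Python 3, and the sample string is an arbitrary illustration.

```python
# Arbitrary sample string, used only for illustration.
s = "natural language processing"

# title() capitalizes the first letter of each word.
titled = s.title()

# str has no titlecase() method, so calling s.titlecase() raises AttributeError.
has_titlecase = hasattr(s, "titlecase")

print(titled)
print(has_titlecase)
```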
Printed
Page 113
.

the pronunciation of the Chinese character 国 ("country") is given as *guo3*; it should actually be *guo2*

Note from the Author or Editor:
The proposed correction will be incorporated in the next issue of the book.

Anonymous  May 15, 2013 
Printed
Page 115
Immediately after Example 3-3

"The final step is to search for the pattern of zeros and ones that maximizes this objective function, shown in Example 3.10. "
Shouldn't it be "The final step is to search for the pattern of zeros and ones that MINIMIZES this objective function..."? At least that's what the anneal function does...

Note from the Author or Editor:
Addressed in second printing.

Uzi Halaby-Senerman  Sep 02, 2009  Jan 01, 2010
Printed
Page 132
9 lines up


"makes detection is easier" s/b "makes detection easier"

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 144
16 lines up


"an empty dictionary" s/b "an empty list"

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed, PDF
Page 152
line 17

[len(w) for w in nltk.corpus.brown.sents(categories='news'))]

should be

[len(w) for w in nltk.corpus.brown.sents(categories='news')]

Note from the Author or Editor:
Agreed.

Jun Utsumi  Apr 24, 2011 
Printed
Page 153
3 lines down


Add quotes around "in-place dictionary"
>> add following sentence: (Dictionaries will be presented in Section 5.3.)

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 153 & 154
Bottom and top of 154


Code block spanning page break: variable "trace" should be renamed to "verbose" x4

Anonymous  Dec 16, 2009  Jan 01, 2010
PDF
Page 163
Example 4-6

There is no need to use nltk.defaultdict in this example.
The following code works fine.

#trie = nltk.defaultdict(dict)
trie = {}
insert(trie, 'chat', 'cat')
insert(trie, 'chien', 'dog')
insert(trie, 'chair', 'flesh')
insert(trie, 'chic', 'stylish')

#trie = dict(trie) # for nicer printing
trie['c']['h']['a']['t']['value']
pprint.pprint(trie)

Note from the Author or Editor:
I agree. The line:

trie = nltk.defaultdict(dict)

should be changed to:

trie = {}

mg6t  Sep 01, 2011 
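
The Example 4-6 report above can be tried out with the sketch below, written for Python 3. The `insert` helper here is a reconstruction that assumes each missing level of the trie is created explicitly inside the function. Under that assumption a plain dict is enough, since nothing relies on default values.

```python
import pprint

def insert(trie, key, value):
    """Store value under key, using one nested dict per letter of key."""
    if key:
        first, rest = key[0], key[1:]
        # Create the next level explicitly, so no defaultdict is required.
        if first not in trie:
            trie[first] = {}
        insert(trie[first], rest, value)
    else:
        # End of the key: record the value at this node.
        trie['value'] = value

trie = {}
insert(trie, 'chat', 'cat')
insert(trie, 'chien', 'dog')
insert(trie, 'chair', 'flesh')
insert(trie, 'chic', 'stylish')

print(trie['c']['h']['a']['t']['value'])
pprint.pprint(trie)
```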
PDF
Page 165
5th line

>>> statement = "random.randint(0, %d) in vocab" % vocab_size * 2

This line should be corrected as follows:

>>> statement = "random.randint(0, %d) in vocab" % (vocab_size * 2)

Note from the Author or Editor:
Confirmed.

mg6t  Aug 28, 2011 
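
The Page 165 correction above matters because `%` and `*` have the same precedence in Python and group left to right. Without parentheses, the string is formatted first and then repeated. The sketch below shows both forms; the value of `vocab_size` is an assumption chosen for illustration.

```python
# Assumed value, chosen only for illustration.
vocab_size = 100000

# Without parentheses: the string is formatted with vocab_size, then repeated twice.
unparenthesized = "random.randint(0, %d) in vocab" % vocab_size * 2

# With parentheses: the product is computed first, then formatted into the string.
parenthesized = "random.randint(0, %d) in vocab" % (vocab_size * 2)

print(unparenthesized)
print(parenthesized)
```

The unparenthesized form yields two copies of the statement run together, which is not a valid expression to time or evaluate.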
Printed
Page 172
3 lines down


"dendogram" s/b "dendrogram"

Anonymous  Dec 16, 2009  Jan 01, 2010
PDF
Page 175
Exercise No. 19

The nltk.corpus.wordnet object does not have path_distance().
It must be path_similarity().

Note from the Author or Editor:
The exercise should be changed to specify shortest_path_distance() instead of path_distance().

mg6t  Sep 01, 2011 
Printed
Page 177
Example 33


Move to chapter 5 (new exercise 43). Change reference "described in chapter 5" to "described in this chapter"

Anonymous  Dec 16, 2009  Jan 01, 2010
PDF
Page 207
center of the page

The line
>>> print nltk.ConfusionMatrix(gold, test)
causes an error.
It should be
>>> print nltk.ConfusionMatrix(gold_tags, test_tags)

Note from the Author or Editor:
Agreed. The line:

print nltk.ConfusionMatrix(gold, test)

should be:

print nltk.ConfusionMatrix(gold_tags, test_tags)

mg6t  Aug 29, 2011 
PDF
Page 273
Example 2-6

In example 7.4, the last statement in the parse method calls the method nltk.chunk.conlltags2tree. In NLTK 2.0.4 nltk.chunk has no such method. The call should be to nltk.chunk.util.conlltags2tree.

Note from the Author or Editor:
This error has been fixed by changing the imports at the level of the nltk.chunk package.

Peter Haglich  Jan 09, 2013 
PDF
Page 284
Code under 7.6

There is a method call to nltk.sem.show_raw_rtuple, which doesn't exist in NLTK 2.0.4. The call should be nltk.sem.relextract.show_raw_rtuple.

Note from the Author or Editor:
The interface to this module has been updated (see https://raw.github.com/nltk/nltk/master/ChangeLog), and the source files of the book have been revised accordingly.

Peter Haglich  Jan 09, 2013 
Printed
Page 306
17 lines up


"The advantages of shift-reduce" s/b "The advantage of shift-reduce"

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 309
9 lines up


"through entire list" s/b "through the entire list"

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 309
13-14 lines up


"Det at wfst[0][1] and N at wfst[1][2], we can add NP to wfst[0][2]" s/b
"Det at wfst[2][3] and N at wfst[3][4], we can add NP to wfst[2][4]"

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 334
10 lines down


Delete this whole line, viz "NP[NUM=?n] -> N[NUM=?n]", and close up space.

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 336
Figure 9.1


Larger scale (closer in size to example (18) same page), fix broken vbars (reported as too big last time, but now it is too small.)

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 336
Figure 9-1


Fig 9-1 is too big in the latest pdf. Also, the feature labels shouldn't be bold.

Anonymous  Dec 16, 2009  Feb 01, 2010
Printed
Page 339
Diagram (23)


Incorrect diagram; it should be the one found here:

http://nltk.googlecode.com/svn/trunk/doc/book/ch09.html#ex-dag04

Note from the Author or Editor:
Already resolved on website and in second printing (December)

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 340
Example 24


s/b smaller for consistency with the other DAGs (cf p339)

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 342
DAG (27a)


DAG (27a) is incorrect. It should look just like (27c) but *without* the middle arc labeled 'CITY'. (The online version of this chapter is correct, and uses dag04-1.png for this subfigure.)

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 355
Code block


Remove box from code block

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 363
1st paragraph

Within the code following the sentence "This allows us to parse a query into SQL:", there is a line that assigns a value to the 'answer' variable:

answer = trees[0].node['sem']

When I try to enter this line, I get a stack trace followed by this error message:

KeyError: 'sem'

When I enter

answer = trees[0].node['SEM']

I get the prompt back, without any stack trace or error message.

Note from the Author or Editor:

It should say:

answer = trees[0].node['SEM']

Addressed in 2nd printing.

Vance Arocho  Dec 01, 2009  Jan 01, 2010
Printed
Page 363
21 lines down


node['sem'] s/b node['SEM']
NB This is http://www.oreillynet.com/cs/nl/edit/errata/40392

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 373
Approx halfway down the page

"... in this context it could be of some other type, such as <e, e> or <e, <e, t>.": Should be "... in this context it could be of some other type, such as <e, e> or <e, <e, t>>." (unbalanced angle brackets.)

Note from the Author or Editor:
Addressed in second printing.

Bruce C. Baker  Sep 19, 2009  Jan 01, 2010
Printed
Page 373
19 lines down


"such as <e, e> or <e, <e, t>." s/b
"such as <e, e> or <e, <e, t>>."

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 382
Figure (28)


Smaller scale

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 385
Bottom of page

In the following two lines, the string "?subj" should be replaced with
"?np" (two substitutions):

(30) S[SEM=<?vp(?np)>] -> NP[SEM=?subj] VP[SEM=?vp]

(30) tells us that given some sem value ?subj for the subject NP and
some sem value ?vp for the VP

Steven Bird  Oct 06, 2010  Nov 01, 2010
Printed
Page 389
17 lines up


"nltk.Variable('z')" s/b "nltk.sem.Variable('z')"

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 391
6 lines down


Insert space before "yields"

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 392
6 lines down


"nltk.ApplicationExpression(tvp, np)" s/b
"nltk.sem.ApplicationExpression(tvp, np)"

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 393
8 lines up


semrel s/b semrep

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 393
5 lines up


exists z3.(ankle(z3) & bite(cyril,z3))
s/b
all z4.(boy(z4) -> see(cyril,z4))

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 395
8 lines up from bottom


"core", "store" s/b uc in the SEM value of VP

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 395
5-9 lines up

The string "?subj" should be replaced with "?np" (three substitutions):

S[SEM=[core=<?vp(?subj)>, store=(?b1+?b2)]] ->
NP[SEM=[core=?subj, store=?b1]] VP[SEM=[core=?vp, store=?b2]]

The core value at the S node is the result of applying the VP's core value, namely \x.smile(x), to the subject NP's value. The latter will not be @x, but rather an instantiation of @x, say z3. After β-reduction, <?vp(?subj)> will be unified with <smile(z3)>.

Steven Bird  Oct 06, 2010  Nov 01, 2010
Printed
Page 396
20 lines up


"trees[0].node['sem']" s/b "trees[0].node['SEM']"

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 399
4 lines down


Det[NUM=sg,SEM=<\P Q.([x],[]) + P(x) + Q(x)>] -> 'a'
s/b
Det[NUM=sg,SEM=<\P Q.(([x],[]) + P(x) + Q(x))>] -> 'a'

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 400
20 lines down


"trees[0].node['sem'].simplify()" s/b
"trees[0].node['SEM'].simplify()"

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 405-406
Examples 5-7


Please replace all seven occurrences of
"nltk.ApplicationExpression" with "nltk.sem.ApplicationExpression".

Anonymous  Dec 16, 2009  Dec 01, 2009
Printed
Page 426
top

Currently reads:

</sense>
<gloss> ... </gloss>
<synset> ... </synset>
</sense>
...

Should be:

</sense>
<sense>
<gloss> ... </gloss>
<synset> ... </synset>
</sense>
...

Bruce C. Baker  Sep 27, 2009  Jan 01, 2010
Printed
Page 429
11-12 lines down


Sentence beginning with "Ignoring...", please replace with the following (and set "OTH" in cw):

Ignoring the entries for exchanges between people other than the top 5 (labeled OTH), the largest value suggests that Portia and Bassanio have the most significant interactions.

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 444
7 lines down


can never been known s/b can never be known

Anonymous  Dec 16, 2009  Jan 01, 2010
Printed
Page 467
Toward bottom of left hand column


"deve-test" -> "dev-test"

Anonymous  Dec 16, 2009  Jan 01, 2010