Errata for Natural Language Processing with Python
The errata list records errors, and their corrections, that were found after the product was released. If an error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date Corrected".
The following errata were submitted by our customers and approved as valid errors by the author or editor.
| Version |
Location |
Description |
Submitted By |
Date Submitted |
Date Corrected |
| Printed |
Page 9
16 lines down |
lexical diversity() s/b lexical_diversity() -- with underscore instead of space
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 18
Axis label of plot |
The source "fdist1.plot(50,cumulative=True)" gives a Y axis in counts and the label is "Cumulative Counts" rather than a Y axis in percentage and a label of "Cumulative Percentage". This was observed using Python 2.6.2 and the nltk and plotting packages downloaded on 9/13/09.
Note from the Author or Editor: This has been addressed in second printing.
|
Bob Doherty |
Sep 14, 2009 |
Jan 01, 2010 |
| Printed |
Page 18
Figure 1-4 |
Fig 1-4 or the code that creates it needs to be fixed (currently NLTK does counts, not percentages)
Note from the Author or Editor: Already resolved on website and in second printing (December)
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
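The two entries above describe the same mismatch: the plot reports cumulative counts, while the figure is labeled as a cumulative percentage. The sketch below illustrates the difference using only the standard library. The `Counter` stands in for NLTK's frequency distribution, and the token list is a made-up assumption, not data from the book.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical toy token list; the book works with a much larger text.
tokens = ["the", "whale", "the", "sea", "the", "whale", "ship", "the"]

counts = Counter(tokens)
# Most frequent samples first, as in a frequency-distribution plot.
ranked = counts.most_common()

# Running total of raw counts (what the reported plot shows).
cumulative_counts = list(accumulate(c for _, c in ranked))

# The same running total expressed as a share of all tokens
# (what the original axis label promised).
total = sum(counts.values())
cumulative_percent = [100.0 * c / total for c in cumulative_counts]

print(cumulative_counts)
print(cumulative_percent)
```

Dividing each running total by the overall token count is all that separates the two views, so either the code or the label can be changed to make them agree.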
| PDF |
Page 29
Generating Language Output 2nd paragraph |
"ils if the thieves are sold, and elle if the paintings are sold."
This is not exactly wrong, still:
"sold" should be replaced by "found" as "found" is used in the subsequent example and "selling" thieves is well...a little odd ;)
Note from the Author or Editor: Agreed. In chapter 1, the sentence:
if the thieves are sold, ... if the paintings are sold.
Should be changed to:
if the thieves are found, ... if the paintings are found.
|
Maximilian Scherr |
Jan 15, 2011 |
|
| Safari Books Online |
46
first code block |
Argument tuple to nltk.ConditionalFreqDist should be (target,fileid[:4]), not (target,file[:4]).
Note from the Author or Editor: Please see:
http://code.google.com/p/nltk/issues/detail?id=417
Already fixed in online version
|
Andrew C Young |
Jul 08, 2009 |
|
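As a rough illustration of why the tuple must use `fileid`, the sketch below mimics a conditional frequency distribution with the standard library only. The `corpus` dictionary and its file names are hypothetical stand-ins for a corpus reader, and `defaultdict(Counter)` is a substitute for `nltk.ConditionalFreqDist`, not its implementation.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for a corpus: file identifiers mapped to word lists.
corpus = {
    "1789-Washington.txt": ["Citizens", "of", "America"],
    "1793-Washington.txt": ["fellow", "citizens"],
    "2009-Obama.txt": ["America", "American", "citizen"],
}

# condition -> Counter of samples, a simplified conditional frequency table.
cfd = defaultdict(Counter)
for fileid, words in corpus.items():
    for w in words:
        for target in ["america", "citizen"]:
            if w.lower().startswith(target):
                # The loop variable is `fileid`; nothing named `file`
                # is bound by this loop, so `file[:4]` cannot work here.
                cfd[target][fileid[:4]] += 1

print(dict(cfd["america"]))
print(dict(cfd["citizen"]))
```

Slicing the first four characters of the file identifier yields the year, which becomes the sample counted under each target word.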
| Printed |
Page 46
Figure 2.1 |
More contrast (supplied image was color)
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed, PDF, Safari Books Online, Other Digital Version |
Page 83
n/a |
In the section on processing rss feeds, a line reads:
>>> nltk.word_tokenize(nltk.html_clean(content))
However, html_clean should be clean_html
|
 Steven Bird
|
Nov 17, 2010 |
|
| Printed |
Page 88
"Your turn" code example, bottom of page |
"for line in b: print b" should have been "for line in b: print line".
Note from the Author or Editor: Please see:
http://code.google.com/p/nltk/issues/detail?id=418
Fixed in online version. Will be fixed in next printing.
|
Robin Munn |
Jul 08, 2009 |
Jan 01, 2010 |
| PDF |
Page 92
Table 3-2 |
s.titlecase() A titlecased version of the string s
=>
s.title() A titlecased version of the string s
Note from the Author or Editor: Please see:
http://code.google.com/p/nltk/issues/detail?id=419
Fixed in online version. Will be corrected in next printing.
|
Anonymous |
Jun 24, 2009 |
Jan 01, 2010 |
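A quick check of the corrected method name, using a sample string chosen here for illustration: Python strings provide `title()`, and no `titlecase()` method exists on `str`.

```python
s = "natural language processing"

# str.title() capitalizes the first letter of each word.
titled = s.title()

# There is no str.titlecase(); calling it would raise AttributeError.
has_titlecase = hasattr(s, "titlecase")

print(titled)
print(has_titlecase)
```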
| Printed |
Page 115
Immediately after Example 3-3 |
"The final step is to search for the pattern of zeros and ones that maximizes this objective function, shown in Example 3.10. "
Shouldn't it be "The final step is to search for the pattern of zeros and ones that MINIMIZES this objective function..."? At least that's what the anneal function does...
Note from the Author or Editor: Addressed in second printing.
|
Uzi Halaby-Senerman |
Sep 02, 2009 |
Jan 01, 2010 |
| Printed |
Page 132
9 lines up |
"makes detection is easier" s/b "makes detection easier"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 144
16 lines up |
"an empty dictionary" s/b "an empty list"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed, PDF |
Page 152
line 17 |
[len(w) for w in nltk.corpus.brown.sents(categories='news'))]
should be
[len(w) for w in nltk.corpus.brown.sents(categories='news')]
Note from the Author or Editor: Agreed.
|
Jun Utsumi |
Apr 24, 2011 |
|
| Printed |
Page 153
3 lines down |
Add quotes around "in-place dictionary"
>> add following sentence: (Dictionaries will be presented in Section 5.3.)
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 153 & 154
Bottom and top of 154 |
Code block spanning page break: variable "trace" should be renamed to "verbose" x4
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| PDF |
Page 163
Example 4-6 |
There is no need to use nltk.defaultdict in this example.
The following code works fine.
#trie = nltk.defaultdict(dict)
trie = {}
insert(trie, 'chat', 'cat')
insert(trie, 'chien', 'dog')
insert(trie, 'chair', 'flesh')
insert(trie, 'chic', 'stylish')
#trie = dict(trie) # for nicer printing
trie['c']['h']['a']['t']['value']
pprint.pprint(trie)
Note from the Author or Editor: I agree. The line:
trie = nltk.defaultdict(dict)
should be changed to:
trie = {}
|
mg6t |
Sep 01, 2011 |
|
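The point of this erratum can be checked with a self-contained sketch. The `insert` function below is an assumption about how the example's helper works (it creates each missing child dictionary explicitly), reconstructed here for illustration. Under that assumption, a plain `dict` is sufficient as the root.

```python
import pprint

def insert(trie, key, value):
    """Store value under key, one nested dict per character (assumed helper)."""
    if key:
        first, rest = key[0], key[1:]
        if first not in trie:
            # Child nodes are created explicitly, so no defaultdict is needed.
            trie[first] = {}
        insert(trie[first], rest, value)
    else:
        trie['value'] = value

trie = {}
insert(trie, 'chat', 'cat')
insert(trie, 'chien', 'dog')
insert(trie, 'chair', 'flesh')
insert(trie, 'chic', 'stylish')

print(trie['c']['h']['a']['t']['value'])
pprint.pprint(trie)
```

Because the root is already an ordinary dictionary, the conversion step for nicer printing is no longer needed either.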
| PDF |
Page 165
5th line |
>>> statement = "random.randint(0, %d) in vocab" % vocab_size * 2
This line should be corrected as follows:
>>> statement = "random.randint(0, %d) in vocab" % (vocab_size * 2)
Note from the Author or Editor: Confirmed.
|
mg6t |
Aug 28, 2011 |
|
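The parentheses matter because `%` and `*` have the same precedence in Python and are evaluated left to right. The sketch below shows both forms side by side; the value of `vocab_size` is an arbitrary assumption for illustration.

```python
vocab_size = 100000  # hypothetical value for illustration

# Without parentheses: the string is formatted first, then repeated twice.
wrong = "random.randint(0, %d) in vocab" % vocab_size * 2

# With parentheses: the product is computed first, then formatted once.
right = "random.randint(0, %d) in vocab" % (vocab_size * 2)

print(wrong)
print(right)
```

The uncorrected line therefore produces a doubled statement string rather than a doubled upper bound.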
| Printed |
Page 172
3 lines down |
"dendogram" s/b "dendrogram"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| PDF |
Page 175
Exercise No. 19 |
The nltk.corpus.wordnet object does not have path_distance().
It must be path_similarity().
Note from the Author or Editor: The exercise should be changed to specify shortest_path_distance() instead of path_distance().
|
mg6t |
Sep 01, 2011 |
|
| Printed |
Page 177
Example 33 |
Move to chapter 5 (new exercise 43). Change reference "described in chapter 5" to "described in this chapter"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| PDF |
Page 207
center of the page |
The line
>>> print nltk.ConfusionMatrix(gold, test)
causes an error.
It should be
>>> print nltk.ConfusionMatrix(gold_tags, test_tags)
Note from the Author or Editor: Agreed. The line:
print nltk.ConfusionMatrix(gold, test)
should be:
print nltk.ConfusionMatrix(gold_tags, test_tags)
|
mg6t |
Aug 29, 2011 |
|
| Printed |
Page 306
17 lines up |
"The advantages of shift-reduce" s/b "The advantage of shift-reduce"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 309
9 lines up |
"through entire list" s/b "through the entire list"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 309
13-14 lines up |
"Det at wfst[0][1] and N at wfst[1][2], we can add NP to wfst[0][2]" s/b
"Det at wfst[2][3] and N at wfst[3][4], we can add NP to wfst[2][4]"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
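To make the corrected indices concrete, here is a minimal sketch of a well-formed substring table in which `wfst[i][j]` holds the category spanning tokens `i` up to `j`. The four-word sentence, the lexicon, and the single grammar rule are assumptions chosen so that the determiner and noun fall at positions 2 and 3.

```python
# Hypothetical toy input and grammar, chosen for illustration.
tokens = "I shot an elephant".split()
lexicon = {"I": "NP", "shot": "V", "an": "Det", "elephant": "N"}
index = {("Det", "N"): "NP"}  # right-hand side -> left-hand side

n = len(tokens)
# wfst[i][j] is the category covering tokens[i:j], or None if unknown.
wfst = [[None] * (n + 1) for _ in range(n + 1)]

# Fill the diagonal with the lexical categories.
for i, tok in enumerate(tokens):
    wfst[i][i + 1] = lexicon[tok]

# Combine the two adjacent cells named in the corrected text.
start, mid, end = 2, 3, 4
pair = (wfst[start][mid], wfst[mid][end])
if pair in index:
    wfst[start][end] = index[pair]

print(wfst[2][3], wfst[3][4], wfst[2][4])
```

With this input, cells `[0][1]` and `[1][2]` hold a pronoun and a verb, which is why the original indices could not describe a Det and N combining into an NP.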
| Printed |
Page 334
10 lines down |
Delete this whole line, viz "NP[NUM=?n] -> N[NUM=?n]", and close up space.
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 336
Figure 9.1 |
Larger scale (closer in size to example (18) same page), fix broken vbars (reported as too big last time, but now it is too small.)
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 336
Figure 9-1 |
Fig 9-1 is too big in the latest pdf. Also, the feature labels shouldn't be bold.
|
Anonymous |
Dec 16, 2009 |
Feb 01, 2010 |
| Printed |
Page 339
Diagram (23) |
Incorrect diagram; it should be the one found here:
http://nltk.googlecode.com/svn/trunk/doc/book/ch09.html#ex-dag04
Note from the Author or Editor: Already resolved on website and in second printing (December)
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 340
Example 24 |
s/b smaller for consistency with the other DAGs (cf p339)
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 342
DAG (27a) |
DAG (27a) is incorrect. It should look just like (27c) but *without* the middle arc labeled 'CITY'. (The online version of this chapter is correct, and uses dag04-1.png for this subfigure.)
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 355
Code block |
Remove box from code block
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 363
1st paragraph |
Within the code following the sentence "This allows us to parse a query into SQL:", there is a line that assigns a value to the 'answer' variable:
answer = trees[0].node['sem']
When I try to enter this line, I get a stack trace followed by this error message:
KeyError: 'sem'
When I enter
answer = trees[0].node['SEM']
I get the prompt back, without any stack trace or error message.
Note from the Author or Editor:
It should say:
answer = trees[0].node['SEM']
Addressed in 2nd printing.
|
Vance Arocho |
Dec 01, 2009 |
Jan 01, 2010 |
| Printed |
Page 363
21 lines down |
node['sem'] s/b node['SEM']
NB This is http://www.oreillynet.com/cs/nl/edit/errata/40392
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
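The `KeyError` reported above comes down to case-sensitive key lookup. The sketch below uses a plain dictionary as a stand-in for the tree node's feature structure (an assumption; the real object is richer), with a placeholder value instead of actual parser output.

```python
# Plain dict standing in for the node's features; the value is a placeholder.
node = {"SEM": "placeholder semantic value"}

try:
    node["sem"]          # lowercase key, as printed in the first printing
    lowercase_ok = True
except KeyError:
    lowercase_ok = False

answer = node["SEM"]     # uppercase key, as in the correction

print(lowercase_ok)
print(answer)
```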
| Printed |
Page 373
Approx halfway down the page |
"... in this context it could be of some other type, such as <e, e> or <e, <e, t>.": Should be "... in this context it could be of some other type, such as <e, e> or <e, <e, t>>." (unbalanced angle brackets.)
Note from the Author or Editor: Addressed in second printing.
|
Bruce C. Baker |
Sep 19, 2009 |
Jan 01, 2010 |
| Printed |
Page 373
19 lines down |
"such as <e, e> or <e, <e, t>." s/b
"such as <e, e> or <e, <e, t>>."
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 382
Figure (28) |
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 385
Bottom of page |
In the following two lines, the string "?subj" should be replaced with
"?np" (two substitutions):
(30) S[SEM=<?vp(?np)>] -> NP[SEM=?subj] VP[SEM=?vp]
(30) tells us that given some sem value ?subj for the subject NP and
some sem value ?vp for the VP
|
 Steven Bird
|
Oct 06, 2010 |
Nov 01, 2010 |
| Printed |
Page 389
17 lines up |
"nltk.Variable('z')" s/b "nltk.sem.Variable('z')"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 391
6 lines down |
Insert space before "yields"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 392
6 lines down |
"nltk.ApplicationExpression(tvp, np)" s/b
"nltk.sem.ApplicationExpression(tvp, np)"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 393
8 lines up |
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 393
5 lines up |
exists z3.(ankle(z3) & bite(cyril,z3))
s/b
all z4.(boy(z4) -> see(cyril,z4))
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 395
8 lines up from bottom |
"core", "store" s/b uc in the SEM value of VP
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 395
5-9 lines up |
The string "?subj" should be replaced with "?np" (three substitutions):
S[SEM=[core=<?vp(?subj)>, store=(?b1+?b2)]] ->
NP[SEM=[core=?subj, store=?b1]] VP[SEM=[core=?vp, store=?b2]]
The core value at the S node is the result of applying the VP's core value, namely \x.smile(x), to the subject NP's value. The latter will not be @x, but rather an instantiation of @x, say z3. After β-reduction, <?vp(?subj)> will be unified with <smile(z3)>.
|
 Steven Bird
|
Oct 06, 2010 |
Nov 01, 2010 |
| Printed |
Page 396
20 lines up |
"trees[0].node['sem']" s/b "trees[0].node['SEM']"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 399
4 lines down |
Det[NUM=sg,SEM=<\P Q.([x],[]) + P(x) + Q(x)>] -> 'a'
s/b
Det[NUM=sg,SEM=<\P Q.(([x],[]) + P(x) + Q(x))>] -> 'a'
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 400
20 lines down |
"trees[0].node['sem'].simplify()" s/b
"trees[0].node['SEM'].simplify()"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 405-406
Examples 5-7 |
Please replace all seven occurrences of
"nltk.ApplicationExpression" with "nltk.sem.ApplicationExpression".
|
Anonymous |
Dec 16, 2009 |
Dec 01, 2009 |
| Printed |
Page 426
top |
Currently reads:
</sense>
<gloss> ... </gloss>
<synset> ... </synset>
</sense>
...
Should be:
</sense>
<sense>
<gloss> ... </gloss>
<synset> ... </synset>
</sense>
...
|
Bruce C. Baker |
Sep 27, 2009 |
Jan 01, 2010 |
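The missing opening tag makes the fragment ill-formed, which a standard XML parser will reject. The sketch below checks both versions with `xml.etree.ElementTree`; the enclosing `<entry>` element and the text inside the tags are hypothetical, added only so that each fragment has a single root.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Hypothetical wrapper element and contents, for illustration only.
broken = (
    "<entry><sense><gloss>g1</gloss></sense>"
    "<gloss>g2</gloss><synset>s2</synset></sense></entry>"
)
fixed = (
    "<entry><sense><gloss>g1</gloss></sense>"
    "<sense><gloss>g2</gloss><synset>s2</synset></sense></entry>"
)

print(is_well_formed(broken))
print(is_well_formed(fixed))
```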
| Printed |
Page 429
11-12 lines down |
Sentence beginning with "Ignoring...", please replace with the following (and set "OTH" in cw):
Ignoring the entries for exchanges between people other than the top 5 (labeled OTH), the largest value suggests that Portia and Bassanio have the most significant interactions.
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 444
7 lines down |
can never been known s/b can never be known
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
| Printed |
Page 467
Toward bottom of left hand column |
"deve-test" -> "dev-test"
|
Anonymous |
Dec 16, 2009 |
Jan 01, 2010 |
|