Errata for Hadoop: The Definitive Guide

The errata list records errors, and their corrections, that were found after the product was released. If an error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.

Version Location Description Submitted By Date submitted Date corrected
Printed
Page 10
paragraph 3

"timem" should be "time"

Chuck Newman  Aug 26, 2009 
Printed
Page 11
2nd paragraph

1st line. "we starting" should be "we started"

StephanB  Feb 18, 2010 
Printed
Page 11
line 11

a complete redesign to scale up further.

It is a subtle language problem: it should say scale out (to 200 machines) instead of scale up.

Gianmarco De Francisci Morales  Mar 04, 2010 
Printed
Page 12
"October 2008" bullet

"per a day" should be either "per day" or "a day"

Chuck Newman  Aug 26, 2009 
Printed
Page 36
In Example 2.12

The code shown (Maximum temperature in C++) does not compile as shown on my Ubuntu cluster.

I need to add

#include <stdint.h>

to get rid of an error about uint64_t not being defined in one of the Hadoop files. I run Hadoop 0.19.2.

Dominique Thiebaut  Apr 13, 2010 
Printed
Page 44
between 2nd and 3rd paragraph

The command should be:

% hadoop fsck / -files -blocks

The printed command is missing the slash.

Dominique Thiebaut  Dec 03, 2009 
Page 45
1st paragraph

Misplaced comma:

"However, the state of the secondary namenode lags that of the primary, so in the event of total failure of the primary data, loss is almost guaranteed."

should be

"However, the state of the secondary namenode lags that of the primary, so in the event of total failure of the primary, data loss is almost guaranteed."

Vineeth Chandran P  Dec 01, 2009 
Printed
Page 65
Figure 3-2

The illustration in Figure 3-2 does not match the description given in the sidebox "Network Topology and Hadoop".

1. For d1/r2, the node is mislabelled as n1 (should be n3)
2. For d2, racks are mislabelled as 'r1' and 'rack' (should be 'r3' and 'r4')
3. For d2/r3, the node is mislabelled as n1 (should be n4)

Note from the Author or Editor:
I agree with all the corrections except the one for 'rack', which is there to indicate that this entity is a rack, so it doesn't need relabelling as 'r4' (since r4 is never referred to in the text).

KS Chow  Sep 01, 2009 
Printed
Page 140
3rd paragraph

Paragraph 3, line 5, word 1:
The referenced 'task_200811201130_0054_m_000000' does not exist; it should refer to 'task200904110811_0003_m_000044'.

Note from the Author or Editor:
It should refer to 'task_200904110811_0003_m_000044'.
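
The underscore after "task" in the author's correction matters because the ID is a structured string. The following is a minimal sketch, assuming the usual layout task_<jobtracker start time>_<job number>_<m or r>_<task number>; the pattern and the helper name parse_task_id are illustrative and not taken from the book.

```python
import re

# Assumed layout of a classic MapReduce task ID (not quoted from the book):
# task_<jobtracker start timestamp>_<job number>_<m|r>_<task number>
TASK_ID = re.compile(r"^task_(\d{12})_(\d{4,})_([mr])_(\d{6})$")


def parse_task_id(task_id):
    """Return (jobtracker_id, job_number, task_type, task_number), or None."""
    match = TASK_ID.match(task_id)
    if match is None:
        return None
    jobtracker_id, job, task_type, number = match.groups()
    return jobtracker_id, int(job), task_type, int(number)


# The author's corrected ID parses as map task 44 of job 3.
print(parse_task_id("task_200904110811_0003_m_000044"))
# The ID as typed in the submission lacks the underscore and is rejected.
print(parse_task_id("task200904110811_0003_m_000044"))
```

Under these assumptions the first call yields the job and task numbers, while the second returns None.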

kchow8  Sep 10, 2009 
Printed
Page 146
Table 5-3, row 1, col 2, 1st sentence

"How long are you mappers running for?"
Should be:
"How long are your mappers running for?"

KS Chow  Sep 18, 2009 
Printed
Page 181
6th paragraph

At the end of the first sentence in the sidebar: "new users to Hadoop.Almost" there is a missing space after the period.

Tom White  Sep 11, 2009 
Printed
Page 202
1st paragraph under heading "Text Output"

Heading "Text Output", paragraph 1, line 5, word 3 -> TextOuputFormat

Note from the Author or Editor:
"TextOuputFormat" should read "TextOutputFormat" (it is missing a "t")

kchow8  Sep 27, 2009 
Printed
Page 246
"Storage" heading

"41 TB SATA disks" should read "4 x 1TB SATA disks"

Tom White  Sep 11, 2009 
Printed
Page 255
2nd paragraph after Table 9-2

Excerpt: "...wanted to run 2 processes on each processor, then you should set mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.map.tasks.maximum to both be 7..."

The second reference to mapred.tasktracker.map.tasks.maximum is incorrect; it should be mapred.tasktracker.reduce.tasks.maximum.
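
One way to see why both properties end up at 7 is to work through the arithmetic. The sketch below assumes a machine with 8 processors and one slot each set aside for the datanode and tasktracker daemons; neither figure appears in the excerpt above, and the function name slots_per_task_type is hypothetical.

```python
def slots_per_task_type(processors, processes_per_processor, reserved_daemons=2):
    """Split the available process slots evenly between map and reduce tasks."""
    total = processors * processes_per_processor
    # Assumption: the datanode and the tasktracker each occupy one slot.
    available = total - reserved_daemons
    return available // 2


# Assumed hardware: 8 processors, 2 processes per processor.
slots = slots_per_task_type(processors=8, processes_per_processor=2)

settings = {
    "mapred.tasktracker.map.tasks.maximum": slots,
    "mapred.tasktracker.reduce.tasks.maximum": slots,
}
for name, value in settings.items():
    print(f"{name} = {value}")
```

With these inputs, 16 process slots minus 2 for the daemons leaves 14, or 7 for each of the two distinct properties.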

kchow8  Nov 17, 2009 
Printed
Page 268
middle of page

In the sentence "and each map generates(approximately) 10GB of random binary data," the "10GB" should be "1GB".

Anonymous  Nov 30, 2011 
Printed
Page 269
last but one para

"that make is easy" should be "that makes it easy"

Sandeep Deshmukh  Oct 06, 2010 
Printed
Page 298
4th paragraph

The directory hierarchy is misaligned.

The line consisting of "/previous/VERSION" should have its first "/" character vertically aligned with the first "/" of "${dfs.name.dir}/current/VERSION".

The first "/" character of each of the following lines ("/edits", "/fsimage", "/fstime") should align with the second "/" of "/previous/VERSION".
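
The two alignment rules can be checked mechanically. This is a rough sketch that lays out only the lines named in this erratum (any lines the book prints between them are left out), and the helper name align_hierarchy is hypothetical.

```python
def align_hierarchy(root_line, sibling_line, children):
    """Indent sibling_line and children according to the two alignment rules."""
    # Rule 1: the sibling's first "/" sits under the first "/" of the root line.
    first_slash = root_line.index("/")
    aligned_sibling = " " * first_slash + sibling_line
    # Rule 2: each child's first "/" sits under the sibling's second "/".
    second_slash = first_slash + sibling_line.index("/", 1)
    lines = [root_line, aligned_sibling]
    lines += [" " * second_slash + child for child in children]
    return lines


layout = align_hierarchy(
    "${dfs.name.dir}/current/VERSION",
    "/previous/VERSION",
    ["/edits", "/fsimage", "/fstime"],
)
print("\n".join(layout))
```

Printed in a fixed-width font, the output shows "/previous" starting in the same column as "/current", with "/edits", "/fsimage" and "/fstime" starting under the slash that precedes "VERSION" on the "/previous" line.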

Tom White  Sep 11, 2009 
Printed
Page 350
footnote

hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWriter 1
should be
hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
(no 'r' at the end of sequentialWrite)

Douglas Wind  Mar 11, 2010 
Printed
Page 412
Table 14-9

Table 14-9, 3rd row (223), 1st col (#listeners): 2
Should be 1.

kchow8  Dec 01, 2009 
Printed
Page 454
Bullet 2 and 3 for Example 14-2

The descriptions for bullet items 2 and 3 don't seem to match the code listing in Example 14-2. Bullet item 2 seems to refer to label 3, and bullet item 3 seems to refer to label 2.

kchow8  Dec 08, 2009 
Printed
Page 503
1st paragraph

Sentence ending "Leeds, UK" is missing a period.

Tom White  Sep 11, 2009