Errata
The errata list is a list of errors and their corrections that were found after the product was released. If an error was corrected in a later version or reprint, the date of the correction is shown in the "Date corrected" column.
The following errata were submitted by our customers and approved as valid errors by the author or editor.
Color key: Serious technical mistake, Minor technical mistake, Language or formatting error, Typo, Question, Note, Update
| Version | Location | Description | Submitted By | Date submitted | Date corrected |
|---|---|---|---|---|---|
| | ch04 "Dynamically Generated Schemas", 2nd paragraph | In the text below: Note from the Author or Editor: | Punleuk Oum | Apr 30, 2018 | Jun 01, 2018 |
| | ch04 "code generation and dynamically typed languages", 3rd paragraph | "[...] code generation is an unnecessarily obstacle to getting to the data." Note from the Author or Editor: | Punleuk Oum | Apr 30, 2018 | Jun 01, 2018 |
| | ch 6 references | Reference [11] Andrew Wang: “Windows Azure Storage,” umbrant.com, February 4, 2016. should link to https://www.umbrant.com/2016/02/04/windows-azure-storage/ Note from the Author or Editor: | David Waller | Oct 01, 2018 | Nov 21, 2018 |
| | Ch 11 references | Reference [18] Jay Kreps, Neha Narkhede, and Jun Rao: “Kafka: A Distributed Messaging System for Log Processing,” is no longer available at that URL. Suggested alternative: https://www.microsoft.com/en-us/research/wp-content/uploads/2017/09/Kafka.pdf Note from the Author or Editor: | David Waller | Nov 30, 2018 | Mar 15, 2019 |
| | Page Chapter 3 p. 71, 75 | At various places in Chapter 3, the book talks about appending to a file being very efficient compared to in-place updates. Note from the Author or Editor: | Nehchal Jindal | Mar 03, 2023 | |
| | Page x Top | Note from the Author or Editor: | Anonymous | Aug 10, 2015 | Mar 01, 2017 |
| | Chapter 1 | In this Chapter 1, we will start by exploring the fundamentals of what we are trying to achieve: reliabile, scalabile and maintainable data systems Note from the Author or Editor: | Sascha Gottfried | Sep 23, 2015 | Mar 01, 2017 |
| | Ch 4 Chapter 4, section CODE GENERATION AND DYNAMICALLY TYPED LANGUAGES | compilation is written with a typo as compliation. Note from the Author or Editor: | Philippe Derome | May 23, 2016 | Mar 01, 2017 |
| | Ch 4 | following choice of words feels awkward (unpack): In the rest of this chapter we will unpack some of the most common ways how data flows between processes: Note from the Author or Editor: | Philippe Derome | May 23, 2016 | Mar 01, 2017 |
| | 5 Chapter 8 | In the sub-section "The truth is defined by the majority" of section "Knowledge, Truth and Lies", a typo in the paragraph below figure 8-5: Note from the Author or Editor: | Anonymous | Nov 10, 2016 | Mar 01, 2017 |
| | Page 7 2nd Paragraph | "forexample, randomly killing individual processes without warning — is known as chaosmonkey " Note from the Author or Editor: | blankenshipz | Jun 30, 2015 | Mar 01, 2017 |
| | Page 15 Box 'Percentiles in Practice', First paragraph, last sentence | "As it takes just one slow call to make the entire end-user request slow, rare slow calls to the backend become much more frequent at the end-user request level" should probably be reworded. | Jeffrey 'jf' Lim | Dec 19, 2015 | Mar 01, 2017 |
| | Page 26 2nd paragraph | "data model" should be pluralized in 'There are many different kinds of data model' (should be 'There are many different kinds of data models') | Jeffrey 'jf' Lim | Dec 19, 2015 | Mar 01, 2017 |
| Printed | Page 50 1st paragraph | "Besides these, there are also imperative graph query languages such as Gremlin..." Note from the Author or Editor: | Jeff Carpenter | Dec 19, 2017 | Mar 16, 2018 |
| | 53 Designing Data-Intensive Applications Chapter 2. The Battle of the Data Models | ... where the the database ... Note from the Author or Editor: | Klaus Ita | Aug 06, 2015 | Mar 01, 2017 |
| | Page 53 last paragraph | The following sentence has a typo: | Slavcho Slavchev | Jan 23, 2016 | Mar 01, 2017 |
| | Page 73 last paragraph of this page | "The merging and compaction of frozen segments can be done in a background thread...continue to serve read and write requests as normal, using the old segment files" Note from the Author or Editor: | yuxh | May 04, 2019 | Aug 09, 2019 |
| | Page 76 2nd-3rd paragraphs | TLDR for the below comments: The second and third paragraphs downplay the differences with Bitcask, which was pretty confusing to me at first. Note from the Author or Editor: | Stephen Dewey | Sep 10, 2018 | Nov 21, 2018 |
| | 82 3rd paragraph from the bottom | Swagger is mentioned as a description language for RESTful APIs; however, this is neither entirely correct nor complete. Swagger is the old name (changed on Nov 5, 2015, though still used informally). The current official name is "OpenAPI" (https://www.openapis.org). Swagger is an API documentation tool, and even though it is designed for RESTful APIs, it is also used as an interactive documentation tool for other types of HTTP APIs. Note from the Author or Editor: | Andrzej Jarzyna | Feb 03, 2017 | Mar 01, 2017 |
| | Page 87 3rd paragraph | "increasinly" should be "increasingly" Note from the Author or Editor: | Greg Nofi | Nov 05, 2015 | Mar 01, 2017 |
| | Page 107 of 802 The first paragraph under "B-Trees" heading | This paragraph (and elsewhere in the book) uses the term "log-structured indexes". I initially found this term a bit confusing, as log-structured storage engines often use an in-memory index, i.e. the index isn't log-structured itself; it is only applied to a log-structured database. IIUC, you perhaps mean "indices used by log-structured storage engines" instead. Note from the Author or Editor: | Apurva Chitnis | Nov 23, 2021 | |
| | Page 107 line 5 | Is it right? Note from the Author or Editor: | DaeMyung Kang | Dec 12, 2014 | Mar 01, 2017 |
| | Page 137 Under heading | There are two common ways how data is distributed across multiple nodes: Note from the Author or Editor: | Anonymous | Aug 10, 2015 | Mar 01, 2017 |
| | Page 138 "Distributed actor frameworks" section | The "Distributed Actor Frameworks" section is missing important background information. It doesn't really describe why we would want to use such a framework, and it doesn't explain how the frameworks can still be useful despite the potential for lost messages. To make this section useful, I think it would be worth adding a paragraph or two to address these points. Note from the Author or Editor: | Stephen Dewey | Sep 17, 2018 | Nov 21, 2018 |
| | Page 190-191 Final paragraph ("However, if you want to allow...") | It would help to address how tombstones help with deletes during concurrent writes (not just how it helps with cleaning up siblings after the fact). In the shopping cart example, if the 4th write was "delete milk, delete eggs, add ham" and a tombstone was added indicating that milk and eggs were deleted at version 4, you would still have milk and eggs coming back in the next write at version 5 (based on version 3). Note from the Author or Editor: | Stephen Dewey | Oct 15, 2018 | Nov 21, 2018 |
| | Page 195 Ref [28] | The referenced blog post by Robert Hodges was published on April 30, 2012, but the text reads March instead. Note from the Author or Editor: | Lucio Assis | Apr 12, 2023 | |
| Printed | Page 202 2nd paragraph | After figure 6-2, the text states that Volume 12 of the pictured encyclopedia (Trudeau - Zywiec) contains "words starting with T, U, V, X, Y, and Z." However, assuming that the encyclopedia uses the English alphabet, it would also contain words starting with W. | Milo Price | Dec 28, 2017 | Mar 16, 2018 |
| Printed | Page 203 5th paragraph | The book states "Cassandra and MongoDB use MD5", but Cassandra uses murmur3 hashing. Note from the Author or Editor: | Ulf Gitschthaler | Jun 26, 2017 | Mar 16, 2018 |
| ePub | Page 222 2nd paragraph | "each partitions maintains..." should be "each partition maintains..." Note from the Author or Editor: | Anonymous | Oct 29, 2015 | Mar 01, 2017 |
| Printed | Page 227 Citation 19 | Re: SSDs losing power in just weeks in unusual temps. Note from the Author or Editor: | Corey Sciuto | Aug 26, 2018 | Nov 21, 2018 |
| | Page 241 full page prior to "Indexes and snapshot isolation" | If you have the time, I'm wondering if you can shed some light on this. I found the discussion in this section to be very confusing. Note from the Author or Editor: | Stephen Dewey | Sep 28, 2018 | Nov 21, 2018 |
| | Page 242 first two paragraphs | Similar to my earlier question, I think a key missing piece of information here is where this alternative approach places uncommitted writes. It is really hard to understand how this approach is meant to work without knowing that. Note from the Author or Editor: | Stephen Dewey | Sep 28, 2018 | Nov 21, 2018 |
| Printed | Page 249 Entire page | Page 349 appears instead of page 249 on page 249. There is no page 249 content to be found in the book. Note from the Author or Editor: | Anonymous | Jul 01, 2019 | Mar 15, 2019 |
| Printed | Page 253 2nd paragraph | Under the bulleted list outlining developments that caused a rethink: | Simon McClive | Apr 15, 2017 | Mar 16, 2018 |
| | Page 257 second bullet point in the middle | You refer to figure 7-1, but figure 7-1 doesn't portray a case of "reading an old version of an object" as you say. Both reads in that figure happen before any writes occur. Perhaps you meant to refer to a different figure. Note from the Author or Editor: | Stephen Dewey | Oct 03, 2018 | Nov 21, 2018 |
| | Page 281 3rd paragraph | "packed-switched" in "Ethernet and IP are packed-switched protocols" should be "packet-switched" Note from the Author or Editor: | Krzysztof Sobusiak | Jan 02, 2017 | Mar 01, 2017 |
| | Page 288 3rd-to-last paragraph | "These jumps, as well as the fact that they often ignore leap seconds, make time-of-day clocks unsuitable for measuring elapsed time" Note from the Author or Editor: | Stephen | Dec 03, 2018 | Mar 15, 2019 |
| | Page 293 (Sixth Early Release) Ch8, The leader and the lock, 2nd paragraph | Note from the Author or Editor: | Ross | Aug 13, 2016 | Mar 01, 2017 |
| | Page 302 (Sixth Early Release) Ch8, Summary, 4th paragraph | The wording feels a bit awkward in - 'The only way how information can flow...' Note from the Author or Editor: | Ross | Aug 13, 2016 | Mar 01, 2017 |
| | Page 305 last paragraph in the middle at the start of the ( | it says "i.e. if you have four nodes" but this is an example so it should be "e.g." Note from the Author or Editor: | Megan Cutrofello | May 26, 2022 | |
| | Page 317 4th paragraph | "...are easier use correctly" should be "...are easier to use correctly" Note from the Author or Editor: | Krzysztof Sobusiak | Jan 05, 2017 | Mar 01, 2017 |
| Printed | Page 322 2nd full paragraph, 2nd sentence | Unnecessary repeat of word 'first' in same sentence, keeping the first one and suppressing the second one would do: | Philippe Derome | Apr 25, 2017 | Mar 16, 2018 |
| | Page 358 1st paragraph of "Coordinator failure" section | in the sentence "if any of the prepare requests fail or time out" these verbs should agree with "any" so it should be "if any...fails or times out" (fail -> fails and time -> times), also the next clause needs fail -> fails as well Note from the Author or Editor: | Megan Cutrofello | May 26, 2022 | |
| | Page 405 Sort merge diagram | In the sort-merge explanation diagram (under ‘Reducer partition 1’), the reducer function is shown to output pairs of (url, dob) but according to the description in paragraph 2, p.406, it should be pairs of (viewed-url, viewer-age-in-years), am I understanding this correctly? Thank you Note from the Author or Editor: | Padraig Cleary | Aug 05, 2024 | |
| | Page 418 Last paragraph | The final paragraph in this section talks about general priority preemption in open-source schedulers (with the caveat "as of this writing", so this erratum is mostly addressing that a few things have changed). Pod preemption has existed in Kubernetes since ~2019, see "Pod Priority and Preemption" and "Node Pressure Eviction" in the Kubernetes docs. Note from the Author or Editor: | Taylor Chaparro | Jul 02, 2023 | |
| | Page 450 first paragraph | you say "We'll discuss a more sophisticated way of freeing disk space later" without a specific section and I think this might be the only time in the entire book that you do something like this so it stood out to me (it's "Log compaction" on p.456) Note from the Author or Editor: | Megan Cutrofello | May 26, 2022 | |
| ePub | Page 452 Chapter 8, Figure 8-1 | Figure 8-1 seems to be a duplicate of Figure 2-4, and does not match the description of what 8-1 is trying to communicate. | Donald Kjer | Jan 24, 2016 | Mar 01, 2017 |
| | Page 474 3rd complete paragraph (4th paragraph including the partial one) | "The difference to batch jobs is..." - this is ok in British English but not really in American English, and "the difference from" is ok in both so I'd change it, "to" -> "from" Note from the Author or Editor: | Megan Cutrofello | May 26, 2022 | |
| | Page 510 Bullet point under 2nd paragraph | This is regarding the section explaining what cas(x, v_old, v_new) => r means. Note from the Author or Editor: | Ankush Sharma | Sep 16, 2023 | |
| | Page 541 2nd complete paragraph (3rd including the partial one) | "the personal data it has collected is one of the assets that get sold" -- "gets" should agree with "one" not with "assets" so get -> gets Note from the Author or Editor: | Megan Cutrofello | May 26, 2022 | |
| Mobi | Page 580 "Because all joins and data dependencies in a workflow..." | Minor insignificant thing, but thought I'd bring it up. Where you say: | Jorge Israel Peña | Feb 17, 2021 | Mar 26, 2021 |
| Mobi | Page 3131 text | TYPO: Note from the Author or Editor: | Anonymous | Dec 16, 2019 | Jan 24, 2020 |
| Mobi | Page 6130 | Hard to tell because I use kindle, thus I don't see pages but locations Note from the Author or Editor: | Wilmer Andres Daza Gomez | Aug 26, 2015 | Mar 01, 2017 |
| Mobi | Page 10672 throughout | Notes from Amazon Note from the Author or Editor: | Anonymous | Jan 22, 2020 | Mar 27, 2020 |