Errata for Designing Data-Intensive Applications

The errata list is a list of errors and their corrections that were found after the product was released. If the error was corrected in a later version or reprint the date of the correction will be displayed in the column titled "Date Corrected".

The following errata were submitted by our customers and approved as valid errors by the author or editor.

Color key: Serious technical mistake, Minor technical mistake, Language or formatting error, Typo, Question, Note, Update

Version | Location | Description | Submitted By | Date submitted | Date corrected
ch04
"Dynamically Generated Schemas", 2nd paragraph

In the text below:
[...] problems with textual formats (JSON, CSV, SQL)

"SQL" is obviously not a textual format. In the context, the author was probably referring to "XML".

The resulting fixed line would be:
[...] problems with textual formats (JSON, CSV, XML)

Note from the Author or Editor:
Erratum is correct, I have corrected the text in Atlas

Punleuk Oum  Apr 30, 2018  Jun 01, 2018
ch04
"code generation and dynamically typed languages", 3rd paragraph

"[...] code generation is an unnecessarily obstacle to getting to the data."
->
"[...] code generation is an unnecessary obstacle to getting to the data."

Note from the Author or Editor:
I have made this change in Atlas

Punleuk Oum  Apr 30, 2018  Jun 01, 2018
ch 6
references

Reference [11] Andrew Wang: “Windows Azure Storage,” umbrant.com, February 4, 2016. should link to https://www.umbrant.com/2016/02/04/windows-azure-storage/

Note from the Author or Editor:
URL of the blog post has changed. We're updating it to https://www.umbrant.com/2016/02/04/windows-azure-storage/

David Waller  Oct 01, 2018  Nov 21, 2018
Ch 11
references

Reference [18] Jay Kreps, Neha Narkhede, and Jun Rao: “Kafka: A Distributed Messaging System for Log Processing,” is no longer available at that URL. Suggested alternative: https://www.microsoft.com/en-us/research/wp-content/uploads/2017/09/Kafka.pdf

Note from the Author or Editor:
I have updated the URL in Atlas and on https://github.com/ept/ddia-references

David Waller  Nov 30, 2018  Mar 15, 2019
Page Chapter 3
p. 71, 75

At various places in Chapter 3, the book describes appending to a file as very efficient compared to updating in place.
p. 71: "... appending to a file is generally very efficient."
p. 71 "It's hard to beat the performance of simply appending to a file, ..."
p. 75 "Appending ... are sequential write operations."

However, the chapter does not explain why appending is fast. I had assumed that both appending and updating in place would need a single disk seek, so both should take the same time. On further research, I found that appending to a file is efficient because the OS can buffer multiple appends and later write a lot of data at once. It would be helpful to mention this point. It would also be good to mention that such buffering affects the durability of data that is in the buffer but not yet written to disk.

Note from the Author or Editor:
Correct: when there are many append operations, their data is consecutive in the file, and so they can be written out sequentially in fewer I/O operations than if the writes are scattered around many different locations on disk. We will clarify this in the second edition.

Nehchal Jindal  Mar 03, 2023 
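To illustrate the erratum above, here is a minimal Python sketch of append-only writes, assuming a newline-delimited record format. The names `append_records` and `read_records` and the file name `segment.log` are hypothetical and do not come from the book. It shows where buffering happens and where durability is actually requested.

```python
import os
import tempfile

def append_records(path, records, sync=True):
    """Append records to the end of a log file.

    Consecutive appends land next to each other in the file, so the OS
    can write them out in a few sequential I/O operations.
    """
    with open(path, "ab") as f:
        for rec in records:
            f.write(rec + b"\n")   # goes into a buffer first, not straight to disk
        f.flush()                   # hand Python's buffer to the OS page cache
        if sync:
            os.fsync(f.fileno())    # ask the OS to persist; before this, a crash may lose data

def read_records(path):
    """Read back all records in the order they were appended."""
    with open(path, "rb") as f:
        return f.read().splitlines()

log_path = os.path.join(tempfile.mkdtemp(), "segment.log")
append_records(log_path, [b"k1,v1", b"k2,v2"])
append_records(log_path, [b"k1,v3"])   # an update is just another append
```

Passing `sync=False` trades durability for throughput: data that is still in the OS buffer can be lost on a crash, which is the durability caveat raised by the submitter.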
PDF
Page x
Top

New types of database [system] (“NoSQL”)
have been getting lots of attention, but message queues, caches, search indexes, frameworks

s/b systems

Note from the Author or Editor:
Fixed in next Early Release update.

Anonymous  Aug 10, 2015  Mar 01, 2017
Chapter 1

In this Chapter 1, we will start by exploring the fundamentals of what we are trying to achieve: reliabile, scalabile and maintainable data systems

reliabile -> reliable
scalabile -> scalable

Note from the Author or Editor:
Fixed in next Early Release update.

Sascha Gottfried  Sep 23, 2015  Mar 01, 2017
Ch 4
Chapter 4, section CODE GENERATION AND DYNAMICALLY TYPED LANGUAGES

compilation is written with a typo as compliation.

Code generation is often frowned upon in these languages, since they otherwise avoid an explicit compliation step.

Note from the Author or Editor:
Fixed in next early release update.

Philippe Derome  May 23, 2016  Mar 01, 2017
Ch 4

following choice of words feels awkward (unpack): In the rest of this chapter we will unpack some of the most common ways how data flows between processes:

It would seem that reveal, show, or describe would be a more common fit than unpack.

Note from the Author or Editor:
Fixed in next early release update.

Philippe Derome  May 23, 2016  Mar 01, 2017
5
Chapter 8

In the sub-section "The truth is defined by the majority" of section "Knowledge, Truth and Lies", a typo in the paragraph below figure 8-5:

However, the storage server rembers that it has already processed a write with a higher token number (34), and so it rejects the request with token 33.

"rembers " -> "remembers"?

Note from the Author or Editor:
Fixed in next early release update.

Anonymous  Nov 10, 2016  Mar 01, 2017
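The sentence quoted in the erratum above describes fencing tokens. As a rough sketch only, assuming a single storage server that keeps the highest token it has seen (the class `FencedStorage` and its method names are hypothetical), the check could look like this in Python:

```python
class FencedStorage:
    """Toy storage server that rejects writes carrying a stale fencing token."""

    def __init__(self):
        self.highest_token = -1   # highest token processed so far
        self.data = {}

    def write(self, token, key, value):
        # A token lower than one already processed means the client's
        # lock or lease has been superseded, so the write is rejected.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.data[key] = value
        return True

storage = FencedStorage()
accepted_34 = storage.write(34, "file", "from client 2")
accepted_33 = storage.write(33, "file", "from client 1")  # stale token
```

Here the write with token 34 is accepted and the later write with token 33 is rejected, matching the scenario in the quoted sentence.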
PDF
Page 7
2nd Paragraph

"forexample, randomly killing individual processes without warning — is known as chaosmonkey "

I don't think it is correct to use 'chaos monkey' as an umbrella term in this context. Chaos Monkey is a software application developed to perform that task, not a term with that meaning.

Note from the Author or Editor:
Fixed in next Early Release update.

blankenshipz  Jun 30, 2015  Mar 01, 2017
PDF
Page 15
Box 'Percentiles in Practice', First paragraph, last sentence

"As it takes just one slow call to make the entire end-user request slow, rare slow calls to the backend become much more frequent at the end-user request level" should probably be reworded.

1. Saying "the backend" can be misleading, and is technically inaccurate. It's *multiple* backends. It's only with multiple backends that this increase in frequency of slow end-user requests makes sense.

2. It would be more technically accurate and better to say that it is the *collective frequency* of any slow call to any backend that increases. The slow calls themselves do not exactly increase in frequency.

Jeffrey 'jf' Lim  Dec 19, 2015  Mar 01, 2017
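The point in the erratum above can be made concrete with a small calculation. This is a sketch under the assumption that backend calls are independent and each is slow with the same probability; the function name and the example numbers (1% slow calls, 100 backends) are illustrative, not from the book.

```python
def prob_slow_request(p_slow_call, n_backend_calls):
    """Probability that at least one of n independent backend calls is slow.

    The end-user request is slow if any single call is slow, so we take
    the complement of "every call is fast".
    """
    return 1.0 - (1.0 - p_slow_call) ** n_backend_calls

single = prob_slow_request(0.01, 1)     # one backend call
fan_out = prob_slow_request(0.01, 100)  # one request fanning out to 100 backends
```

Each individual call is no more likely to be slow than before; what grows is the chance that the request as a whole hits at least one slow call.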
PDF
Page 26
2nd paragraph

"data model" should be pluralized in 'There are many different kinds of data model' (should be 'There are many different kinds of data models')

Jeffrey 'jf' Lim  Dec 19, 2015  Mar 01, 2017
Printed
Page 50
1st paragraph

"Besides these, there are also imperative graph query languages such as Gremlin..."

I believe that Gremlin supports both imperative and declarative traversals. The Wikipedia page is actually a useful reference here: https://en.wikipedia.org/wiki/Gremlin_(programming_language)

Note from the Author or Editor:
Correct, the declarative features seem to have been added to Gremlin since I last looked at it. I will reword this sentence to avoid the confusion.

Jeff Carpenter  Dec 19, 2017  Mar 16, 2018
53
Designing Data-Intensive Applications Chapter 2. The Battle of the Data Models

... where the the database ...

better written with just one 'the'

Note from the Author or Editor:
Fixed in next Early Release update.

Klaus Ita  Aug 06, 2015  Mar 01, 2017
PDF
Page 53
last paragraph

The following sentence has a typo:

"In a graph database, there are is no such restriction: any vertex can have an edge to any other vertex."

It should be:

"In a graph database, there is no such restriction: any vertex can have an edge to any other vertex."

Slavcho Slavchev  Jan 23, 2016  Mar 01, 2017
PDF
Page 73
last paragraph of this page

"The merging and compaction of frozen segments can be done in a background thread...continue to serve read and write requests as normal, using the old segment files"
What "frozen" means is a little vague. How can writes go to the old segment files once they have been frozen or closed?

Note from the Author or Editor:
Rephrasing this sentence to be clearer in the next update.

yuxh  May 04, 2019  Aug 09, 2019
PDF
Page 76
2nd-3rd paragraphs

TLDR for the below comments: The second and third paragraphs downplay the differences with Bitcask, which was pretty confusing to me at first.

Unless I'm mistaken, this sentence ("We also require that each key only appears once within each merged segment file (the compaction process already ensures that).") would be more accurately/helpfully written ("We also require that each key only appears once within each segment file. Incoming keys are consolidated by a tree structure that we will discuss shortly.")

This resolves my first confusion, because I thought you were implying that segment files could have multiple entries per key until they were merged.

Also, the sentence "At first glance, that requirement seems to break our ability to use sequential writes, but we’ll get to that in a moment" might be more helpfully written "This means that we cannot append new keys directly to the segments (as in Bitcask) because we can only have one entry per key/segment. However, creation of new segment files is still performed using sequential writes, as we will show in a moment."

This resolves my second confusion, because I thought you were implying that you would show how all new data could still be written directly to the segments.

Note from the Author or Editor:
Thanks for the suggested wording improvements. In the next update of the book I will tweak the wording to avoid this confusion.

Stephen Dewey  Sep 10, 2018  Nov 21, 2018
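For the erratum above, the following Python sketch shows one way incoming writes can be consolidated in memory so that each flushed segment holds every key exactly once, in sorted order. It is an assumption-laden simplification: a plain `dict` sorted at flush time stands in for the tree structure, and `flush_memtable` is a hypothetical name.

```python
def flush_memtable(memtable):
    """Turn the in-memory table into a segment: (key, value) pairs sorted by key.

    Because the memtable maps each key to its latest value, the resulting
    segment contains every key at most once. The list is produced in key
    order, so it could be written to disk with one sequential write.
    """
    return sorted(memtable.items())

memtable = {}
for key, value in [("b", 1), ("a", 2), ("b", 3)]:
    memtable[key] = value   # a later write to the same key replaces the earlier one

segment = flush_memtable(memtable)
```

New keys are never appended to an existing segment here; they accumulate in memory and become a new segment when flushed.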
82
3rd paragraph from the bottom

Swagger is mentioned as a description language for RESTful APIs; however, this information is neither exactly correct nor complete. Swagger is an old name (since Nov 5, 2015, although still used informally). The current official name is "OpenAPI" (https://www.openapis.org). Swagger is an API documentation tool, and even though it is designed for RESTful APIs, it is also used as an interactive documentation tool for other types of HTTP APIs.
Moreover, Swagger/OpenAPI is not the only tool for API documentation and design. Other popular tools include:
* RAML (http://raml.org/)
* API Blueprint (https://apiblueprint.org)

Note from the Author or Editor:
Making an appropriate change in QC1 review, to be included in QC2.

Andrzej Jarzyna  Feb 03, 2017  Mar 01, 2017
PDF
Page 87
3rd paragraph

"increasinly" should be "increasingly"

Note from the Author or Editor:
Fixed in next Early Release update.

Greg Nofi  Nov 05, 2015  Mar 01, 2017
Page 107 of 802
The first paragraph under "B-Trees" heading

This paragraph (and elsewhere in the book) uses the term "log-structured indexes". I initially found this term a bit confusing, as log-structured storage engines often use an in-memory index, i.e. the index isn't log-structured itself; it is only applied to a log-structured database. IIUC, you perhaps mean "indices used by log-structured storage engines" instead.

Note from the Author or Editor:
Thanks for highlighting this. I actually meant not the index used internally by a log-structured storage engine, but rather a database index that is implemented using an LSM-tree (as opposed to B-tree) approach. I will reword this in a future revision to avoid this point of confusion.

Apurva Chitnis  Nov 23, 2021 
PDF
Page 107
line 5

Is it right?

I think "From time to time to time" is a typo for "From time to time".

Note from the Author or Editor:
Remove spurious "to time".

DaeMyung Kang  Dec 12, 2014  Mar 01, 2017
PDF
Page 137
Under heading

There are two common ways how data is distributed across multiple nodes:

/del "how"

Note from the Author or Editor:
Fixed in next Early Release update.

Anonymous  Aug 10, 2015  Mar 01, 2017
PDF
Page 138
"Distributed actor frameworks" section

The "Distributed Actor Frameworks" section is missing important background information. It doesn't really describe why we would want to use such a framework, and it doesn't explain how the frameworks can still be useful despite the potential for lost messages. To make this section useful, I think it would be worth adding a paragraph or two to address these points.

Note from the Author or Editor:
[No change in this edition]

We have noted this suggestion and will take it into account when preparing a second edition of the book.

Stephen Dewey  Sep 17, 2018  Nov 21, 2018
PDF
Page 190-191
Final paragraph ("However, if you want to allow...")

It would help to address how tombstones help with deletes during concurrent writes (not just how it helps with cleaning up siblings after the fact). In the shopping cart example, if the 4th write was "delete milk, delete eggs, add ham" and a tombstone was added indicating that milk and eggs were deleted at version 4, you would still have milk and eggs coming back in the next write at version 5 (based on version 3).

The question then is whether the database assumes that milk and eggs were only included in version 5 because they were part of version 3 (in which case it could delete them now) or whether the database assumes the user was reaffirming that they wanted milk and eggs (in which case the new write should overwrite the tombstone). It doesn't seem like there's an easy answer because there isn't enough information to really know what the intent was.

Note from the Author or Editor:
[No change in this edition]

We have noted this suggestion and will take it into account when preparing a second edition of the book.

Stephen Dewey  Oct 15, 2018  Nov 21, 2018
Page 195
Ref [28]

The referenced blog post by Robert Hodges was published on April 30, 2012, but the text reads March instead.

Note from the Author or Editor:
Correct. Eagle-eyed observation!

Lucio Assis  Apr 12, 2023 
Printed
Page 202
2nd paragraph

After figure 6-2, the text states that Volume 12 of the pictured encyclopedia (Trudeau - Zywiec) contains "words starting with T, U, V, X, Y, and Z." However, assuming that the encyclopedia uses the English alphabet, it would also contain words starting with W.

Milo Price  Dec 28, 2017  Mar 16, 2018
Printed
Page 203
5th paragraph

Book states "Cassandra and MongoDB use MD5", Cassandra uses murmur3 hashing though.

Note from the Author or Editor:
Cassandra prior to version 1.2 used MD5, and version 1.2 switched to using Murmur3 by default. I will clarify this in the text.

Ulf Gitschthaler  Jun 26, 2017  Mar 16, 2018
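As an illustration of hash partitioning in general (not of Cassandra's or MongoDB's actual partitioner), the Python sketch below hashes a key with MD5 and assigns it to one of several equal-sized hash ranges. The function name `partition_for` and the choice to use the first 8 bytes of the digest are assumptions made for this example.

```python
import hashlib

def partition_for(key, num_partitions):
    """Map a key to a partition by splitting the 64-bit hash space into equal ranges."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")   # 64-bit token in [0, 2**64)
    return token * num_partitions // 2**64      # index of the range containing the token

p1 = partition_for("user:42", 8)
p2 = partition_for("user:42", 8)   # same key always hashes to the same partition
```

The hash function only needs to spread keys evenly; it does not need to be cryptographically strong, which is why a faster function such as Murmur3 can replace MD5.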
ePub
Page 222
2nd paragraph

"each partitions maintains..." should be "each partition maintains..."

Note from the Author or Editor:
Fixed in next Early Release update.

Anonymous  Oct 29, 2015  Mar 01, 2017
Printed
Page 227
Citation 19

Re: SSDs losing power in just weeks in unusual temps.

The citation does say this, but itself refers to a presentation slide that JEDEC has called misunderstood: https://www.jedec.org/news/pressreleases/jedec-update-solid-state-drive-standard

While the point is certainly valid that SSDs can lose data in storage, the very short time frames given are talking about EOL'ed enterprise drives. Perhaps a footnote would help for expanding on this alarming statistic.

Excellent book by the way, really enjoying it!

Note from the Author or Editor:
Thank you for pointing this out; I will update the wording to clarify this point in the next update of the book.

Corey Sciuto  Aug 26, 2018  Nov 21, 2018
PDF
Page 241
full page prior to "Indexes and snapshot isolation"

If you have the time, I'm wondering if you can shed some light on this. I found the discussion in this section to be very confusing.

I think the source of my confusion is that you haven't explained how, if at all, uncommitted rows are kept separate from the committed rows in the object version lists that you have shown. Reading between the lines, I think the answer is that they are NOT kept separate at all, so an uncommitted write goes immediately into the same list as the writes of committed transactions. Is that true?

That would explain why in rule #1 on page 241 you say writes by transactions which were running at the beginning of a snapshot transaction are ignored "even if" any of those writes commits. You say "even if" because a transaction could also see the uncommitted writes of earlier transactions that are still running. It needs the list of "transactions that were running when I started" to know that either of the following is true: 1) this transaction is still running and was running when I started, 2) this transaction has committed or aborted (but isn't cleaned up yet) and was running when I started. In both cases it ignores the row.

#2 is also confusing because it seems like a superfluous rule after #1 and #3. Does a transaction need some mechanism to determine "rows from aborted transactions" in addition to rules 1+3? The only way I can think to resolve this is that the assumption in my second paragraph is correct (uncommitted rows are not kept separate) and that additionally, it takes some time to clear aborted uncommitted rows from the object version list (and to unmark objects as deleted which were deleted by aborted transactions). Therefore the transaction needs some second list of "aborted transactions which were not cleared when I started" so it can know to ignore them. Is this true?

The two paragraphs on page 239 from "To implement snapshot isolation" ending in "for an entire transaction" are also pretty confusing. In the first paragraph you say that it's a generalization of the earlier mechanism (which wasn't fully explained, you just said that "any writes by a transaction only become visible to others when that transaction commits"). But then in the second paragraph you imply that MVCC is effectively a distinct mechanism. But the bigger confusion is with that final sentence ("A typical approach"). Why would it make sense to ever base MVCC on a single query? Even read committed is done at the transaction level, not the query level.

Thanks in advance for any clarification you can provide.

Note from the Author or Editor:
[No change in this edition]

We have noted this suggestion and will take it into account when preparing a second edition of the book.

Stephen Dewey  Sep 28, 2018  Nov 21, 2018
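The visibility rules discussed in the erratum above can be sketched in Python as follows. This is only one possible reading, under these assumptions: uncommitted and committed versions live in the same version list, each version records the transaction IDs that created and deleted it, and transaction IDs increase over time. All class and field names are hypothetical, and treating a transaction's own writes as visible is an added assumption.

```python
from dataclasses import dataclass
from typing import Optional, Set

@dataclass
class RowVersion:
    created_by: int                    # txid that wrote this version
    deleted_by: Optional[int] = None   # txid that marked it deleted, if any

@dataclass
class Snapshot:
    txid: int                          # the reading transaction
    in_progress_at_start: Set[int]     # txids still running when the reader started
    aborted: Set[int]                  # txids known to have aborted

def write_counts(writer, snap):
    """Does a write by `writer` take effect from this snapshot's point of view?"""
    if writer == snap.txid:
        return True                    # assumption: a transaction sees its own writes
    if writer in snap.in_progress_at_start:
        return False                   # rule 1: ignored even if it commits later
    if writer in snap.aborted:
        return False                   # rule 2: aborted writes are ignored
    if writer > snap.txid:
        return False                   # rule 3: started after the reader
    return True                        # rule 4: everything else is visible

def is_visible(version, snap):
    """A version is visible if its creation counts and its deletion (if any) does not."""
    if not write_counts(version.created_by, snap):
        return False
    if version.deleted_by is not None and write_counts(version.deleted_by, snap):
        return False
    return True

snap = Snapshot(txid=12, in_progress_at_start={11}, aborted={9})
```

Under this reading, rule 2 is needed separately because an aborted transaction may have finished before the reader started, in which case it is neither in progress nor later than the reader.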
PDF
Page 242
first two paragraphs

Similar to my earlier question, I think a key missing piece of information here is where this alternative approach places uncommitted writes. It is really hard to understand how this approach is meant to work without knowing that.

Also, I think you probably meant to put these two paragraphs in their own section, because they don't have anything to do with the last header ("Indexes and snapshot isolation").

Note from the Author or Editor:
[No change in this edition]

We have noted this suggestion and will take it into account when preparing a second edition of the book.

The section structure is correct. The last two paragraphs describe the copy-on-write approach to maintaining B-tree indexes, which can help with implementing snapshot isolation by using an old B-tree root as the snapshot from which a transaction reads. We will try to make this clearer in the second edition.

Stephen Dewey  Sep 28, 2018  Nov 21, 2018
Printed
Page 249
Entire page

Page 349 appears instead of page 249 on page 249. There is no page 249 content to be found in the book.

Page 349 displays correctly, though is duplicated in two places as a result.

Note from the Author or Editor:
This was a printer error in the 4th printing, but it has been corrected since then (6th printing was March 2019).

Anonymous  Jul 01, 2019  Mar 15, 2019
Printed
Page 253
2nd paragraph

Under the bulleted list outlining developments that caused a rethink:

RAM became cheap enough that for many use cases is now feasible to keep....

"is now" should read "it is" or "it's".

Simon McClive  Apr 15, 2017  Mar 16, 2018
PDF
Page 257
second bullet point in the middle

You refer to figure 7-1, but figure 7-1 doesn't portray a case of "reading an old version of an object" as you say. Both reads in that figure happen before any writes occur. Perhaps you meant to refer to a different figure.

Also on page 258, remove the "a" from before the word "having".

Note from the Author or Editor:
Changing the reference to Figure 7-4 instead of Figure 7-1.

Stephen Dewey  Oct 03, 2018  Nov 21, 2018
PDF
Page 281
3rd paragraph

"packed-switched" in "Ethernet and IP are packed-switched protocols" should be "packet-switched"

Note from the Author or Editor:
Will be fixed in QC1

Krzysztof Sobusiak  Jan 02, 2017  Mar 01, 2017
PDF
Page 288
3rd-to-last paragraph

"These jumps, as well as the fact that they often ignore leap seconds, make time-of-day clocks unsuitable for measuring elapsed time"

Based on the reference you linked, it seems the CloudFlare problem was actually that the clock used by its code DID take leap seconds into account, but the application code ignored the fact that this could happen.

So perhaps a better phrasing would be:

"These jumps, as well as similar jumps caused by leap seconds, make time-of-day clocks unsuitable for measuring elapsed time"

In other words the problem isn't that time-of-day clocks ignore leap seconds, it's the reverse, that they are affected by them. But then the application code ignores the fact that this can happen.

Note from the Author or Editor:
I agree with the suggested wording change, and have updated the text in Atlas.

Stephen  Dec 03, 2018  Mar 15, 2019
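The practical consequence of the erratum above is that elapsed time should be measured with a monotonic clock rather than a time-of-day clock. A minimal Python sketch using the standard library (the helper name `measure_elapsed` is hypothetical):

```python
import time

def measure_elapsed(fn):
    """Run fn and return how long it took, in seconds.

    time.monotonic() cannot go backwards, so the difference is never
    negative. Subtracting two time.time() readings gives no such
    guarantee, because the time-of-day clock may be adjusted (for
    example by NTP) between the two readings.
    """
    start = time.monotonic()
    fn()
    return time.monotonic() - start

elapsed = measure_elapsed(lambda: sum(range(1000)))
```

The absolute value of a monotonic clock is meaningless; only the difference between two readings on the same machine is useful.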
PDF
Page 293
(Sixth Early Release) Ch8, The leader and the lock, 2nd paragraph

Minor problems with plurals:

...even if a nodes believes that it is... [change 'nodes' to 'node']

... mean the majority of nodes agrees! ... [change 'agrees' to 'agree']

Note from the Author or Editor:
Fixed in next early release update.

Ross  Aug 13, 2016  Mar 01, 2017
PDF
Page 302
(Sixth Early Release) Ch8, Summary, 4th paragraph

The wording feels a bit awkward in - 'The only way how information can flow...'

Perhaps drop 'how'?
'The only way information can flow ...'

Note from the Author or Editor:
Fixed in next early release update.

Ross  Aug 13, 2016  Mar 01, 2017
Page 305
last paragraph in the middle at the start of the (

it says "i.e. if you have four nodes" but this is an example so it should be "e.g."

Note from the Author or Editor:
Fixed.

Megan Cutrofello  May 26, 2022 
PDF
Page 317
4th paragraph

"...are easier use correctly" should be "...are easier to use correctly"

Note from the Author or Editor:
Fixed in copyedit

Krzysztof Sobusiak  Jan 05, 2017  Mar 01, 2017
Printed
Page 322
2nd full paragraph, 2nd sentence

Unnecessary repeat of word 'first' in same sentence, keeping the first one and suppressing the second one would do:

But first we first need to explore the range of guarantees...

Philippe Derome  Apr 25, 2017  Mar 16, 2018
Page 358
1st paragraph of "Coordinator failure" section

in the sentence "if any of the prepare requests fail or time out" these verbs should agree with "any" so it should be "if any...fails or times out" (fail -> fails and time -> times), also the next clause needs fail -> fails as well

p.s. PLEASE make a form where I can submit multiple errata at once

Note from the Author or Editor:
Fixed.

Megan Cutrofello  May 26, 2022 
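The rule quoted in the erratum above (abort if any prepare request fails or times out) can be sketched as a coordinator loop. This Python example is a simplification that assumes in-process participants, models a timeout as a raised `TimeoutError`, and leaves out the coordinator's durable log and retry logic. The `Participant` class and its `vote` values are hypothetical.

```python
class Participant:
    """Toy participant whose vote is fixed in advance: "yes", "no" or "timeout"."""

    def __init__(self, vote):
        self.vote = vote
        self.outcome = None

    def prepare(self):
        if self.vote == "timeout":
            raise TimeoutError("no reply to prepare")
        return self.vote == "yes"

    def finish(self, decision):
        self.outcome = decision

def run_two_phase_commit(participants):
    # Phase 1: ask every participant to prepare.
    decision = "commit"
    for p in participants:
        try:
            if not p.prepare():
                decision = "abort"     # a "no" vote forces an abort
        except TimeoutError:
            decision = "abort"         # a timeout is treated like a "no" vote
    # Phase 2: tell every participant the single, final decision.
    for p in participants:
        p.finish(decision)
    return decision

all_yes = [Participant("yes"), Participant("yes")]
one_timeout = [Participant("yes"), Participant("timeout")]
result_ok = run_two_phase_commit(all_yes)
result_bad = run_two_phase_commit(one_timeout)
```

Only when every participant votes yes does the coordinator decide to commit; any single failure or timeout in the prepare phase leads to an abort for all participants.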
Page 405
Sort merge diagram

In the sort-merge explanation diagram (under ‘Reducer partition 1’), the reducer function is shown to output pairs of (url, dob), but according to the description in paragraph 2, p. 406, it should be pairs of (viewed-url, viewer-age-in-years). Am I understanding this correctly? Thank you.

Note from the Author or Editor:
Thanks for pointing out this inconsistency. We'll fix it in the second edition.

Padraig Cleary  Aug 05, 2024 
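To show the output shape discussed in the erratum above, here is a small in-memory Python sketch of a reduce-side sort-merge join. It assumes user records of the form (user_id, birth_year) and activity events of the form (user_id, url), and it approximates age as a difference of years. These record layouts and the function name are illustrative assumptions.

```python
from itertools import groupby

def sort_merge_join(users, events, current_year):
    """Join events with user records and emit (viewed_url, viewer_age_in_years)."""
    # "Mapper" output: key everything by user_id. Tag 0 marks a user record
    # and tag 1 an event, so the user record sorts first within each key.
    mapped = [(uid, 0, birth_year) for uid, birth_year in users]
    mapped += [(uid, 1, url) for uid, url in events]
    mapped.sort(key=lambda rec: (rec[0], rec[1]))   # stable: event order is kept

    output = []
    # "Reducer": called once per user_id, sees the user record before the events.
    for _, group in groupby(mapped, key=lambda rec: rec[0]):
        birth_year = None
        for _, tag, payload in group:
            if tag == 0:
                birth_year = payload
            elif birth_year is not None:
                output.append((payload, current_year - birth_year))
    return output

users = [(104, 1989), (173, 1995)]
events = [(104, "/x"), (173, "/y"), (104, "/z")]
joined = sort_merge_join(users, events, current_year=2017)
```

The reducer only has to hold one user record in memory at a time, because the sort guarantees that it arrives before that user's events.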
Page 418
Last paragraph

The final paragraph in this section talks about general priority preemption in open-source schedulers (with the caveat "as of this writing", so this erratum is mostly noting that a few things have changed). Pod preemption has existed in Kubernetes since ~2019; see "Pod Priority and Preemption" and "Node Pressure Eviction" in the Kubernetes docs.

Other open-source schedulers like Nomad have also included task/job-based preemption as a top-level concern since ~2019.

Note from the Author or Editor:
Correct, this has changed since the first edition came out in 2017. It will be corrected in the second edition.

Taylor Chaparro  Jul 02, 2023 
Page 450
first paragraph

You say "We'll discuss a more sophisticated way of freeing disk space later" without naming a specific section. I think this might be the only time in the entire book that you do this, so it stood out to me (the section is "Log compaction" on p. 456).

Note from the Author or Editor:
Adding a cross-reference as suggested.

Megan Cutrofello  May 26, 2022 
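For readers unfamiliar with the "Log compaction" section referenced above, the basic idea can be sketched in a few lines of Python. This assumes a log of (key, value) pairs where a later entry for a key supersedes earlier ones. Deletion markers (tombstones) are left out, and the function name `compact` is hypothetical.

```python
def compact(log):
    """Keep only the most recent entry for each key, preserving log order."""
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)          # later offsets overwrite earlier ones
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_, value) in survivors]

log = [("a", 1), ("b", 2), ("a", 3)]
compacted = compact(log)
```

Disk space is freed because superseded entries are dropped, while the latest value of every key remains available.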
ePub
Page 452
Chapter 8, Figure 8-1

Figure 8-1 seems to be a duplicate of Figure 2-4, and does not match the description of what 8-1 is trying to communicate.

Donald Kjer  Jan 24, 2016  Mar 01, 2017
Page 474
3rd complete paragraph (4th paragraph including the partial one)

"The difference to batch jobs is..." - this is ok in British English but not really in American English, and "the difference from" is ok in both so I'd change it, "to" -> "from"

Note from the Author or Editor:
Fixed.

Megan Cutrofello  May 26, 2022 
Page 510
Bullet point under 2nd paragraph

This is regarding the section explaining what cas(x, v_old, v_new) => r means.

In the penultimate sentence, it says "If x ≠ v_old then the operation should leave the register unchanged and return an error."

Strictly speaking, x here is the register itself, so comparing x with v_old is not meaningful. What is meant is that if the value held by register x is not equal to v_old, then the register should be left unchanged.

Note from the Author or Editor:
Correct, I was being imprecise with notation here by conflating the register object with its current value. Changing to "If the value of x is different from v_old, then the operation should leave the register unchanged..."

Ankush Sharma  Sep 16, 2023 
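The distinction made in the erratum above, between the register x and the value it currently holds, is easy to see in code. Below is a minimal single-process Python sketch. The class name `Register`, the string results "ok" and "error", and the use of a lock to make the operation atomic are assumptions for illustration.

```python
import threading

class Register:
    """A register object; its current value is a separate thing from the object itself."""

    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()   # makes compare-and-set atomic within this process

    def read(self):
        with self._lock:
            return self._value

    def cas(self, v_old, v_new):
        """Set the value to v_new only if the current value equals v_old."""
        with self._lock:
            if self._value != v_old:    # compare the register's VALUE, not the register
                return "error"          # register is left unchanged
            self._value = v_new
            return "ok"

x = Register(1)
first = x.cas(2, 3)    # current value is 1, not 2
second = x.cas(1, 3)   # current value matches v_old
```

The first call returns an error and leaves the value at 1; the second succeeds and sets it to 3.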
Page 541
2nd complete paragraph (3rd including the partial one)

"the personal data it has collected is one of the assets that get sold" -- "gets" should agree with "one" not with "assets" so get -> gets

Note from the Author or Editor:
Fixed.

Megan Cutrofello  May 26, 2022 
Mobi
Page 580
"Because all joins and data dependencies in a workflow..."

Minor insignificant thing, but thought I'd bring it up. Where you say:

"Because all joins and data dependencies in a workflow..."

It begins with an extra space, at least on the Kindle store edition.

Jorge Israel Peña  Feb 17, 2021  Mar 26, 2021
Mobi
Page 3131
text

TYPO:
"commiting the write" should be "committing the write"

Redundancy:
"The blocking of readers and writers is implemented by a having a lock on each object in the database." Should be: "implemented by having a lock on "

Note from the Author or Editor:
Fixed in Atlas.

Anonymous  Dec 16, 2019  Jan 24, 2020
Mobi
Page 6130

Hard to tell because I use Kindle, so I see locations rather than pages.

At location 6130, in the first paragraph, there is a repeated "to":

"When a transaction wants to to commit."

Note from the Author or Editor:
Fixed in next Early Release update.

Wilmer Andres Daza Gomez  Aug 26, 2015  Mar 01, 2017
Mobi
Page 10672
throughout

Notes from Amazon
Your book has external links that do not work. For example at the following locations "1789,2763,2780,2816,5030" and throughout the book. For example "Apache CouchDB 1.6." Documentation. Please update a valid external URL. To ensure future access to reference material, Amazon strongly recommends submitting these types of links to an archive service, and including the archived link in the book. If the link is broken due to forces outside your control, it should be deactivated and “[URL inactive]” should be added following the link text.

Note from the Author or Editor:
I have gone through all URLs in the book and fixed all broken links as of March 2020.

Anonymous  Jan 22, 2020  Mar 27, 2020