The errata list records errors, and their corrections, that were found after the product was released. If an error was corrected in a later version or reprint, the date of the correction is shown in the "Date corrected" column.
The following errata were submitted by our customers and approved as valid errors by the author or editor.
Version |
Location |
Description |
Submitted By |
Date submitted |
Date corrected |
|
Page Chapter 2
py-spy command |
It seems that there should be a top command between py-spy and --pid. I tested with the most recent version.
Note from the Author or Editor: A single word change is needed. At the top of page 55 in console font I see
"$ sudo env "PATH=$PATH" py-spy --pid 15953"
and it should have "top" added to read:
"$ sudo env "PATH=$PATH" py-spy top --pid 15953"
|
chen yuanyuan |
Jul 12, 2020 |
Jan 27, 2023 |
|
Page Chapter 11
Regarding the Wikipedia texts |
I find Error:
FileNotFoundError: [Errno 2] No such file or directory: 'all_unique_words_wikipedia_via_gensim.txt' When trying to load text_example_clean_list_wikipedia_gensim.py and text_example.py
I don't find 'all_unique_words_wikipedia_via_gensim.txt'.
I don't know where to download it, and it's not in the supplemental files. Can you clarify whether we need to create it ourselves, or whether we are simply missing the file?
Note from the Author or Editor: This code currently isn't in the supplemental github repository. I will try to address this and add it in the near future!
|
Tye Lokka |
Feb 09, 2022 |
Jan 27, 2023 |
|
Page 18, Chapter 1
Last sentence in the 10th and last paragraph on the page. |
Sentence reads: Both are sensible solutions and are significantly better using the operating system's global Python environment!
Does this sentence mean "... better [when] using ..." or does it mean "... better [than] using ..."?
Note from the Author or Editor: Please add THAN in the indicated place:
"Both are sensible solutions and are significantly better THAN using the operating system's global Python environment!"
|
Jackson Smith |
Jun 12, 2023 |
|
|
I
chapter 1 |
"less than two cores" -> "fewer than two cores"
|
Zachary Kneupper |
Jan 06, 2020 |
Apr 30, 2020 |
|
I
I |
In chapter 1, the text describing the check_prime(number) function refers to a variable called "number_float", but the "number_float" variable does not appear in the function definition.
|
Zachary Kneupper |
Jan 08, 2020 |
Apr 30, 2020 |
|
I
Chapter 1, "So Why Use Python?" |
"Continuum’s Anaconda, a scientifically focused environment" should be edited, since Continuum Analytics has changed their name to Anaconda, Inc.
See:
https://www.anaconda.com/continuum-analytics-officially-becomes-anaconda/
|
Zachary Kneupper |
Jan 09, 2020 |
Apr 30, 2020 |
|
Page 33
paragraph below time command run |
"If you try time --verbose quick-and dirty get an error ..."
seems to be missing "and" before "get"
Note from the Author or Editor: The text currently reads "...If you try `time --verbose` quick-and-direct get an error..." and it should read "...If you try `time --verbose` and you get an error...".
|
Gregory Sherman |
Apr 30, 2021 |
Jan 27, 2023 |
|
Page 37-39
cProfile command and results |
$ python -m cProfile -o profile.stats julia1.py
The resulting information obtained in IPython is from profiling a different program - julia1_nopil.py (Both are present in the downloaded code for Chapter 2)
Also, there is a draw_output argument to calc_pure_python() in julia1_nopil.py, but it is not used in the function.
Note from the Author or Editor: Page 37 the code line "$ python -m cProfile -o profile.stats julia1.py" should now read
"$ python -m cProfile -o profile.stats julia1_nopil.py"
I have modified the julia1_nopil.py code in the public github repository to remove the redundant keyword, so the rest of the printed code samples will work as expected.
|
Gregory Sherman |
Apr 30, 2021 |
Jan 27, 2023 |
|
Page 74
Example 3-5 |
M = (N >> 3) + (3 if N < 9 else 6)
N 0 1-4 5-8 ...
M 0 4 8 ...
=================================================
I believe that both the equation and N to M table are incorrect.
The equation and M values resemble what I found at bit.ly/3xIEJ2L
The equation in the text would produce nonsensical results in some cases, such as M=3 if N=4.
Likewise, the table sometimes has M equal to N, when it is clearly stated on pg. 73 that "M > N"
Note from the Author or Editor: Good eye! The equation should read:
M = N + (N >> 3) + (3 if N < 9 else 6)
The way it is now, M is simply the amount of over-allocation, not the total re-allocated size.
You can see this equation in action here:
https://github.com/python/cpython/blob/3.7/Objects/listobject.c#L59
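As a rough check of the corrected equation, the sketch below evaluates it in Python and replays the capacities a list would step through as items are appended one at a time. The names `overallocated_size` and `growth_steps` are illustrative (they come from neither the book nor CPython), and the sketch assumes a resize happens only when the requested length exceeds the current capacity.

```python
def overallocated_size(n):
    """Total slots M allocated when a list must grow to hold n items."""
    return n + (n >> 3) + (3 if n < 9 else 6)


def growth_steps(max_len):
    """Capacities reached while appending items one by one up to max_len."""
    capacity = 0
    steps = []
    for n in range(1, max_len + 1):
        if n > capacity:  # assumed: resize only when capacity is exhausted
            capacity = overallocated_size(n)
            steps.append(capacity)
    return steps


print(growth_steps(50))
```

With the corrected equation, M is always greater than N, which is the property the submitter points out on page 73.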
|
Gregory Sherman |
May 02, 2021 |
Jan 27, 2023 |
|
Page 75
End of page and the beginning of the next page |
The example on the page appends i * 2. It should be i * i. As printed, the result on the next page contradicts the conclusion drawn there.
Note from the Author or Editor: Good catch. For consistency, Example 3-7 should use "i * 2" everywhere instead of "i*i". This choice is arbitrary and could have gone the other way; however, if we let i get very big, i*i has a chance to overflow!
|
Fu Chen |
May 18, 2022 |
Jan 27, 2023 |
|
Page 77
sentence above Example 3-8 |
"... instantiating a list can be 5.1x slower than instantiating a tuple ..."
In the example, 95 ns / 12.5 ns = 7.6
Also worth noting is that on my Windows 10 PC, the factor varies widely between tests
from about 2.7 to 3.9
Note from the Author or Editor: Good eye! Indeed the text should say "7.6x slower".
The exact number, however, should be taken with a grain of salt given that we are doing micro-profiling. It does, however, give the general sense of what to expect for larger lists/tuples.
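To see how much the ratio moves between machines and runs, a measurement along these lines can be repeated locally. This is a minimal sketch using the standard `timeit` module; the helper name `time_per_call_ns` and the two ten-element statements are assumptions chosen for illustration and may not match Example 3-8 exactly.

```python
import timeit


def time_per_call_ns(stmt, number=200_000, repeat=5):
    """Best-of-`repeat` time for one execution of `stmt`, in nanoseconds."""
    best = min(timeit.repeat(stmt, number=number, repeat=repeat))
    return best / number * 1e9


# Assumed statements: a ten-element list literal versus a ten-element tuple literal.
list_ns = time_per_call_ns("l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]")
tuple_ns = time_per_call_ns("t = (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)")

print(f"list: {list_ns:.1f} ns, tuple: {tuple_ns:.1f} ns, "
      f"ratio: {list_ns / tuple_ns:.1f}x")
```

Taking the minimum over several repeats reduces noise from other processes, but the ratio will still differ across interpreters and operating systems.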
|
Gregory Sherman |
May 02, 2021 |
Jan 27, 2023 |
|
Page 97
Bottom of page |
Misleading statement: when you explain where generators come into play and how they avoid the creation of all elements at once, the reader might take away that range is a generator, which is not the case (see e.g. question 13092267 on stackoverflow, since I can't post URLs here). You might want to talk about laziness instead.
Note from the Author or Editor: This is a subtlety that we will address in the next edition. We suggest no changes to this edition, as the truth is close enough (for most people) and a longer explanation is needed for fine clarification.
|
Anthony Labarre |
Jun 29, 2023 |
|
|
Page 102
code at top of page |
The first Fibonacci number - according to the modern definition - is zero,
so "yield j" should be replaced with "yield i"
Note from the Author or Editor: Good eye! Indeed, "yield j" should be "yield i" in the code snippet spanning page 101-102.
|
Gregory Sherman |
May 04, 2021 |
Jan 27, 2023 |
|
Page 102
code in middle of page |
'... answer the question "How many Fibonacci numbers below 5,000 are odd?" in multiple ways:'
fibonacci_naive() generates 1 as the first Fibonacci number and does not strictly test for "below", so should be:
while i < 5000:
    if i % 2:
fibonacci_transform() also has the problem with "below", so should be:
if f >= 5000:
as does fibonacci_succinct(); the fix is:
first_5000 = takewhile(lambda x: x < 5000, fibonacci())
Note from the Author or Editor: Yes, your changes do indeed fix the various off-by-one mistakes in this snippet.
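Put together, the corrections might look like the sketch below. It assumes a generator that starts at zero (the "yield i" fix from the previous erratum) and uses the strict "below" test; `count_odd_below` is a hypothetical helper name, not one of the book's functions.

```python
from itertools import takewhile


def fibonacci():
    """Yield Fibonacci numbers starting from 0."""
    i, j = 0, 1
    while True:
        yield i
        i, j = j, i + j


def count_odd_below(limit):
    """Count odd Fibonacci numbers strictly below `limit`."""
    below = takewhile(lambda x: x < limit, fibonacci())
    return sum(1 for x in below if x % 2)


print(count_odd_below(5000))
```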
|
Gregory Sherman |
May 04, 2021 |
Jan 27, 2023 |
|
Page 104
Example 5-2, definition of function `read_fake_data` |
Given the description of Example 5-4, it seems like the function `read_fake_data` in Example 5-2 should contain a "!=" rather than a "==" in its definition. Specifically, in Example 5-4 the anomaly detection is set up to find values that don't fit a normal distribution, but as written `read_fake_data` primarily produces data with the constant value 100, which does not fit a normal distribution.
I think the function definition should read:
def read_fake_data(filename):
    for timestamp in count():
        # We insert an anomalous data point approximately once a week
        if randint(0, 7*60*60*24 - 1) != 1:
            value = normalvariate(0, 1)
        else:
            value = 100
        yield datetime.fromtimestamp(timestamp), value
Note from the Author or Editor: Good catch! Indeed, the "==" should be a "!=" because we want the anomalous value of 100 to be infrequent. In fact, to make this more readable I think a better if statement would be:
if randint(0, 7 * 60 * 60 * 24 - 1) == 1:
    value = 100
else:
    value = normalvariate(0, 1)
|
Anonymous |
Oct 01, 2020 |
Jan 27, 2023 |
|
Page 106
Middle |
"continue retrieving anomalous data This is called"
Missing period.
|
Anonymous |
Aug 28, 2020 |
Jan 27, 2023 |
|
Page 106
Lazy Generator Evaluation |
Please update the example as shown in your github example here: github.com/mynameisfiber/high_performance_python_2e/blob/master/05_iterators/lazy_data_analysis.py
Because:
- data should be created using read_fake_data not read_data
Thank you
Note from the Author or Editor: Good eye. Example 5-5 should read:
data = read_fake_data("fake_filename")
instead of:
data = read_data(filename)
|
Ali |
Jan 24, 2021 |
Jan 27, 2023 |
|
Page 131
3rd paragraph |
"We will continue on the track of removing necessary functionality in favor of performance ..."
I believe it means to say "unnecessary" instead of "necessary," especially given the larger context. Thanks! :)
Note from the Author or Editor: Agreed, please replace "We will continue on the track of removing necessary functionality..." with "We will continue on the track of removing unnecessary functionality..."
|
Alex Dvorak |
Dec 07, 2022 |
Jan 27, 2023 |
|
Page 227
main function located with "5" circle |
for this function to work the correction should be:
result = asyncio.run(run_func())
current function is:
result = asyncio.run(run_func)
this returns the error: "run_func is not a coroutine"
Note from the Author or Editor: This is 100% correct. Good catch!
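The difference between the two calls can be reproduced with a small stand-in coroutine. In this sketch `run_func` is a placeholder that just returns a value, not the book's function; the exception type and message raised for the incorrect call may vary between Python versions, so both `TypeError` and `ValueError` are caught.

```python
import asyncio


async def run_func():
    """Placeholder coroutine standing in for the book's run_func."""
    await asyncio.sleep(0)
    return 42


# Correct: calling run_func() creates the coroutine object that asyncio.run expects.
result = asyncio.run(run_func())

# Incorrect: passing the function itself, without calling it, is rejected.
try:
    asyncio.run(run_func)
except (TypeError, ValueError) as exc:
    error = exc
else:
    error = None

print(result, type(error).__name__)
```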
|
Harry Ritchie |
Jan 15, 2021 |
Jan 27, 2023 |
|
Page 254
1st paragraph |
We create a list containing nbr_estimates divided by the number of workers.
nbr_estimates -> nbr_samples_in_total ?
Note from the Author or Editor: The text currently reads
"We create a list containing `nbr_estimates` divided by the number of workers."
and it should read:
"We create a list containing `nbr_samples_in_total` divided by the number of workers."
|
Evan Lai |
Aug 26, 2020 |
Jan 27, 2023 |
|
Page 265
Figure 9-8 and text below |
"Using processes ... A second CPU doubles the speed, and using four CPUs quadruples the speed."
==============================================
This conflicts with the graph above, which shows times of about 2.5 for 1 worker, 1.6 for 2, and 0.9 for 4.
Note from the Author or Editor: The text currently reads "Using processes gives us a predictable speedup, just as it did in the pure Python example. A second CPU doubles the speed, and using four CPUs quadruples the speed." and should be replaced with:
"Using processes gives us a predictable speedup, just as it did in the pure Python example. A second CPU nearly doubles the speed, and when using four CPUs the speed is nearly quadrupled. We rarely achieve a pure doubling or quadrupling of execution speeds due to other overheads."
|
Gregory Sherman |
Aug 02, 2021 |
Jan 27, 2023 |
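For the Figure 9-8 erratum above, the speedup implied by the submitter's approximate graph readings can be worked out directly. The sketch below is only illustrative arithmetic: the times are the rough values quoted in the report (about 2.5 for 1 worker, 1.6 for 2, and 0.9 for 4), not measured data, and `speedups` is a hypothetical helper.

```python
def speedups(times_by_workers):
    """Speedup of each configuration relative to the single-worker time."""
    baseline = times_by_workers[1]
    return {n: baseline / t for n, t in times_by_workers.items()}


# Approximate readings quoted in the erratum report, not exact measurements.
approx_times = {1: 2.5, 2: 1.6, 4: 0.9}

for workers, factor in speedups(approx_times).items():
    print(f"{workers} worker(s): {factor:.2f}x speedup, "
          f"{factor / workers:.0%} parallel efficiency")
```

Both ratios fall short of the worker count, which is consistent with the replacement text's point that overheads prevent a pure doubling or quadrupling.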