The errata list is a list of errors and their corrections that were found after the product was released. If the error was corrected in a later version or reprint, the date of the correction is displayed in the column titled "Date corrected".
The following errata were submitted by our customers and approved as valid errors by the author or editor.
Version | Location | Description | Submitted By | Date submitted | Date corrected |
PDF |
Page 49
last paragraph |
"However, if our code was in a subfolder called data,"
Replace the word 'code' with 'data'.
Note from the Author or Editor: should read
However, if our code was in a subfolder called code
instead of
"However, if our code was in a subfolder called data,"
|
Ron B |
Feb 17, 2016 |
Jan 27, 2017 |
Printed |
Page 76
2nd paragraph |
Paragraph states:
"From this folder, type the following command in your terminal to run the script from the command line:
python parse_script.py"
I think it is meant to say:
"python parse_excel.py"
since that is what you called the new python file in the prior paragraph (step # 2):
"2. Create a new Python file called parse_excel.py and put in the folder you created. "
Note from the Author or Editor: Yes, it should say:
python parse_excel.py
|
Bryan P |
Mar 09, 2016 |
Jan 27, 2017 |
PDF |
Page 94
2nd paragraph |
"This code prints the first two lines of the file"
Replace 'lines' with 'pages'
Note from the Author or Editor: This code prints the first two lines of the file.
Please update to:
This code prints the first two pages of the file.
|
Ron B |
Feb 17, 2016 |
Jan 27, 2017 |
Printed, PDF |
Page 94
Last paragraph |
Missing sudo command in code:
pip install --upgrade -- ignoreinstalled slate==0.3 pdfminer==20110515
should read (at least for my Mac):
sudo pip install --upgrade -- ignoreinstalled slate==0.3 pdfminer==20110515
Note from the Author or Editor: If you are using a virtual environment, you can simply type:
pip install --upgrade --ignore-installed slate==0.3 pdfminer==20110515
Otherwise, use:
sudo pip install --upgrade --ignore-installed slate==0.3 pdfminer==20110515
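Whether sudo is needed depends on whether a virtual environment is active. As a rough sketch (not from the book), the check below uses a common idiom: older virtualenv releases set sys.real_prefix, while venv and newer virtualenv releases leave the original location in sys.base_prefix. The function name in_virtual_environment is made up for this illustration.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix on the interpreter.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases keep the original prefix in
    # sys.base_prefix; fall back to sys.prefix where the attribute is missing.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return base_prefix != sys.prefix


if in_virtual_environment():
    print("Virtual environment detected: plain pip should be enough.")
else:
    print("No virtual environment detected: a system-wide install may need sudo.")
```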
|
zenzontle |
Mar 12, 2016 |
|
PDF |
Page 94
Warning Text Box |
In Warning box text, the pip install option "--ignoreinstalled" should be "--ignore-installed".
Windows 7
Python version 2.7.11
pip 8.1.1
Note from the Author or Editor: Yes, it should be
pip install .... --ignore-installed
|
Anonymous |
Mar 29, 2016 |
Jan 27, 2017 |
Printed |
Page 94
Code line no. 3 |
After installing slate and pdfminer with the method recommended for the ImportError, it is still not possible to process the PDF in the sample zip.
The error message is: PDFSyntaxError: No /Root object! - Is this really a PDF?
This was solved at https://stackoverflow.com/questions/11384591/parsing-a-pdf-with-no-root-object-using-pdfminer/11438571 in the last solution on the page: the open statement should use the mode 'rb'.
So code line 3 should read:
with open(pdf, 'rb') as f:
Windows 7
python 2.7.11
slate 0.3
pdfminer 20110515
PDF Book
February 2016: First Edition
Revision History for the First Edition
2016-02-02 First Release
Note from the Author or Editor: Unsure if this is a Windows-only issue, but regardless, opening as 'rb' should be standard practice, so let's change it:
with open(pdf, 'rb') as f:
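To illustrate why binary mode matters: a PDF is binary data, and 'rb' returns its bytes exactly as stored, with no newline translation or text decoding. The sketch below is only an illustration. It writes a few made-up bytes that resemble the start of a PDF to a temporary file (a stand-in, not a valid PDF and not the book's sample file) and reads them back unchanged.

```python
import os
import tempfile

# Hypothetical stand-in for a PDF: a header line followed by non-text bytes.
sample_bytes = b"%PDF-1.4\r\n%\xe2\xe3\xcf\xd3\r\n"

# Write the bytes to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    tmp.write(sample_bytes)
    pdf = tmp.name

try:
    # 'rb' reads raw bytes; nothing is decoded or translated.
    with open(pdf, 'rb') as f:
        data = f.read()
finally:
    os.remove(pdf)

# The bytes read back match what was written, including the \r\n pairs.
print(data == sample_bytes)
```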
|
Anonymous |
Mar 29, 2016 |
Jan 27, 2017 |
ePub |
Page 116
last paragraph |
Part of the paragraph reads:
"...can be done simply by running pip install pdftables and pip requests install."
Should read
"can be done simply by running pip install pdftables and pip install requests."
Note from the Author or Editor: As noted, please change: pip requests install
to
pip install requests
|
Deb R.H. |
May 08, 2016 |
Jan 27, 2017 |
Printed |
Page 301
2nd paragraph |
The URL https://enoughproject.org/take-action brings up https://enoughproject.org/get-involved/take-action, which appears to have a different HTML structure, so some of the code on the following pages returns errors. For instance, on p. 301 the code in paragraphs 15 and 16 throws AttributeError: object has no attribute 'descendants'
Note from the Author or Editor: Hi there,
This is indeed the case! You can find old versions of the website pages and code in the code repository (https://github.com/jackiekazil/data-wrangling). That should allow you to follow along with the book using old copies of the page. Unfortunately, as you probably know from reading the chapter, the web is constantly changing, which means scraping content is a never-ending job. Hope this helps!
-katharine
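An AttributeError of this kind usually means a lookup found nothing, so the code went on to use None as if it were an element. The sketch below shows one defensive pattern. It uses the standard library's xml.etree.ElementTree instead of the parser used in the book, and the two markup strings and the content_texts helper are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Invented markup standing in for an old and a restructured version of a page.
OLD_PAGE = "<html><body><div class='content'><p>Take action</p></div></body></html>"
NEW_PAGE = "<html><body><section><p>Get involved</p></section></body></html>"


def content_texts(markup):
    """Return the text of every element under the 'content' div, if present."""
    root = ET.fromstring(markup)
    # find() returns None when nothing matches, which is what happens
    # once the page structure changes.
    content = root.find(".//div[@class='content']")
    if content is None:
        return []
    # iter() walks the element and all of its descendants.
    return [el.text for el in content.iter() if el.text]


print(content_texts(OLD_PAGE))
print(content_texts(NEW_PAGE))
```

Checking for None before walking descendants turns a crash into an empty result that the calling code can report or handle.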
|
John Roby |
Sep 06, 2017 |
|
Printed |
Page 399
second code snippet, second line of code, footnoted as '1' |
The line
`from emojispider.items import EmojispiderItem`
should read
`from scrapyspider.items import EmojiSpiderItem`,
as per the GitHub example: https://github.com/jackiekazil/data-wrangling/blob/master/code/chp12-scraping/scrapyspider/scrapyspider/spiders/emo_spider.py
Note from the Author or Editor: This is correct. The line
`from emojispider.items import EmojispiderItem`
should read
`from scrapyspider.items import EmojiSpiderItem`
|
Anonymous |
Jun 30, 2016 |
Jan 27, 2017 |
PDF |
Page 439
5th paragraph |
The output of the GCC compilers is machine code, NOT byte code. GCC does
not need to be installed to use the CPython interpreter or PyPy JIT to turn
Python code into bytecode or machine code, respectively.
GCC would be needed to compile Cython code to machine code.
Cython is not used anywhere in this book.
Note from the Author or Editor: Please update this sentence:
"The purpose of GCC (the GNU Compiler Collection) is to take code written in
Python and turn it into something your machine can understand—byte code."
to the following
The purpose of GCC (the GNU Compiler Collection) is to take Python libraries with C extensions and turn them into something your machine can understand and execute.
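As a small illustration of the submitter's point, the CPython interpreter compiles Python source to bytecode on its own, with no external compiler involved. The sketch below assumes Python 3 running on CPython. The source string and variable names are made up for the example.

```python
# A one-line program, compiled by the interpreter itself.
source = "total = 1 + 2"
code_obj = compile(source, "<example>", "exec")

# co_code holds the raw bytecode the interpreter will execute.
print(type(code_obj.co_code))

# Run the compiled code object in a fresh namespace.
namespace = {}
exec(code_obj, namespace)
print(namespace["total"])
```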
|
Ron B |
Mar 02, 2016 |
Jan 27, 2017 |
PDF |
Page 445
6th paragraph |
Based on the instructions given above, ~/Projects has a single sub-directory called data_wrangling. It contains the 'code' subfolder, while 'envs' is in /home/_user's_name_.
Note from the Author or Editor: Please update the sentence:
At this point, if we look at the contents of our Projects folder, we should have two
empty subfolders called code and envs.
To read:
At this point, we have our code folder set up in a special file inside our Projects folder and our virtual environment folder properly set up in our home directory.
|
Ron B |
Mar 02, 2016 |
Jan 27, 2017 |