Errata

Errata for Natural Language Processing with Spark NLP

The errata list records errors, and their corrections, that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Location	Description	Submitted by	Date submitted
chapter 4 Tokenizer	I think the tokenizer needs to be fit before transform can be run on it. The spark-nlp team has confirmed this; see https://github.com/JohnSnowLabs/spark-nlp/issues/881	Gourav	Apr 29, 2020
1 Bullet Point: Object Character Recognition (OCR)	I believe OCR stands for Optical Character Recognition. Even a Google search for "Object Character Recognition" returns only results for Optical Character Recognition. https://www.google.com/search?q=ocr+object+character+recognition	Alex Birch	Sep 14, 2019
1 Checking out code	In the Checking out code section, the link provided is https://github.com/alexander-n-thomas/spark-nlp-book.git which returns a 404 error.	Asra Yousuf	Apr 25, 2020
1 Checking Out the Code	Git repo https://github.com/alexander-n-thomas/spark-nlp-book.git returns 404	SUNIL KUMAR CHAKRAPANI	Jul 20, 2020
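
The chapter 4 Tokenizer report above can be illustrated with a minimal sketch. It is not the book's code. It assumes a Spark NLP release in which Tokenizer is an estimator whose fit() returns a TokenizerModel, as the linked issue describes. The function name tokenize_texts and the column names text, document, and token are hypothetical choices made for this example.

```python
def tokenize_texts(texts):
    """Return one list of token strings per input text, using Spark NLP.

    Hypothetical helper for illustration only; not taken from the book.
    """
    # Imported inside the function because starting Spark NLP is heavyweight.
    import sparknlp
    from sparknlp.base import DocumentAssembler
    from sparknlp.annotator import Tokenizer

    spark = sparknlp.start()
    df = spark.createDataFrame([(t,) for t in texts], ["text"])

    # DocumentAssembler is a plain transformer, so transform() works directly.
    assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")
    docs = assembler.transform(df)

    # Assumption: Tokenizer is an estimator here, so it is fit first and
    # transform() is called on the fitted model rather than on the Tokenizer.
    tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
    tokenizer_model = tokenizer.fit(docs)
    tokens = tokenizer_model.transform(docs)

    # "token.result" selects the token strings from each annotation.
    rows = tokens.select("token.result").collect()
    return [row["result"] for row in rows]


if __name__ == "__main__":
    print(tokenize_texts(["Spark NLP tokenizes this sentence."]))
```

Wrapping both stages in a pyspark.ml Pipeline and calling fit() on the pipeline would have the same effect, since the pipeline fits each estimator stage in turn.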