Errata

Natural Language Processing with Spark NLP

Errata for Natural Language Processing with Spark NLP

The errata list is a list of errors and their corrections that were found after the product was released.

The following errata were submitted by our customers and have not yet been approved or disproved by the author or editor. They solely represent the opinion of the customer.

Version Location Description Submitted by Date submitted
chapter 4
Tokenizer

https://github.com/JohnSnowLabs/spark-nlp/issues/881

I think the Tokenizer needs to be fit before transform can be called on it. I have confirmed this with the spark-nlp team; see the GitHub issue linked above.

Gourav  Apr 29, 2020 
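To illustrate the report above, here is a minimal sketch of the fit-then-transform pattern. It assumes a Spark NLP release in which Tokenizer is an estimator, as described in the linked issue. The function name `tokenize` and the column names `text`, `document`, and `token` are hypothetical choices for this example, not taken from the book.

```python
def tokenize(spark, texts):
    """Tokenize a list of strings, fitting the Tokenizer before transforming.

    `spark` is an active SparkSession with Spark NLP on the classpath.
    The function and column names are illustrative only.
    """
    # Imported here so that defining this function does not itself
    # require Spark NLP to be installed.
    from sparknlp.base import DocumentAssembler
    from sparknlp.annotator import Tokenizer

    # One row per input string, in a single "text" column.
    df = spark.createDataFrame([(t,) for t in texts], ["text"])

    # DocumentAssembler is a plain transformer, so transform() works directly.
    assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")
    docs = assembler.transform(df)

    tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")

    # Tokenizer is an estimator: fit() returns the model that exposes
    # transform(). Calling transform() on the Tokenizer itself is the
    # problem described in the report.
    model = tokenizer.fit(docs)
    return model.transform(docs)
```

With a session created by `sparknlp.start()`, calling `tokenize(spark, ["Spark NLP tokenizes text."])` should return a DataFrame whose `token` column holds the token annotations.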
1
Bullet Point: Object Character Recognition (OCR)

I believe OCR stands for Optical Character Recognition. Even a Google search for Object Character Recognition returns only results for Optical Character Recognition.

https://www.google.com/search?q=ocr+object+character+recognition

Alex Birch  Sep 14, 2019 
1
Checking out code

In the Checking out code section, the link provided is https://github.com/alexander-n-thomas/spark-nlp-book.git which returns a 404 error.

Asra Yousuf  Apr 25, 2020 
1
Checking Out the Code

The Git repo https://github.com/alexander-n-thomas/spark-nlp-book.git returns a 404 error.

SUNIL KUMAR CHAKRAPANI  Jul 20, 2020