Natural Language Processing (NLP) for Everyone

Published by Pearson

Beginner to intermediate

The rise of online social platforms has resulted in an explosion of written text in the form of blogs, posts, tweets, wiki pages, and more. This new wealth of data provides a unique opportunity to explore natural language in its many forms, both as a way of automatically extracting information from written text and as a way of artificially producing text that looks natural.

In this class we introduce attendees to natural language processing from scratch. Each concept is introduced and explained through coding examples using nothing more than plain Python and NumPy. In this way, attendees learn the underlying concepts and techniques in depth instead of just learning how to use a specific NLP library.

What you’ll learn and how you can apply it

Text representation
Topic modeling
Sentiment analysis
Language detection
Text classification
Document clustering

This live event is for you because...

You're a data scientist who is interested in mastering the concepts and ideas behind natural language processing.
You have no previous experience in NLP and want to take the first grounded steps.
You have previous experience using NLP libraries such as NLTK or spaCy and wish to get a greater understanding of what's going on “under the hood.”

Prerequisites

Attendees should understand basic Python.

Course Set-up:

Python - available here: https://www.python.org/
Course GitHub repo - https://github.com/DataForScience/NLP

Recommended Preparation:

(video) Python Programming Language LiveLessons by David Beazley
(video) Modern Python LiveLessons: Big Ideas and Little Code in Python by Raymond Hettinger

Recommended Follow-up:

(video) Natural Language Processing LiveLessons by Bruno Gonçalves
Stay connected with Bruno and up-to-date on the world of data, science, and machine learning at https://data4sci.com/newsletter

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Segment 1 Text Representation (50m)

Represent words and numbers
Use one-hot encoding
Implement bag of words
Apply stopwords
Understand TF-IDF
Understand stemming
Break 10m
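
To give a flavor of the Segment 1 topics, here is a minimal sketch of a bag-of-words representation with stopword removal and TF-IDF weighting in plain Python. The toy corpus, the stopword list, and the helper name `tokenize` are assumptions made for illustration and are not taken from the course materials; the IDF used is the simple log(N / df) variant, one of several in common use.

```python
import math
from collections import Counter

# Toy corpus and stopword list (illustrative assumptions only).
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]
stopwords = {"the", "on", "and", "are"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in stopwords]


tokens = [tokenize(d) for d in docs]

# Vocabulary: every distinct word left after stopword removal, sorted.
vocab = sorted({w for doc in tokens for w in doc})

# Bag of words: one count vector per document, one entry per vocabulary word.
bow = [[Counter(doc)[w] for w in vocab] for doc in tokens]

# Document frequency: in how many documents each word appears.
df = {w: sum(w in doc for doc in tokens) for w in vocab}

# Inverse document frequency, using the plain log(N / df) form.
idf = {w: math.log(len(docs) / df[w]) for w in vocab}

# TF-IDF: raw term counts scaled by IDF, so rarer words weigh more.
tfidf = [[count * idf[w] for count, w in zip(row, vocab)] for row in bow]

for doc, row in zip(docs, tfidf):
    print(doc, [round(x, 2) for x in row])
```

In this toy corpus "sat" appears in two of the three documents, so it receives a lower weight than words such as "cat" that appear in only one.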

Segment 2 Topic Modeling (60m)

Find topics in documents
Perform Explicit Semantic Analysis
Understand document clustering
Implement Latent Semantic Analysis
Implement Non-negative Matrix Factorization
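
As a rough illustration of the Segment 2 topics, Latent Semantic Analysis can be sketched as a truncated singular value decomposition of a term-document matrix using NumPy. The matrix below, the term list, and the helper name `cosine` are hypothetical and chosen only so that two groups of documents are easy to see; this is a sketch under those assumptions, not the course's implementation.

```python
import numpy as np

# Toy term-document count matrix (rows: terms, columns: documents).
# Documents 0 and 1 use programming terms, documents 2 and 3 use sports terms.
terms = ["python", "code", "numpy", "goal", "match", "team"]
X = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
    [0, 0, 1, 1],
], dtype=float)

# Thin SVD: X = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep the k largest singular values to obtain a k-dimensional latent space.
k = 2
doc_topics = (np.diag(s[:k]) @ Vt[:k]).T  # one row per document, shape (4, k)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


print("doc 0 vs doc 1:", round(cosine(doc_topics[0], doc_topics[1]), 3))
print("doc 0 vs doc 2:", round(cosine(doc_topics[0], doc_topics[2]), 3))
```

Documents that share vocabulary end up pointing in the same direction in the reduced space, while documents with no terms in common are close to orthogonal.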

Segment 3 Sentiment Analysis (40m)

Quantify words and feelings
Use negations and modifiers
Understand corpus-based approaches
Break 10m

Segment 4 Applications (70m)

Understand Word2vec word embeddings
Define GloVe
Apply Language detection

Your Instructor

Bruno Gonçalves
Bruno Gonçalves is currently a Head of Data Science working at the intersection of AI, Blockchain Technologies, and Finance. Previously, he was a Data Science Fellow at NYU's Center for Data Science while on leave from a tenured faculty position at Aix-Marseille Université. Since completing his PhD in the Physics of Complex Systems in 2008, he has applied Data Science and Machine Learning to the large-scale study of human behavior. In 2015, he was awarded the Complex Systems Society's Junior Scientific Award for "outstanding contributions in Complex Systems Science," and in 2018 he was named a Science Fellow of the Institute for Scientific Interchange in Turin, Italy.
