Real-World Natural Language Processing

Book description

In Real-world Natural Language Processing you will learn how to:

  • Design, develop, and deploy useful NLP applications
  • Create named entity taggers
  • Build machine translation systems
  • Construct language generation systems and chatbots
  • Use advanced NLP concepts such as attention and transfer learning

Real-world Natural Language Processing teaches you how to create practical NLP applications without getting bogged down in complex language theory and the mathematics of deep learning. In this engaging book, you’ll explore the core tools and techniques required to build a huge range of powerful NLP apps, including chatbots, language detectors, and text classifiers.

About the Technology
Training computers to interpret and generate speech and text is a monumental challenge, and the payoff for reducing labor and improving human/computer interaction is huge! The field of Natural Language Processing (NLP) is advancing rapidly, with countless new tools and practices. This unique book offers an innovative collection of NLP techniques with applications in machine translation, voice assistants, text generation, and more.

About the Book
Real-world Natural Language Processing shows you how to build the practical NLP applications that are transforming the way humans and computers work together. Guided by clear explanations of each core NLP topic, you’ll create many interesting applications including a sentiment analyzer and a chatbot. Along the way, you’ll use Python and open source libraries like AllenNLP and HuggingFace Transformers to speed up your development process.
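
For a flavor of the library-driven workflow the book describes, the short sketch below scores the sentiment of a sentence with the HuggingFace Transformers pipeline API. The snippet is not taken from the book: it assumes the transformers package (with a PyTorch backend) is installed and relies on the library's default pretrained sentiment model.

    # Minimal sketch (not from the book): sentiment analysis with the
    # HuggingFace Transformers pipeline API. Assumes transformers and torch
    # are installed; a default pretrained model is downloaded on first use.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")
    print(classifier("This book makes NLP approachable and practical."))
    # e.g. [{'label': 'POSITIVE', 'score': 0.9998}]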

What's Inside
  • Design, develop, and deploy useful NLP applications
  • Create named entity taggers
  • Build machine translation systems
  • Construct language generation systems and chatbots

About the Reader
For Python programmers. No prior machine learning knowledge assumed.

About the Author
Masato Hagiwara received his computer science PhD from Nagoya University in 2009. He has interned at Google and Microsoft Research, and worked at Duolingo as a Senior Machine Learning Engineer. He now runs his own research and consulting company.

Quotes
The definitive reference for those of us trying to understand NLP and its applications.
- Richard Vaughan, Purple Monkey Collective

Very practical book about NLP and how to use it successfully in real-world applications.
- Salvatore Campagna, King

If you need to step up your game but were turned off by a difficult learning curve, then this book is for you!
- Alain Lompo, ISO-Gruppe

An excellent and approachable first step in learning NLP. Well written and easy to follow.
- Marc-Anthony Taylor, Blackshark.ai

Table of contents

  1. Real-World Natural Language Processing
  2. inside front cover
  3. Copyright
  4. dedication
  5. contents
  6. front matter
    1. preface
    2. acknowledgments
    3. about this book
      1. Who should read this book
      2. How this book is organized: A roadmap
      3. About the code
      4. liveBook discussion forum
      5. Other online resources
    4. about the author
    5. about the cover illustration
  7. Part 1 Basics
  8. 1 Introduction to natural language processing
    1. 1.1 What is natural language processing (NLP)?
      1. 1.1.1 What is NLP?
      2. 1.1.2 What is not NLP?
      3. 1.1.3 AI, ML, DL, and NLP
      4. 1.1.4 Why NLP?
    2. 1.2 How NLP is used
      1. 1.2.1 NLP applications
      2. 1.2.2 NLP tasks
    3. 1.3 Building NLP applications
      1. 1.3.1 Development of NLP applications
      2. 1.3.2 Structure of NLP applications
    4. Summary
  9. 2 Your first NLP application
    1. 2.1 Introducing sentiment analysis
    2. 2.2 Working with NLP datasets
      1. 2.2.1 What is a dataset?
      2. 2.2.2 Stanford Sentiment Treebank
      3. 2.2.3 Train, validation, and test sets
      4. 2.2.4 Loading SST datasets using AllenNLP
    3. 2.3 Using word embeddings
      1. 2.3.1 What are word embeddings?
      2. 2.3.2 Using word embeddings for sentiment analysis
    4. 2.4 Neural networks
      1. 2.4.1 What are neural networks?
      2. 2.4.2 Recurrent neural networks (RNNs) and linear layers
      3. 2.4.3 Architecture for sentiment analysis
    5. 2.5 Loss functions and optimization
    6. 2.6 Training your own classifier
      1. 2.6.1 Batching
      2. 2.6.2 Putting everything together
    7. 2.7 Evaluating your classifier
    8. 2.8 Deploying your application
      1. 2.8.1 Making predictions
      2. 2.8.2 Serving predictions
    9. Summary
  10. 3 Word and document embeddings
    1. 3.1 Introducing embeddings
      1. 3.1.1 What are embeddings?
      2. 3.1.2 Why are embeddings important?
    2. 3.2 Building blocks of language: Characters, words, and phrases
      1. 3.2.1 Characters
      2. 3.2.2 Words, tokens, morphemes, and phrases
      3. 3.2.3 N-grams
    3. 3.3 Tokenization, stemming, and lemmatization
      1. 3.3.1 Tokenization
      2. 3.3.2 Stemming
      3. 3.3.3 Lemmatization
    4. 3.4 Skip-gram and continuous bag of words (CBOW)
      1. 3.4.1 Where word embeddings come from
      2. 3.4.2 Using word associations
      3. 3.4.3 Linear layers
      4. 3.4.4 Softmax
      5. 3.4.5 Implementing Skip-gram on AllenNLP
      6. 3.4.6 Continuous bag of words (CBOW) model
    5. 3.5 GloVe
      1. 3.5.1 How GloVe learns word embeddings
      2. 3.5.2 Using pretrained GloVe vectors
    6. 3.6 fastText
      1. 3.6.1 Making use of subword information
      2. 3.6.2 Using the fastText toolkit
    7. 3.7 Document-level embeddings
    8. 3.8 Visualizing embeddings
    9. Summary
  11. 4 Sentence classification
    1. 4.1 Recurrent neural networks (RNNs)
      1. 4.1.1 Handling variable-length input
      2. 4.1.2 RNN abstraction
      3. 4.1.3 Simple RNNs and nonlinearity
    2. 4.2 Long short-term memory units (LSTMs) and gated recurrent units (GRUs)
      1. 4.2.1 Vanishing gradients problem
      2. 4.2.2 Long short-term memory (LSTM)
      3. 4.2.3 Gated recurrent units (GRUs)
    3. 4.3 Accuracy, precision, recall, and F-measure
      1. 4.3.1 Accuracy
      2. 4.3.2 Precision and recall
      3. 4.3.3 F-measure
    4. 4.4 Building AllenNLP training pipelines
      1. 4.4.1 Instances and fields
      2. 4.4.2 Vocabulary and token indexers
      3. 4.4.3 Token embedders and RNNs
      4. 4.4.4 Building your own model
      5. 4.4.5 Putting it all together
    5. 4.5 Configuring AllenNLP training pipelines
    6. 4.6 Case study: Language detection
      1. 4.6.1 Using characters as input
      2. 4.6.2 Creating a dataset reader
      3. 4.6.3 Building the training pipeline
      4. 4.6.4 Running the detector on unseen instances
    7. Summary
  12. 5 Sequential labeling and language modeling
    1. 5.1 Introducing sequential labeling
      1. 5.1.1 What is sequential labeling?
      2. 5.1.2 Using RNNs to encode sequences
      3. 5.1.3 Implementing a Seq2Seq encoder in AllenNLP
    2. 5.2 Building a part-of-speech tagger
      1. 5.2.1 Reading a dataset
      2. 5.2.2 Defining the model and the loss
      3. 5.2.3 Building the training pipeline
    3. 5.3 Multilayer and bidirectional RNNs
      1. 5.3.1 Multilayer RNNs
      2. 5.3.2 Bidirectional RNNs
    4. 5.4 Named entity recognition
      1. 5.4.1 What is named entity recognition?
      2. 5.4.2 Tagging spans
      3. 5.4.3 Implementing a named entity recognizer
    5. 5.5 Modeling a language
      1. 5.5.1 What is a language model?
      2. 5.5.2 Why are language models useful?
      3. 5.5.3 Training an RNN language model
    6. 5.6 Text generation using RNNs
      1. 5.6.1 Feeding characters to an RNN
      2. 5.6.2 Evaluating text using a language model
      3. 5.6.3 Generating text using a language model
    7. Summary
  13. Part 2 Advanced models
  14. 6 Sequence-to-sequence models
    1. 6.1 Introducing sequence-to-sequence models
    2. 6.2 Machine translation 101
    3. 6.3 Building your first translator
      1. 6.3.1 Preparing the datasets
      2. 6.3.2 Training the model
      3. 6.3.3 Running the translator
    4. 6.4 How Seq2Seq models work
      1. 6.4.1 Encoder
      2. 6.4.2 Decoder
      3. 6.4.3 Greedy decoding
      4. 6.4.4 Beam search decoding
    5. 6.5 Evaluating translation systems
      1. 6.5.1 Human evaluation
      2. 6.5.2 Automatic evaluation
    6. 6.6 Case study: Building a chatbot
      1. 6.6.1 Introducing dialogue systems
      2. 6.6.2 Preparing a dataset
      3. 6.6.3 Training and running a chatbot
      4. 6.6.4 Next steps
    7. Summary
  15. 7 Convolutional neural networks
    1. 7.1 Introducing convolutional neural networks (CNNs)
      1. 7.1.1 RNNs and their shortcomings
      2. 7.1.2 Pattern matching for sentence classification
      3. 7.1.3 Convolutional neural networks (CNNs)
    2. 7.2 Convolutional layers
      1. 7.2.1 Pattern matching using filters
      2. 7.2.2 Rectified linear unit (ReLU)
      3. 7.2.3 Combining scores
    3. 7.3 Pooling layers
    4. 7.4 Case study: Text classification
      1. 7.4.1 Review: Text classification
      2. 7.4.2 Using CnnEncoder
      3. 7.4.3 Training and running the classifier
    5. Summary
  16. 8 Attention and Transformer
    1. 8.1 What is attention?
      1. 8.1.1 Limitation of vanilla Seq2Seq models
      2. 8.1.2 Attention mechanism
    2. 8.2 Sequence-to-sequence with attention
      1. 8.2.1 Encoder-decoder attention
      2. 8.2.2 Building a Seq2Seq machine translation with attention
    3. 8.3 Transformer and self-attention
      1. 8.3.1 Self-attention
      2. 8.3.2 Transformer
      3. 8.3.3 Experiments
    4. 8.4 Transformer-based language models
      1. 8.4.1 Transformer as a language model
      2. 8.4.2 Transformer-XL
      3. 8.4.3 GPT-2
      4. 8.4.4 XLM
    5. 8.5 Case study: Spell-checker
      1. 8.5.1 Spell correction as machine translation
      2. 8.5.2 Training a spell-checker
      3. 8.5.3 Improving a spell-checker
    6. Summary
  17. 9 Transfer learning with pretrained language models
    1. 9.1 Transfer learning
      1. 9.1.1 Traditional machine learning
      2. 9.1.2 Word embeddings
      3. 9.1.3 What is transfer learning?
    2. 9.2 BERT
      1. 9.2.1 Limitations of word embeddings
      2. 9.2.2 Self-supervised learning
      3. 9.2.3 Pretraining BERT
      4. 9.2.4 Adapting BERT
    3. 9.3 Case study 1: Sentiment analysis with BERT
      1. 9.3.1 Tokenizing input
      2. 9.3.2 Building the model
      3. 9.3.3 Training the model
    4. 9.4 Other pretrained language models
      1. 9.4.1 ELMo
      2. 9.4.2 XLNet
      3. 9.4.3 RoBERTa
      4. 9.4.4 DistilBERT
      5. 9.4.5 ALBERT
    5. 9.5 Case study 2: Natural language inference with BERT
      1. 9.5.1 What is natural language inference?
      2. 9.5.2 Using BERT for sentence-pair classification
      3. 9.5.3 Using Transformers with AllenNLP
    6. Summary
  18. Part 3 Putting into production
  19. 10 Best practices in developing NLP applications
    1. 10.1 Batching instances
      1. 10.1.1 Padding
      2. 10.1.2 Sorting
      3. 10.1.3 Masking
    2. 10.2 Tokenization for neural models
      1. 10.2.1 Unknown words
      2. 10.2.2 Character models
      3. 10.2.3 Subword models
    3. 10.3 Avoiding overfitting
      1. 10.3.1 Regularization
      2. 10.3.2 Early stopping
      3. 10.3.3 Cross-validation
    4. 10.4 Dealing with imbalanced datasets
      1. 10.4.1 Using appropriate evaluation metrics
      2. 10.4.2 Upsampling and downsampling
      3. 10.4.3 Weighting losses
    5. 10.5 Hyperparameter tuning
      1. 10.5.1 Examples of hyperparameters
      2. 10.5.2 Grid search vs. random search
      3. 10.5.3 Hyperparameter tuning with Optuna
    6. Summary
  20. 11 Deploying and serving NLP applications
    1. 11.1 Architecting your NLP application
      1. 11.1.1 Before machine learning
      2. 11.1.2 Choosing the right architecture
      3. 11.1.3 Project structure
      4. 11.1.4 Version control
    2. 11.2 Deploying your NLP model
      1. 11.2.1 Testing
      2. 11.2.2 Train-serve skew
      3. 11.2.3 Monitoring
      4. 11.2.4 Using GPUs
    3. 11.3 Case study: Serving and deploying NLP applications
      1. 11.3.1 Serving models with TorchServe
      2. 11.3.2 Deploying models with SageMaker
    4. 11.4 Interpreting and visualizing model predictions
    5. 11.5 Where to go from here
    6. Summary
  21. index

Product information

  • Title: Real-World Natural Language Processing
  • Author(s): Masato Hagiwara
  • Release date: November 2021
  • Publisher(s): Manning Publications
  • ISBN: 9781617296420