Book description
Master text-taming techniques and build effective text-processing applications with R
About This Book
Develop the skills you need to build text-mining apps in R with this easy-to-follow guide
Gain in-depth understanding of the text mining process with lucid implementation in the R language
Example-rich guide that lets you gain high-quality information from text data
Who This Book Is For
If you are an R programmer, analyst, or data scientist who wants to gain experience in performing text data mining and analytics with R, then this book is for you. Exposure to working with statistical methods and language processing would be helpful.
What You Will Learn
Get acquainted with some of the highly efficient R packages, such as openNLP and RWeka, to perform various steps in the text mining process
Access and manipulate data from different sources such as JSON and HTTP
Process text using regular expressions
Get to know the different approaches to tagging text, such as POS tagging, to get started with text analysis
Explore different dimensionality reduction techniques, such as Principal Component Analysis (PCA), and understand their implementation in R
Discover the underlying themes or topics that are present in an unstructured collection of documents, using common topic models such as Latent Dirichlet Allocation (LDA)
Build a baseline sentence-completion application
Perform entity extraction and named entity recognition using R
In Detail
Text mining (also called text data mining or text analytics) is the process of extracting useful, high-quality information from text by identifying patterns and trends. R provides an extensive ecosystem for mining text through its many frameworks and packages.
Starting with the basic statistical concepts used in text mining, this book will teach you how to access, cleanse, and process text using the R language. It will equip you with the tools and knowledge needed for different tagging, chunking, and entailment approaches and their use in natural language processing. Moving on, this book will teach you different dimensionality reduction techniques and their implementation in R. Next, we will cover pattern recognition in text data using classification mechanisms, perform entity recognition, and develop an ontology learning framework.
By the end of the book, you will have developed a practical application from the concepts learned and will understand how text mining can be leveraged to analyze the vast amount of data available on social media.
Style and approach
This book takes a hands-on, example-driven approach to the text mining process with lucid implementation in R.
Table of contents
- Mastering Text Mining with R
- Table of Contents
- Mastering Text Mining with R
- Credits
- About the Authors
- About the Reviewers
- www.PacktPub.com
- Customer Feedback
- Preface
- 1. Statistical Linguistics with R
- Probability theory and basic statistics
- Probability space and event
- Theorem of compound probabilities
- Conditional probability
- Bayes' formula for conditional probability
- Independent events
- Random variables
- Discrete random variables
- Probability frequency function
- Probability distributions using R
- Cumulative distribution function
- Joint distribution
- Binomial distribution
- Poisson distribution
- Counting occurrences
- Zipf's law
- Heaps' law
- Lexical richness
- Language models
- Quantitative methods in linguistics
- R packages for text mining
- Summary
- 2. Processing Text
- 3. Categorizing and Tagging Text
- 4. Dimensionality Reduction
- 5. Text Summarization and Clustering
- 6. Text Classification
- Text classification
- Document representation
- Kernel methods
- Bias–variance trade-off and learning curve
- Learning curve
- Dealing with reducible error components
- Summary
- 7. Entity Recognition
- Index
Product information
- Title: Mastering Text Mining with R
- Author(s):
- Release date: December 2016
- Publisher(s): Packt Publishing
- ISBN: 9781783551811