Machine Learning, 2nd Edition

Book description

Dig deep into the data with a hands-on guide to machine learning with updated examples and more!

Machine Learning: Hands-On for Developers and Technical Professionals provides hands-on instruction and fully coded working examples for the most common machine learning techniques used by developers and technical professionals. The book contains a breakdown of each ML variant, explaining how it works and how it is used within certain industries, allowing readers to incorporate the presented techniques into their own work as they follow along. A core tenet of machine learning is a strong focus on data preparation, and a full exploration of the various types of learning algorithms illustrates how the proper tools can help any developer extract information and insights from existing data. The book includes a full complement of Instructor's Materials to facilitate use in the classroom, making this resource useful for students and as a professional reference.

At its core, machine learning is a mathematical, algorithm-based technology that forms the basis of historical data mining and modern big data science. Scientific analysis of big data requires a working knowledge of machine learning, which forms predictions based on known properties learned from training data. Machine Learning is an accessible, comprehensive guide for the non-mathematician, providing clear guidance that allows readers to:

  • Learn the languages and tools of machine learning, including Hadoop, Mahout, and Weka
  • Understand decision trees, Bayesian networks, and artificial neural networks
  • Implement Association Rule, Real Time, and Batch learning
  • Develop a strategic plan for safe, effective, and efficient machine learning

By learning to construct a system that can learn from data, readers can increase their utility across industries. Machine learning sits at the core of deep-dive data analysis and visualization, which is increasingly in demand as companies discover the goldmine hiding in their existing data. For the tech professional involved in data science, Machine Learning: Hands-On for Developers and Technical Professionals provides the skills and techniques required to dig deeper.

Table of contents

  1. Cover
  2. Introduction
    1. Aims of This Book
    2. “Hands-On” Means Hands-On
    3. “What About the Math?”
    4. What Will You Have Learned by the End?
    5. Balancing Theory and Hands-on Learning
    6. Source Code for This Book
    7. Using Git
  3. CHAPTER 1: What Is Machine Learning?
    1. History of Machine Learning
    2. Algorithm Types for Machine Learning
    3. The Human Touch
    4. Uses for Machine Learning
    5. Languages for Machine Learning
    6. Software Used in This Book
    7. Data Repositories
    8. Summary
  4. CHAPTER 2: Planning for Machine Learning
    1. The Machine Learning Cycle
    2. It All Starts with a Question
    3. I Don't Have Data!
    4. One Solution Fits All?
    5. Defining the Process
    6. Building a Data Team
    7. Data Processing
    8. Data Storage
    9. Data Privacy
    10. Data Quality and Cleaning
    11. Thinking About Input Data
    12. Thinking About Output Data
    13. Don't Be Afraid to Experiment
    14. Summary
  5. CHAPTER 3: Data Acquisition Techniques
    1. Scraping Data
    2. Using an API
    3. Migrating Data
    4. Summary
  6. CHAPTER 4: Statistics, Linear Regression, and Randomness
    1. Working with a Basic Dataset
    2. Introducing Basic Statistics
    3. Using Simple Linear Regression
    4. Embracing Randomness
    5. Summary
  7. CHAPTER 5: Working with Decision Trees
    1. The Basics of Decision Trees
    2. Decision Trees in Weka
    3. Summary
  8. CHAPTER 6: Clustering
    1. What Is Clustering?
    2. Where Is Clustering Used?
    3. Clustering Models
    4. K-Means Clustering with Weka
    5. Summary
  9. CHAPTER 7: Association Rules Learning
    1. Where Is Association Rules Learning Used?
    2. How Association Rules Learning Works
    3. Algorithms
    4. Mining the Baskets—A Walk-Through
    5. Summary
  10. CHAPTER 8: Support Vector Machines
    1. What Is a Support Vector Machine?
    2. Where Are Support Vector Machines Used?
    3. The Basic Classification Principles
    4. How Support Vector Machines Approach Classification
    5. Using Support Vector Machines in Weka
    6. Summary
  11. CHAPTER 9: Artificial Neural Networks
    1. What Is a Neural Network?
    2. Artificial Neural Network Uses
    3. Trusting the Black Box
    4. Breaking Down the Artificial Neural Network
    5. Data Preparation for Artificial Neural Networks
    6. Artificial Neural Networks with Weka
    7. Implementing a Neural Network in Java
    8. Developing Neural Networks with DeepLearning4J
    9. Summary
  12. CHAPTER 10: Machine Learning with Text Documents
    1. Preparing Text for Analysis
    2. TF/IDF
    3. Word2Vec
    4. Basic Sentiment Analysis
    5. Summary
  13. CHAPTER 11: Machine Learning with Images
    1. What Is an Image?
    2. Basic Classification with Neural Networks
    3. Convolutional Neural Networks
    4. Transfer Learning
    5. Summary
  14. CHAPTER 12: Machine Learning Streaming with Kafka
    1. What You Will Learn in This Chapter
    2. From Machine Learning to Machine Learning Engineer
    3. From Batch Processing to Streaming Data Processing
    4. What Is Kafka?
    5. Installing Kafka
    6. Topics Management
    7. Kafka Tool UI
    8. Writing Your Own Producers and Consumers
    9. Building a Streaming Machine Learning System
    10. Kafka Topics
    11. Kafka Connect
    12. The REST API Microservice
    13. Processing Commands and Events
    14. Making Predictions
    15. Running the Project
    16. Summary
  15. CHAPTER 13: Apache Spark
    1. Spark: A Hadoop Replacement?
    2. Java, Scala, or Python?
    3. Downloading and Installing Spark
    4. A Quick Intro to Spark
    5. Comparing Hadoop MapReduce to Spark
    6. Writing Stand-Alone Programs with Spark
    7. Spark SQL
    8. Spark Streaming
    9. MLib: The Machine Learning Library
    10. Summary
  16. CHAPTER 14: Machine Learning with R
    1. Installing R
    2. Your First Run
    3. Installing R-Studio
    4. The R Basics
    5. Simple Statistics
    6. Simple Linear Regression
    7. Basic Sentiment Analysis
    8. Apriori Association Rules
    9. Accessing R from Java
    10. Summary
  17. APPENDIX A: Kafka Quick Start
    1. Installing Kafka
    2. Starting Zookeeper
    3. Starting Kafka
    4. Creating Topics
    5. Listing Topics
    6. Describing a Topic
    7. Deleting Topics
    8. Running a Console Producer
    9. Running a Console Consumer
  18. APPENDIX B: The Twitter API Developer Application Configuration
  19. APPENDIX C: Useful Unix Commands
    1. Using Sample Data
    2. Showing the Contents: cat, more, and less
    3. Filtering Content: grep
    4. Sorting Data: sort
    5. Finding Unique Occurrences: uniq
    6. Showing the Top of a File: head
    7. Counting Words: wc
    8. Locating Anything: find
    9. Combining Commands and Redirecting Output
    10. Picking a Text Editor
  20. APPENDIX D: Further Reading
    1. Machine Learning
    2. Statistics
    3. Big Data and Data Science
    4. Visualization
    5. Making Decisions
    6. Datasets
    7. Blogs
    8. Useful Websites
    9. The Tools of the Trade
  21. Index
  22. End User License Agreement

Product information

  • Title: Machine Learning, 2nd Edition
  • Author(s): Jason Bell
  • Release date: March 2020
  • Publisher(s): Wiley
  • ISBN: 9781119642145