Book description
A solution-based guide to put your deep learning models into production with the power of Apache Spark
Key Features
- Discover practical recipes for distributed deep learning with Apache Spark
- Learn to use libraries such as Keras and TensorFlow
- Work through practical problems to train your deep learning models on Apache Spark
With deep learning gaining rapid mainstream adoption across industries, organizations are looking for ways to unite popular big data tools with highly efficient deep learning libraries. Combining the two helps deep learning models train with greater speed and efficiency.
With the help of the Apache Spark Deep Learning Cookbook, you'll work through specific recipes to generate outcomes for deep learning algorithms, without getting bogged down in theory. From setting up Apache Spark for deep learning to implementing different types of neural networks, this book tackles both common and not-so-common problems in order to perform deep learning in a distributed environment. You'll also gain access to deep learning code within Spark that can be reused to answer similar problems, or tweaked to answer slightly different ones. In addition, you will learn how to stream and cluster your data with Spark. Once you have got to grips with the basics, you'll explore how to implement and deploy deep learning models such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) in Spark, using popular libraries such as TensorFlow and Keras.
By the end of the book, you'll have the expertise to train and deploy efficient deep learning models on Apache Spark.
What you will learn
- Set up a fully functional Spark environment
- Understand practical machine learning and deep learning concepts
- Apply built-in machine learning libraries within Spark
- Explore libraries that are compatible with TensorFlow and Keras
- Explore NLP models such as Word2vec and TF-IDF on Spark
- Organize dataframes for deep learning evaluation
- Apply training and testing workflows to evaluate model accuracy
- Access readily available code that you can reuse
If you're looking for a practical and highly useful resource for efficiently implementing distributed deep learning models with Apache Spark, then the Apache Spark Deep Learning Cookbook is for you. Knowledge of core machine learning concepts and a basic understanding of the Apache Spark framework are required to get the best out of this book. Additionally, some programming knowledge in Python is a plus.
Table of contents
- Title Page
- Copyright and Credits
- Packt Upsell
- Foreword
- Contributors
- Preface
- Setting Up Spark for Deep Learning Development
- Introduction
- Downloading an Ubuntu Desktop image
- Installing and configuring Ubuntu with VMware Fusion on macOS
- Installing and configuring Ubuntu with Oracle VirtualBox on Windows
- Installing and configuring Ubuntu Desktop for Google Cloud Platform
- Installing and configuring Spark and prerequisites on Ubuntu Desktop
- Integrating Jupyter notebooks with Spark
- Starting and configuring a Spark cluster
- Stopping a Spark cluster
- Creating a Neural Network in Spark
- Introduction
- Creating a dataframe in PySpark
- Manipulating columns in a PySpark dataframe
- Converting a PySpark dataframe to an array
- Visualizing an array in a scatterplot
- Setting up weights and biases for input into the neural network
- Normalizing the input data for the neural network
- Validating array for optimal neural network performance
- Setting up the activation function with sigmoid
- Creating the sigmoid derivative function
- Calculating the cost function in a neural network
- Predicting gender based on height and weight
- Visualizing prediction scores
- Pain Points of Convolutional Neural Networks
- Pain Points of Recurrent Neural Networks
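The neural-network recipes above build a small classifier step by step: set up weights and biases, normalize the inputs, apply a sigmoid activation and its derivative, and train until the network predicts gender from height and weight. A minimal NumPy sketch of those core pieces might look like the following; the toy data, learning rate, and random seed are illustrative assumptions, not the book's code:

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation: maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(s):
    """Derivative of the sigmoid, written in terms of its output s."""
    return s * (1.0 - s)

# Toy data (an illustrative assumption, not the book's dataset):
# each row is [height_cm, weight_kg]; label 1 = male, 0 = female.
X = np.array([[160.0, 55.0], [165.0, 60.0], [170.0, 70.0],
              [175.0, 80.0], [180.0, 85.0], [185.0, 90.0]])
y = np.array([[0.0], [0.0], [1.0], [1.0], [1.0], [1.0]])

# Normalize the inputs so both features sit on a comparable scale.
X_norm = (X - X.mean(axis=0)) / X.std(axis=0)

rng = np.random.default_rng(42)
weights = rng.normal(size=(2, 1))
bias = 0.0
learning_rate = 0.5

for _ in range(10_000):
    pred = sigmoid(X_norm @ weights + bias)   # forward pass
    error = pred - y                          # gradient of squared error w.r.t. pred
    grad = error * sigmoid_derivative(pred)   # chain rule back through the sigmoid
    weights -= learning_rate * X_norm.T @ grad
    bias -= learning_rate * grad.sum()

accuracy = float(((pred > 0.5) == (y > 0.5)).mean())
print(f"training accuracy: {accuracy:.2f}")
```

The book performs the equivalent steps on data held in PySpark dataframes; the arithmetic of the forward pass, derivative, and weight update is the same.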
- Predicting Fire Department Calls with Spark ML
- Introduction
- Downloading the San Francisco fire department calls dataset
- Identifying the target variable of the logistic regression model
- Preparing feature variables for the logistic regression model
- Applying the logistic regression model
- Evaluating the accuracy of the logistic regression model
- Using LSTMs in Generative Networks
- Natural Language Processing with TF-IDF
- Introduction
- Downloading the therapy bot session text dataset
- Analyzing the therapy bot session dataset
- Visualizing word counts in the dataset
- Calculating sentiment analysis of text
- Removing stop words from the text
- Training the TF-IDF model
- Evaluating TF-IDF model performance
- Comparing model performance to a baseline score
- Real Estate Value Prediction Using XGBoost
- Predicting Apple Stock Market Cost with LSTM
- Face Recognition Using Deep Convolutional Networks
- Creating and Visualizing Word Vectors Using Word2Vec
- Creating a Movie Recommendation Engine with Keras
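The TF-IDF recipes above revolve around one computation: weighting a term's count in a document by how rare the term is across the corpus. The book does this with Spark ML's feature transformers; the pure-Python sketch below (with made-up example sentences, not the therapy-bot dataset) shows the arithmetic itself, using the smoothed formula `log((N + 1) / (df + 1))` that Spark ML's `IDF` documents:

```python
import math
from collections import Counter

# Made-up example sentences (illustrative only, not the book's dataset).
docs = [
    "i feel anxious about work",
    "work has been stressful lately",
    "i feel happy today",
]
tokenized = [doc.split() for doc in docs]

N = len(tokenized)
# Document frequency: the number of documents each term appears in.
df = Counter(term for tokens in tokenized for term in set(tokens))

def tf(term, tokens):
    """Term frequency: raw count of the term in one document."""
    return tokens.count(term)

def idf(term):
    """Smoothed inverse document frequency, as in Spark ML's IDF."""
    return math.log((N + 1) / (df[term] + 1))

def tfidf(term, tokens):
    """TF-IDF weight of a term within one document."""
    return tf(term, tokens) * idf(term)

# "anxious" appears in only one document, so it is weighted more
# heavily than "work", which appears in two.
print(tfidf("anxious", tokenized[0]))
print(tfidf("work", tokenized[0]))
```

In Spark the same weighting is applied to every term at once, producing a feature vector per document that downstream models can train on.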
- Image Classification with TensorFlow on Spark
- Introduction
- Downloading 30 images each of Messi and Ronaldo
- Configuring PySpark installation with deep learning packages
- Loading images on to PySpark dataframes
- Understanding transfer learning
- Creating a pipeline for image classification training
- Evaluating model performance
- Fine-tuning model parameters
- Other Books You May Enjoy
Product information
- Title: Apache Spark Deep Learning Cookbook
- Author(s):
- Release date: July 2018
- Publisher(s): Packt Publishing
- ISBN: 9781788474221