Book description
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on top of the Google Cloud Platform (GCP). This hands-on guide shows developers entering the data science field how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP. Over the course of the book, you’ll work through a sample business decision by employing a variety of data science approaches.
Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science.
You’ll learn how to:
- Automate and schedule data ingest, using an App Engine application
- Create and populate a dashboard in Google Data Studio
- Build a real-time analysis pipeline to carry out streaming analytics
- Conduct interactive data exploration with Google BigQuery
- Create a Bayesian model on a Cloud Dataproc cluster
- Build a logistic regression machine-learning model with Spark
- Compute time-aggregate features with a Cloud Dataflow pipeline
- Create a high-performing prediction model with TensorFlow
- Use your deployed model as a microservice you can access from both batch and real-time pipelines
Table of contents
- Preface
- 1. Making Better Decisions Based on Data
- 2. Ingesting Data into the Cloud
- 3. Creating Compelling Dashboards
  - Explain Your Model with Dashboards
  - Why Build a Dashboard First?
  - Accuracy, Honesty, and Good Design
  - Loading Data into Google Cloud SQL
  - Create a Google Cloud SQL Instance
  - Interacting with Google Cloud Platform
  - Controlling Access to MySQL
  - Create Tables
  - Populating Tables
  - Building Our First Model
  - Building a Dashboard
  - Getting Started with Data Studio
  - Summary
- 4. Streaming Data: Publication and Ingest
- 5. Interactive Data Exploration
- 6. Bayes Classifier on Cloud Dataproc
- 7. Machine Learning: Logistic Regression in Spark and BigQuery
- 8. Time-Windowed Aggregate Features
- 9. Machine Learning Classifier Using TensorFlow
- 10. Real-Time Machine Learning
- A. Considerations for Sensitive Data within Machine Learning Datasets
- Index
Product information
- Title: Data Science on the Google Cloud Platform
- Author(s):
- Release date: December 2017
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491974513