Book description
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build on Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline with cloud-native tools on GCP.
Throughout this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by building a data pipeline in your own project on GCP, and discover how to solve data science problems in a transformative and more collaborative way.
You'll learn how to:
- Employ best practices in building highly scalable data and ML pipelines on Google Cloud
- Automate and schedule data ingest using Cloud Run
- Create and populate a dashboard in Data Studio
- Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery
- Conduct interactive data exploration with BigQuery
- Create a Bayesian model with Spark on Cloud Dataproc
- Forecast time series and do anomaly detection with BigQuery ML
- Aggregate within time windows with Dataflow
- Train explainable machine learning models with Vertex AI
- Operationalize ML with Vertex AI Pipelines
Table of contents
- Preface
- 1. Making Better Decisions Based on Data
- 2. Ingesting Data into the Cloud
- 3. Creating Compelling Dashboards
- 4. Streaming Data: Publication and Ingest with Pub/Sub and Dataflow
- 5. Interactive Data Exploration with Vertex AI Workbench
- 6. Bayesian Classifier with Apache Spark on Cloud Dataproc
- 7. Logistic Regression Using Spark ML
- 8. Machine Learning with BigQuery ML
- 9. Machine Learning with TensorFlow in Vertex AI
- 10. Getting Ready for MLOps with Vertex AI
- 11. Time-Windowed Features for Real-Time Machine Learning
- 12. The Full Dataset
- Conclusion
- A. Considerations for Sensitive Data Within Machine Learning Datasets
- Index
- About the Author
Product information
- Title: Data Science on the Google Cloud Platform, 2nd Edition
- Author(s):
- Release date: March 2022
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098118952