Book description
Until recently, Hadoop deployments existed on hardware owned and run by organizations. Now, of course, you can acquire the computing resources and network connectivity to run Hadoop clusters in the cloud. But there’s a lot more to deploying Hadoop to the public cloud than simply renting machines.
This hands-on guide shows developers and systems administrators familiar with Hadoop how to install, use, and manage cloud-born clusters efficiently. You’ll learn how to architect clusters that work with cloud-provider features—not just to avoid pitfalls, but also to take full advantage of these services. You’ll also compare the Amazon, Google, and Microsoft clouds, and learn how to set up clusters in each of them.
- Learn how Hadoop clusters run in the cloud, the problems they can help you solve, and their potential drawbacks
- Examine the common concepts of cloud providers, including compute capabilities, networking and security, and storage
- Build a functional Hadoop cluster on cloud infrastructure, and learn what the major providers require
- Explore use cases for high availability, relational data with Hive, and complex analytics with Spark
- Get patterns and practices for running cloud clusters, from designing for price and security to dealing with maintenance
Table of contents
- Foreword
- Preface
- I. Introduction to the Cloud
  - 1. Why Hadoop in the Cloud?
  - 2. Overview and Comparison of Cloud Providers
- II. Cloud Primer
  - 3. Instances
  - 4. Networking and Security
  - 5. Storage
- III. A Simple Cluster in the Cloud
  - 6. Setting Up in AWS
  - 7. Setting Up in Google Cloud Platform
  - 8. Setting Up in Azure
  - 9. Standing Up a Cluster
- IV. Enhancing Your Cluster
  - 10. High Availability
  - 11. Relational Data with Apache Hive
  - 12. Streaming in the Cloud with Apache Spark
- V. Care and Feeding of Hadoop in the Cloud
  - 13. Pricing and Performance
  - 14. Network Topologies
  - 15. Patterns for Cluster Usage
  - 16. Using Images for Cluster Management
  - 17. Monitoring and Automation
  - 18. Backup and Restoration
- A. Hadoop Component Start and Stop Scripts
- B. Hadoop Cluster Configuration Scripts
- C. Monitoring Cloud Clusters with Nagios
- Index
Product information
- Title: Moving Hadoop to the Cloud
- Author(s):
- Release date: July 2017
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491959589