Book description
Get expert guidance on architecting end-to-end data management solutions with Apache Hadoop. While many sources explain how to use various components in the Hadoop ecosystem, this practical book takes you through the architectural considerations necessary to tie those components together into a complete, tailored application based on your particular use case.
Table of contents
- Foreword
- Preface
- I. Architectural Considerations for Hadoop Applications
- 1. Data Modeling in Hadoop
- 2. Data Movement
- 3. Processing Data in Hadoop
- 4. Common Hadoop Processing Patterns
- 5. Graph Processing on Hadoop
- 6. Orchestration
  - Why We Need Workflow Orchestration
  - The Limits of Scripting
  - The Enterprise Job Scheduler and Hadoop
  - Orchestration Frameworks in the Hadoop Ecosystem
  - Oozie Terminology
  - Oozie Overview
  - Oozie Workflow
  - Workflow Patterns
  - Parameterizing Workflows
  - Classpath Definition
  - Scheduling Patterns
  - Executing Workflows
  - Conclusion
- 7. Near-Real-Time Processing with Hadoop
- II. Case Studies
- 8. Clickstream Analysis
- 9. Fraud Detection
  - Continuous Improvement
  - Taking Action
  - Architectural Requirements of Fraud Detection Systems
  - Introducing Our Use Case
  - High-Level Design
  - Client Architecture
  - Profile Storage and Retrieval
  - Ingest
  - Near-Real-Time and Exploratory Analytics
  - Near-Real-Time Processing
  - Exploratory Analytics
  - What About Other Architectures?
  - Conclusion
- 10. Data Warehouse
- A. Joins in Impala
- Index
Product information
- Title: Hadoop Application Architectures
- Author(s):
- Release date: July 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491900086