Book description
If you’re a business team leader, CIO, business analyst, or developer interested in how Apache Hadoop and Apache HBase-related technologies can address problems involving large-scale data in cost-effective ways, this book is for you. Using real-world stories and situations, authors Ted Dunning and Ellen Friedman show Hadoop newcomers and seasoned users alike how NoSQL databases and Hadoop can solve a variety of business and research issues.
You’ll learn about early decisions and pre-planning that can make the process easier and more productive. If you’re already using these technologies, you’ll discover ways to gain the full range of benefits possible with Hadoop. While you don’t need a deep technical background to get started, this book does provide expert guidance to help managers, architects, and practitioners succeed with their Hadoop projects.
- Examine a day in the life of big data: India’s ambitious Aadhaar project
- Review tools in the Hadoop ecosystem such as Apache Spark, Storm, and Drill to learn how they can help you
- Pick up a collection of technical and strategic tips that have helped others succeed with Hadoop
- Learn from several prototypical Hadoop use cases, based on how organizations have actually applied the technology
- Explore real-world stories that reveal how MapR customers combine use cases when putting Hadoop and NoSQL to work, including in production
Ted Dunning is Chief Applications Architect at MapR Technologies, and a committer and PMC member of the Apache Drill, Storm, Mahout, and ZooKeeper projects. He is also a mentor for the Apache Datafu, Kylin, Zeppelin, Calcite, and Samoa projects.
Ellen Friedman is a solutions consultant, speaker, and author, writing mainly about big data topics. She is a committer for the Apache Mahout project and a contributor to the Apache Drill project.
Table of contents
- Dedication
- Preface
- 1. Turning to Apache Hadoop and NoSQL Solutions
- 2. What the Hadoop Ecosystem Offers
- 3. Understanding the MapR Distribution for Apache Hadoop
- 4. Decisions That Drive Successful Hadoop Projects
- Tip #1: Pick One Thing to Do First
- Tip #2: Shift Your Thinking
- Tip #3: Start Conservatively But Plan to Expand
- Tip #4: Be Honest with Yourself
- Tip #5: Plan Ahead for Maintenance
- Tip #6: Think Big: Don’t Underestimate What You Can (and Will) Want to Do
- Tip #7: Explore New Data Formats
- Tip #8: Consider Data Placement When You Expand a Cluster
- Tip #9: Plot Your Expansion
- Tip #10: Form a Queue to the Right, Please
- Tip #11: Provide Reliable Primary Persistence When Using Search Tools
- Tip #12: Establish Remote Clusters for Disaster Recovery
- Tip #13: Take a Complete View of Performance
- Tip #14: Read Our Other Books (Really!)
- Tip #15: Just Do It
- 5. Prototypical Hadoop Use Cases
- 6. Customer Stories
- 7. What’s Next?
- A. Additional Resources
- About the Authors
- Colophon
- Copyright
Product information
- Title: Real-World Hadoop
- Author(s): Ted Dunning, Ellen Friedman
- Release date: April 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491922668