Book description
Many companies are busy collecting massive amounts of data, but few are taking advantage of this treasure hoard to build a truly insights-driven organization. To do so, the data team must democratize both data and insights in a way that provides real-time access to all employees in the organization. This report explores DataOps: the process, culture, tools, and people required to scale big data pervasively across the enterprise.
Just as DevOps has enabled organizations to improve coordination between developers and the operations team, DataOps closely connects everyone who handles data, including engineers, data scientists, analysts, and business users. Democratizing data with this approach requires removing barriers typical of siloed data, teams, and systems.
In this report, Apache Hive creators Ashish Thusoo and Joydeep Sen Sarma examine the characteristics of a data-driven organization that supports a self-service model.
- Explore related topics such as data lakes, metadata, cloud architecture, and data-infrastructure-as-a-service
- Examine conclusions from a survey of more than 400 senior executives whose companies are in various stages of data maturity
- Learn how data pioneers at Facebook, Uber, LinkedIn, Twitter, and eBay created data-driven cultures and self-service data infrastructures for their organizations
Table of contents
- Acknowledgments
- I. Foundations of a Data-Driven Enterprise
- 1. Introduction
- 2. Data and Data Infrastructure
- 3. Data Warehouses Versus Data Lakes: A Primer
- 4. Building a Data-Driven Organization
- 5. Putting Together the Infrastructure to Make Data Self-Service
- Technology That Supports the Self-Service Model
- Tools Used by Producers and Consumers of Data
- The Importance of a Complete and Integrated Data Infrastructure
- The Importance of Resource Sharing in a Self-Service World
- Security and Governance
- Self Help Support for Users
- Monitoring Resources and Chargebacks
- The “Big Compute Crunch”: How Facebook Allocates Data Infrastructure Resources
- Using the Cloud to Make Data Self Service
- Summary
- 6. Cloud Architecture and Data Infrastructure-as-a-Service
- 7. Metadata and Big Data
- 8. A Maturity-Model “Reality Check” for Organizations
- II. Case Studies
- 9. LinkedIn: The Road to Data Craftsmanship
- 10. Uber: Driven to Democratize Data
- 11. Twitter: When Everything Happens in Real Time
- 12. Capture All Data, Decide What to Do with It Later: My Experience at eBay
- A. A Podcast Interview Transcript
Product information
- Title: Creating a Data-Driven Enterprise with DataOps
- Author(s): Ashish Thusoo, Joydeep Sen Sarma
- Release date: March 2017
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491977835