Chapter 3. Managing Vast Amounts of Data: The Read-Only Data Stores Architecture

This chapter covers the read-only data stores (RDS) Architecture, the architecture positioned for intensive reads. We’ll examine the Command and Query Responsibility Segregation (CQRS) pattern and add more principles to our overall architecture. We’ll also look at RDSs themselves: what they are, how they can be engineered, what capabilities they typically need, and what role metadata plays. By the end of this chapter, you will have a good understanding of how RDSs can help make vast amounts of data available to data consumers.

Introducing the RDS Architecture

In the years since data warehouses became a commodity, much has changed. Distributed systems have gained great popularity, data has grown larger and more diverse, new database designs have emerged, and the advent of the cloud has separated compute from storage for increased scalability and elasticity. Combine these trends with the challenges discussed in Chapter 1 and the decoupling principles you learned in Chapter 2, and the importance of changing the way large volumes of data are distributed and shared becomes clear.

The RDS Architecture is the first data distribution and integration architecture, and by far the most interesting, because it’s the cornerstone of the new Scaled Architecture. It’s positioned for intensive reads and provides managed and secure access to consolidated data for a large variety of workloads. The architecture ...
