Chapter 2. A Definition of Distributed SQL
In Chapter 1, we established the development and business motivations for distributed SQL solutions. We identified the core feature set: ease of scale, resilience, data locality, SQL language support, and ACID compliance. The distributed mindset of scale, resilience, and locality is at the heart of any distributed SQL solution that delivers these features.
To define distributed SQL, we’ll highlight the technical components with a brief overview of three key distributed database concepts and Google’s seminal white paper “Spanner: Google’s Globally Distributed Database.” Then we’ll investigate how emerging players take distributed SQL to polished product readiness by incorporating enterprise requirements. By the end of this chapter, we’ll have a definition that includes both nerd-friendly tech functions and state-of-the-art product features currently on the market.
The Key Concepts
Before we define distributed SQL, we should describe a few key concepts that drive this new technology. The first is the CAP theorem, which identifies the trade-offs among consistency, availability, and partition tolerance that distributed databases must make. The second is distributed consensus, which ensures that when data is stored on multiple machines, those machines reach agreement on a single version of the truth. In distributed SQL databases, this is most often implemented through the Raft algorithm. The third key concept is multiversion concurrency control (MVCC), ensuring that transactions in the database ...