Patterns of Distributed Systems
Published by Pearson
A Hands-On Introduction to Paxos and Raft
- Hands-on experience building distributed systems
- Explanation of implementation patterns applicable to a wide variety of systems, ranging from databases to cloud services to blockchain
- References to mainstream distributed systems source code like Kafka, Cassandra, and Kubernetes
We use distributed systems every day. Cloud services like Amazon S3, Amazon EKS, and CosmosDB are all distributed systems. Products like Kafka, Kubernetes, Cassandra, and Akka, as well as blockchain platforms, are distributed systems too.
Even if not everyone on a team is involved in building these kinds of systems, it is important to have some understanding of how distributed systems work. A deeper understanding of distributed systems helps us choose appropriate cloud services and use them effectively. This understanding also helps with troubleshooting and integration challenges when using a variety of platforms and frameworks.
Distributed systems are particularly challenging to program. They often require multiple copies of data, which need to be kept synchronized. Yet we cannot rely on processing nodes working reliably, and network delays can easily lead to inconsistencies. Despite this, many organizations rely on a range of core distributed software handling data storage, messaging, system management, and compute capability. These systems face common problems, which they solve in similar ways. This course recognizes and develops these solutions as patterns, which we can use to build a better understanding of distributed system design.
What you’ll learn and how you can apply it
By the end of the live online course, you’ll understand:
- What a distributed system is and why distributed systems are needed
- Common problems in distributed systems and their solutions in the form of patterns
- Consensus algorithms like Paxos and Raft, which are basic building blocks of most cloud services or products
- Key concepts needed to understand the implementation of a wide range of systems such as databases, in-memory data grids, message brokers, and various cloud services
And you’ll be able to:
- Use the patterns to help with troubleshooting and integration when using common platforms
- Make the right choice for your organization by comparing different cloud services and products
- Understand “the meaning behind the words” when you read technical documentation of cloud services
- Develop an in-depth understanding of what goes on inside products like Kafka, Kubernetes, Cassandra, or various cloud services
This live event is for you because...
- You are an architect/developer designing architecture for modern digital platforms
- You are using various cloud services
- You build microservices architecture
Prerequisites
- Familiarity with Java (as the assignments will be in Java)
- Some experience with using message brokers like Kafka or a cloud service like Azure/AWS/GCP. (This is good to have. Even without this knowledge, the course will be useful.)
Course Set-up
- Have JDK 17 installed
- IntelliJ IDEA or Eclipse
Recommended Preparation
- Watch: “Distributed Systems in One Lesson” by Tim Berglund
Recommended Follow-up
- Read: Designing Data-Intensive Applications by Martin Kleppmann
- Read: Software Architecture: The Hard Parts by Neal Ford
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Day 1: (4 hours)
Segment 1: Why distribute (45 Minutes)
- Hard limits of disks, network and compute
- Probability of failures
- Defining Distributed Systems
- Problems faced by distributed systems
- A patterns approach to understanding distributed systems
Questions: 5 minutes
Break (10 minutes)
Segment 2: Patterns for handling failures (180 Minutes)
- Write Ahead Log (30 minutes)
- Exercise: Implement Write Ahead Log (20 minutes)
Break (10 minutes)
- Quorum (30 minutes)
- Exercise: Implement a key value store with Quorum (20 minutes)
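
As a rough illustration of the Quorum idea behind this exercise, the sketch below computes a majority quorum for a cluster. It is not the course's exercise code; the class and method names (`QuorumSketch`, `majority`, `hasQuorum`) are hypothetical and chosen only for this example.

```java
import java.util.List;

// Hypothetical helper, not taken from the course material.
public class QuorumSketch {

    // Smallest number of nodes that forms a strict majority of the cluster.
    public static int majority(int clusterSize) {
        return clusterSize / 2 + 1;
    }

    // True when enough replicas have acknowledged an operation.
    public static boolean hasQuorum(int acks, int clusterSize) {
        return acks >= majority(clusterSize);
    }

    public static void main(String[] args) {
        for (int n : List.of(3, 4, 5)) {
            int tolerated = n - majority(n);
            System.out.println(n + " nodes: majority " + majority(n)
                    + ", tolerates " + tolerated + " failed node(s)");
        }
    }
}
```

With these definitions, a 4-node cluster tolerates no more failures than a 3-node cluster, which is one common reason clusters are sized with an odd number of nodes.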
Break (10 minutes)
- Generation Clock (30 Minutes)
- Exercise: Implement a Generation Clock (20 minutes)
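
A Generation Clock can be pictured as a monotonically increasing number that lets a node reject requests from a stale leader. The sketch below is a minimal, single-threaded illustration under that assumption. The class `GenerationClockSketch` and its methods are hypothetical names for this example, not the course's API, and a real implementation would also persist the generation to disk.

```java
// Hypothetical sketch, not taken from the course material.
public class GenerationClockSketch {

    // Highest generation this node has seen so far (kept in memory only here).
    private long generation = 0;

    // Called when this node starts a new leadership attempt.
    public long nextGeneration() {
        generation = generation + 1;
        return generation;
    }

    // Accept a request only if it does not come from an older generation.
    public boolean accept(long requestGeneration) {
        if (requestGeneration < generation) {
            return false; // stale leader, reject
        }
        generation = requestGeneration;
        return true;
    }

    public long current() {
        return generation;
    }

    public static void main(String[] args) {
        GenerationClockSketch node = new GenerationClockSketch();
        System.out.println(node.accept(2)); // request from the leader of generation 2
        System.out.println(node.accept(1)); // late request from the older leader
    }
}
```

In this sketch the second request is rejected because the node has already seen a higher generation.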
Wrap-Up (10 Minutes)
Day 2: (4 hours)
Segment 3: Consensus - Paxos (120 minutes)
- Paxos (60 minutes)
- Understand why Quorum is not enough to achieve consensus
- How to use the basic techniques of Generation Clock and Quorum to implement the Paxos consensus algorithm
- How Paxos is used to achieve replica consistency in databases like Cassandra
Questions: 15 minutes
- Exercise: Implement a simple key value store with Paxos (30 minutes)
Break: 15 minutes
Segment 4: Consensus - Replicated Log (120 minutes)
- Replicated Log (60 minutes)
- Limitations of basic Paxos
- How the basic technique of WAL can be used for replication
- How Paxos can be extended to implement consensus over a log
- Implementation building blocks of the Raft consensus algorithm
- Raft usage in products like Kafka, MongoDB, CockroachDB, etc.
- Exercise: Implement a key-value store with Replicated Log (30 minutes)
Questions: 15 minutes
Wrap-Up: 15 Minutes
Your Instructor
Unmesh Joshi
Unmesh Joshi is a principal consultant at ThoughtWorks. He’s a software architecture enthusiast who believes that understanding the principles of distributed systems is as essential today as understanding web architecture or object-oriented programming was in the last decade.