Preface
Serverless has become a major selling point of cloud service providers. Over the last four years, hundreds of services from both the major cloud providers and smaller vendors have been branded or rebranded as “serverless.” Clearly, serverless has something to do with services provided over a network, but what is serverless, and why does it matter? How does it differ from containers, functions, or cloud native technologies? While terminology and definitions are constantly evolving, this book aims to highlight the essential attributes of serverless technologies and explain why the serverless moniker is growing in popularity.
This book primarily focuses on serverless compute systems; that is, systems that execute user-defined software, rather than providing a fixed function like storage, indexing, or message queuing. (Serverless storage systems exist as well, but they aren’t the primary focus of this book!) With that said, the line between fixed-function storage and general-purpose compute is never as sharp and clear as theory would like—for example, database systems that support the SQL query syntax combine storage, indexing, and the execution of declarative query programs written in SQL. While the architecture of fixed-function systems can be fascinating and important to understand for performance tuning, this book primarily focuses on serverless compute because it’s the interface with the most degrees of freedom for application authors, and the system that they are most likely to interact with day-to-day.
If you’re still not quite sure what serverless is, don’t worry. Given the number of different products on the market, it’s clear that most people are in the same boat. We’ll chart the evolution of the “serverless” term in “Background” of the Preface and then lay out a precise definition in Chapter 1.
Who Is This Book For?
The primary audience for this book is software engineers1 and technologists who are either unfamiliar with serverless or are looking to deepen their understanding of the principles and best practices associated with serverless architecture.
New practitioners who want to immediately dive into writing serverless applications can start in Chapter 2, though I’d recommend Chapter 1 for additional orientation on what’s going on and why serverless matters. Chapter 3 provides additional practical material to develop a deeper understanding of the architecture of the Knative platform used in the examples.
The order of the chapters should be natural for readers who are already familiar with serverless. Chapters 5 and 6 provide a checklist of standard patterns for applying serverless, while Chapters 8 and onward provide a sort of “bingo card” of serverless warning signs and solution sketches that may be handy on a day-to-day basis. Chapter 11’s historical context also provides a map of previous technology communities to examine for patterns and solutions.
For readers who are more interested in capturing the big-picture ideas of serverless, Chapters 1, 4, and 7 have some interesting gems to inspire deeper understanding and new ideas. Chapter 11’s historical context and future predictions may also be of interest in understanding the arc of software systems that led to the current implementations of scale-out serverless offerings.
For readers who are new not only to serverless computing, but also to backend or cloud native development, the remainder of this preface will provide some background material to help set the stage. Like much of software engineering, these areas move quickly, so the definitions I provide here may have changed somewhat by the time you read this book. When in doubt, these keywords and descriptions may save some time when searching for equivalent services in your environment of choice.
Background
Over the last six years, the terms “cloud native,” “serverless,” and “containers” have all been subject to successive rounds of hype and redefinition, to the point that even many practitioners struggle to keep up or fully agree on the definitions of these terms. The following sections aim to provide definitions of some important reference points in the rest of the book, but many of these definitions will probably continue to evolve—take them as general reference context for the rest of this book, but not as the one true gospel of serverless computing. Definitions change as ideas germinate and grow, and the gardens of cloud native and serverless over the last six years have run riot with new growth.
Also note that this background is organized such that it makes sense when read from beginning to end, not as a historical record of what came first. Many of these areas developed independently of one another and then met and combined after their initial flowering (replanting ideas from one garden into another along the way).
Containers
Containers—either Docker or Open Container Initiative (OCI) format—provide a mechanism to subdivide a host machine into multiple independent runtime environments. Unlike virtual machines (VMs), container environments share a single OS kernel, which provides a few benefits:
- Reduced OS overhead, because only one OS is running. This limits containers to running the same OS as the host, typically Linux. (Windows containers also exist but are much less commonly used.)
- Simplified application bundles that run independently of OS drivers and hardware. These bundles are sufficient to run different Linux distributions on the same kernel with consistent behavior across Linux versions.
- Greater application visibility. The shared kernel allows monitoring application details, like open file handles, that would be difficult to extract from a full VM.
- A standard distribution mechanism for storing a container in an OCI registry. Part of the container specification describes how to store and retrieve a container from a registry: the container is stored as a series of filesystem layers, each a compressed TAR (tape archive), such that new layers can add and delete files from the underlying immutable layers. (An abbreviated example manifest follows this list.)
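To make the registry format concrete, here is an abbreviated sketch of an OCI image manifest, the JSON document a registry serves to describe an image. The digests and sizes here are placeholders invented for illustration; the media types come from the OCI image specification:

```json
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:<config-digest>",
    "size": 2048
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:<base-layer-digest>",
      "size": 31457280
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:<app-layer-digest>",
      "size": 10485760
    }
  ]
}
```

Each layer digest names an immutable compressed TAR, so pushing a new version of an application typically uploads only the layers that changed and reuses the rest.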
Unlike the technologies described in the following sections, container technology on its own benefits applications running on a single machine; it doesn’t address distributing an application across more than one machine. In the context of this book, containers act as a common substrate that makes it easy to distribute an application so it runs consistently on one or many computers.
Cloud Providers
Cloud providers are companies that sell remote access to computing and storage services. Popular examples include Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). Compute and storage services include VMs, blob storage, databases, message queues, and more custom services. These companies rent access to the services by the hour or even on a finer-grained basis, making it easy for companies to get access to computing power when needed, without having to plan and pay for datacenter space, hardware, and networking up front.
While some cloud computing services are basically “rent a slice of hardware,” the cloud providers have also competed on developing more complex managed services, either hosted on individual VMs per customer or using a multitenant approach in which the servers themselves are able to separate the work and resources consumed by different customers within a single application process. It’s harder to build a multitenant application or service, but the benefit is that it becomes much easier to manage and share server resources among customers—and reducing the cost of running the service means better margins for cloud providers.
The serverless computing patterns described in this book were largely developed either by the cloud providers themselves or by customers who provided guidance and feedback on what would make services even more attractive (and thus worth a higher price premium). Regardless of whether you’re using a proprietary single-cloud service or self-hosting a solution (see the next sections for more details as well as Chapter 3), cloud providers can offer an attractive environment for provisioning and running serverless applications.
Kubernetes and Cloud Native
While cloud providers started by offering compute as virtualized versions of physical hardware (so-called infrastructure as a service, or IaaS), it soon became clear that much of the work of securing and maintaining networks and operating systems was repetitive and well suited to automation. An ideal solution would use containers as a repeatable way to deploy software, running on bulk-managed Linux operating systems with “just enough” networking to privately connect the containers without exposing them to the internet at large. I explore the requirements for this type of system in more detail in “Infrastructure Assumptions”.
A variety of startups attempted to build solutions in this space with moderate success: Docker Swarm, Apache Mesos, and others. In the end, a technology introduced by Google and contributed to by Red Hat, IBM, and others won the day—Kubernetes. While Kubernetes may have had some technical advantages over the competing systems, much of its success can be attributed to the ecosystem that sprang up around the project.
Not only was Kubernetes donated to a neutral foundation (the Cloud Native Computing Foundation, or CNCF), but it was soon joined by other foundational projects, including gRPC, observability frameworks, container packaging, databases, reverse proxies, and service meshes. Despite being a vendor-neutral foundation, the CNCF and its members advertised and marketed this suite of technologies effectively to win attention and developer mindshare, and by 2019, it was largely clear that the Kubernetes + Linux combination would be the preferred container infrastructure platform for many organizations.
Since that time, Kubernetes has evolved to act as a general-purpose system for controlling infrastructure systems using a standardized and extensible API model. The Kubernetes API model is based on custom resource definitions (CRDs) and infrastructure controllers, which observe the state of the world and attempt to adjust the world to match a desired state stored in the Kubernetes API. This process is known as reconciliation, and when properly implemented, it can lead to resilient and self-healing systems that are simpler to implement than a centrally orchestrated model.
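As a sketch of what reconciliation looks like in code, the following Go fragment models a reconcile function for a hypothetical Widget resource. Real controllers are built with frameworks (such as the Kubernetes controller-runtime library) that handle watching the API server and requeuing work; everything here is simplified for illustration:

```go
// A minimal sketch of the reconciliation pattern for a hypothetical
// "Widget" resource; not a real Kubernetes controller.
package main

import (
	"context"
	"fmt"
)

// Widget mimics a custom resource: DesiredCopies stands in for the spec
// (the desired state stored in the API), ObservedCopies for the status
// (the most recently observed state of the world).
type Widget struct {
	Name           string
	DesiredCopies  int
	ObservedCopies int
}

// reconcile compares observed state with desired state and takes a step
// toward convergence. It runs whenever the object or the world changes,
// so it must be safe to re-run (idempotent).
func reconcile(ctx context.Context, w *Widget) error {
	switch {
	case w.ObservedCopies < w.DesiredCopies:
		// Stand-in for creating the missing instances.
		fmt.Printf("%s: creating %d missing copies\n", w.Name, w.DesiredCopies-w.ObservedCopies)
		w.ObservedCopies = w.DesiredCopies
	case w.ObservedCopies > w.DesiredCopies:
		// Stand-in for deleting the extra instances.
		fmt.Printf("%s: deleting %d extra copies\n", w.Name, w.ObservedCopies-w.DesiredCopies)
		w.ObservedCopies = w.DesiredCopies
	default:
		// Converged; nothing to do until the next change arrives.
	}
	return nil
}

func main() {
	w := &Widget{Name: "example", DesiredCopies: 3, ObservedCopies: 1}
	_ = reconcile(context.Background(), w)
}
```

Because each pass observes the current state fresh and nudges it toward the spec, a controller that crashes and restarts simply resumes from whatever the world looks like now, which is what makes the pattern self-healing.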
The technologies related to Kubernetes and other CNCF projects are called “cloud native” technologies, whether they are implemented on VMs from a cloud provider or on physical or virtual hardware within a user’s own organization. The key features of these technologies are that they are explicitly designed to run on clusters of semi-reliable computers and networks and to gracefully handle individual hardware failures while remaining available for users. By contrast, many pre-cloud-native technologies were built on the premise of highly available and redundant individual hardware nodes where maintenance would generally result in planned downtime or an outage.
Cloud-Hosted Serverless
While the last five years have seen a rush to rebrand many cloud-provider technologies as “serverless,” the term originally referred to a set of cloud-hosted technologies that simplified service deployment for developers. In particular, serverless allowed developers focused on mobile or web applications to implement a small amount of server-side logic without needing to understand, manage, or deploy application servers (hence the name). These technologies split into two main camps:
- Backend as a service (BaaS)
  Structured storage services with a rich and configurable API for managing the stored state in a client. Generally, this API included a mechanism for storing small-to-medium JavaScript Object Notation (JSON) objects in a key-value store, with the ability to send device push notifications when an object was modified on the server. The APIs also supported defining server-side object validation, automatic authentication and user management, and mobile-client-aware security rules. The most popular examples were Parse (acquired by Facebook, now Meta, in 2013 and open sourced in 2017) and Firebase (acquired by Google in 2014).
  While handy for getting a project started with a small team, BaaS eventually ran into a few problems that caused it to lose popularity:
  - Most applications eventually outgrew the fixed functionality. While adopting BaaS might provide an initial productivity boost, it almost certainly guaranteed a future storage migration and rewrite if the app became popular.
  - Compared with other storage options, BaaS was both expensive and limited in how far it could scale. While application developers didn’t need to manage servers, many of the implementation architectures required a single frontend server to avoid complex object-locking models.
- Function as a service (FaaS)
  In this model, application developers wrote individual functions that would be invoked (called) when certain conditions were met. In some cases, this was combined with BaaS to solve some of the fixed-function problems, but it could also be combined with scalable cloud-provider storage services to achieve much more scalable architectures. In the FaaS model, each function invocation is independent and may occur in parallel, even on different computers. Coordination among function invocations needs to be handled explicitly using transactions or locks, rather than being handled implicitly by the storage API as in BaaS. The first widely popular implementation of FaaS was AWS Lambda, launched in 2014. Within a few years, most cloud providers offered similar competing services, though without any form of standard APIs.
Unlike IaaS, cloud-provider FaaS offerings are typically billed per invocation or per second of function execution, with a maximum duration of 5 to 15 minutes per invocation. Billing per invocation can result in very low costs for infrequently used functions, as well as favorable billing for bursty workloads that receive thousands of requests and are then idle for minutes or hours. To enable this billing model, cloud providers operate multitenant platforms that isolate each user’s functions from one another despite running on the same physical hardware within a few seconds of one another.
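For a sense of scale, at a request charge of $0.20 per million invocations (in line with AWS Lambda’s published request pricing), a function called 100,000 times per month incurs about two cents in request fees, plus duration charges only for the time it actually runs. The programming model is correspondingly small. The following Go sketch uses the AWS Lambda Go runtime (github.com/aws/aws-lambda-go); the Greeting event type is hypothetical, invented for this example, and equivalent runtimes exist for other languages and providers:

```go
// A minimal FaaS handler sketch for AWS Lambda's Go runtime.
package main

import (
	"context"
	"fmt"

	"github.com/aws/aws-lambda-go/lambda"
)

// Greeting is a hypothetical event payload for illustration.
type Greeting struct {
	Name string `json:"name"`
}

// handle runs once per invocation. Invocations are independent and may
// execute concurrently on different machines, so no in-memory state can
// be assumed to survive between calls; shared state belongs in external
// storage guarded by transactions or locks.
func handle(ctx context.Context, event Greeting) (string, error) {
	return fmt.Sprintf("Hello, %s!", event.Name), nil
}

func main() {
	// The runtime calls handle for each event; billing accrues per
	// invocation and per unit of execution time.
	lambda.Start(handle)
}
```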
By around 2019, “serverless” had mostly come to be associated with FaaS, as BaaS had fallen out of favor. From that point, the serverless moniker began to be used for noncompute services, which worked well with the FaaS billing model: charging only for access calls and storage used, rather than for long-running server units. We’ll discuss the differences between traditional serverful and serverless computing in Chapter 1, but this new definition allows the notion of serverless to expand to storage systems and specialized services like video transcoding or AI image recognition.
While the definitions of “cloud provider” or “cloud native software” mentioned have been somewhat fluid over time, the serverless moniker has been especially fluid—a serverless enthusiast from 2014 would be quite confused by most of the services offered under that name eight years later.
One final note of disambiguation: 5G telecommunications networking has introduced the confusing term “network function as a service,” which is the idea that long-lived network routing behavior such as firewalls could run as a service on a virtualized platform that is not associated with any particular physical machine. In this case, the term “network function” implies a substantially different architecture with long-lived but mobile servers rather than a serverless distributed architecture.
How This Book Is Organized
This book is divided into four main parts.2 I tend to learn by developing a mental model of what’s going on, then trying things out to see where my mental model isn’t quite right, and finally developing deep expertise after extended usage. The parts correspond to this model, as shown in Table P-1.
Part | Chapter | Description |
---|---|---|
I | 1 | Definitions and descriptions of what serverless platforms offer. |
I | 2 | Building by learning: a stateless serverless application on Knative. |
I | 3 | A deep dive into implementing Knative, a serverless compute system. |
II | 4 | This chapter frames the serverless movement in terms of business value. |
II | 5 | With an understanding of serverless under our belt, this chapter explains how to apply the patterns from Chapter 2 to existing applications. |
II | 6 | Events are a common pattern for orchestrating stateless applications. This chapter explains various patterns of event-driven architecture. |
II | 7 | While Chapter 6 covers connecting events to an application, this chapter focuses specifically on building a serverless application that natively leverages events. |
III | 8 | After four chapters of cheerleading for serverless, this chapter focuses on patterns that can frustrate a serverless application architecture. |
III | 9 | Following Chapter 8’s warnings about serverless antipatterns, this chapter chronicles operational obstacles to serverless nirvana. |
III | 10 | While Chapter 9 focuses on the spectacular meltdowns, this chapter covers debugging tools needed to solve regular, everyday application bugs. |
IV | 11 | Historical context for the development of the serverless compute abstractions. |
Conventions Used in This Book
The following typographical conventions are used in this book:
- Italic
  Indicates new terms, URLs, email addresses, filenames, and file extensions.
- Constant width
  Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.
- Constant width bold
  Shows commands or other text that should be typed literally by the user.
- Constant width italic
  Shows text that should be replaced with user-supplied values or by values determined by context.
Tip
This element signifies a tip or suggestion.
Note
This element signifies a general note.
Warning
This element indicates a warning or caution.
Using Code Examples
Supplemental material (code examples, exercises, etc.) is available for download at https://oreil.ly/BSAK-supp.
This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.
We appreciate, but generally do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Building Serverless Applications on Knative by Evan Anderson (O’Reilly). Copyright 2024 Evan Anderson, 978-1-098-14207-0.”
If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.
O’Reilly Online Learning
Note
For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.
Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, visit https://oreilly.com.
How to Contact Us
Please address comments and questions concerning this book to the publisher:
- O’Reilly Media, Inc.
- 1005 Gravenstein Highway North
- Sebastopol, CA 95472
- 800-889-8969 (in the United States or Canada)
- 707-829-7019 (international or local)
- 707-829-0104 (fax)
- support@oreilly.com
- https://www.oreilly.com/about/contact.html
We have a web page for this book, where we list errata, examples, and any additional information. Access this page at https://oreil.ly/BuildingServerlessAppsKnative.
For news and information about our books and courses, visit https://oreilly.com.
Find us on LinkedIn: https://linkedin.com/company/oreilly-media.
Follow us on Twitter: https://twitter.com/oreillymedia.
Watch us on YouTube: https://youtube.com/oreillymedia.
Acknowledgments
This book has been at least three years in the making.3 In many ways, it’s also the result of my interactions with serverless users and builders, including authors of multiple serverless platforms as well as the broad and welcoming Knative community. This is the village that sprouted this book. This book would still be a sprout without the help of the many folks who’ve contributed to this finished product.
First are my editors at O’Reilly, who helped me through the process of actually publishing a book. John Devins first reached out to me in April 2022 about the possibility of writing a book and helped me through the early process of writing and polishing the initial proposal. Shira Evans was a patient and persistent partner as my development editor: helping me find the motivation to write and get unstuck when I slipped behind schedule, recommending additional content even when it pushed out the schedule, and providing constant optimism that this book would get done. Finally, Clare Laylock was my production editor, helping me see the book through to a finished product even after I’d read it a dozen times.
My technical reviewers provided incisive (and kind!) feedback on early drafts of this book. Without their perspectives, this book would be more tedious and less accurate. Even when the insight meant that I needed to write another section, I had the confidence that those were words that mattered. Celeste Stinger highlighted a number of security insights that had been given short shrift in my initial writing. Jess Males brought a strong sense of urgency and practical reliability thinking to refining my early drafts. And this book owes an immense debt to Joe Beda, both for having set me on the cloud native path that led me here as well as for forcing me to spell out more clearly the trade-offs of a number of exciting ideas and highlighting the parts that needed more celebration.
While the named contributors helped to make this book better, my family provided the support and motivation needed to stick with this effort. The idea of writing a book from my own experience was first kindled in my imagination by my father’s work writing a linear algebra textbook he could enjoy teaching from. My son, Erik, and my daughter, Eleanor, have been exceptionally patient finding their own entertainment as I sit in common spaces typing away on the laptop “working on the book,” and have made appreciative noises at how nice the preview PDFs look when I review them. Most of all, this book couldn’t have happened without Emily, my wife and partner, who has been my first sounding board, a helpful distraction for the kids when I needed focus time, and a realistic assessor of what I can get done when I say I want to do it all. And now here’s one more thing I’ve gotten done.
1 Including operationally focused engineers like site reliability engineers (SREs) or DevOps practitioners.
2 The last part is one chapter because I couldn’t resist adding some historical footnotes in Chapter 11.
3 Or six years if you count the start of my work on Knative. Or fourteen if we count my first use of Google App Engine in anger.