Chapter 1. REST Basics
Representational State Transfer (REST) is an architectural style first laid out in the dissertation of a man named Roy Fielding at the University of California Irvine, just a few miles from Monterey Park, CA, where I live (not that it matters—it’s just a fun fact for me).
REST is a set of constraints based on the architectural style of the World Wide Web. Writing this book in 2008, I don’t need to go into much detail about the success of the Web; it is a ubiquitous system for hypermedia and applications built on hypermedia. In this chapter, we’ll examine the basics of the REST architecture and its constraints, which are based on resource design and uniform interface interaction. This chapter is an introduction to the concepts of REST, and the remainder of the book will concentrate on applying those concepts to building RESTful services using Windows Communication Foundation (WCF).
Architecture of the World Wide Web
The success of the Web can be attributed in part to luck and timing, but some of the credit belongs to its architecture. The architecture of the Web is based on a few fundamental principles that have taken it from its small beginnings to the large mass of information and functionality that exists today. These principles include:
Addressable resources
Standard resource formats
A uniform interface for interacting with those resources
Statelessness in the interaction between clients and services
Hyperlinking to enable navigation between resources
Everything on the Web is addressable. Uniform Resource Identifiers (URIs) are used to define the locations of particular resources. Resources can be things like HTML documents, images, or other media types. Addressability is one of the important parts of the Web’s success. How easy is it for us to find things on the Web based on partial knowledge of URIs? How many advertisements or commercials have a URI placed prominently for our consumption? The fact that you can take a URI from an advertisement, type it into a browser, and have the browser return the information you wanted is actually pretty amazing.
Part of the power of the Web stems from the fact that the resources on the Web are standard media types. This makes it possible for vendors to build new web browsers (a.k.a. user agents) without having to ask any particular company or authority for permission. It means that programs and users can access a web server’s resources using any modern operating system and browser. There are certainly some real issues here in terms of the way different browsers interpret resources, but clearly those issues haven’t done much to stop the ubiquity of the Web.
Based on HTTP (Hypertext Transfer Protocol), the uniform interface of the Web also plays into this openness and interoperability. HTTP is an open and well-known protocol that defines a standard way for user agents to interact with both resources and the servers that produce the resources. These interactions are based on the verbs (or methods) that accompany each HTTP request.
GET is probably the most commonly used and well-known verb, and its name is descriptive of its effect. A GET for a particular URI returns a copy of the resource that URI represents. One of the most important features of GET requests is that the result of a GET can be cached. Caching GET requests also contributes to the scalability of the Web. Another feature of GET requests is that they are considered safe, because according to the HTTP specification, GET should not cause any side effects; that is, GET should never cause a change to a resource. Certainly, a resource might change between two GET requests, but that should be an independent action on the part of the service.
Note
Some site maintainers fail on this part of the uniform interface and use GET requests from a user agent to change a resource. These are incorrect implementations, and those individuals should have their web programming licenses revoked.
POST, which indicates a request to create a new resource, is probably the next most commonly used verb, and there are a whole host of others that we will examine later in this chapter and throughout this book.
HTTP and the Web were designed to be stateless. A stateless service is one that can process an incoming request based solely on the request itself. The concept of per-client state on the server isn’t part of the design of HTTP or the Web.
If a request from a particular user agent contains all of the state necessary to retrieve (or create) a resource, that request can be handled by any server in a farm of servers, thus creating a scalable, robust environment.
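As a rough sketch of that idea (not code from any particular framework), the hypothetical servers below compute each reply from the request alone plus a shared set of resources, so it does not matter which server in the farm receives the request. The names `Request`, `Server`, and `handle_request`, and the `/users/1` data, are assumptions made up for this illustration.

```python
from dataclasses import dataclass
import itertools

# Shared, read-only resource store that every server in the farm can see.
# The data here is invented for illustration.
RESOURCES = {"/users/1": "<user><id>1</id><name>alice</name></user>"}


@dataclass(frozen=True)
class Request:
    method: str
    uri: str


class Server:
    """A hypothetical server that keeps no per-client state between requests."""

    def __init__(self, name):
        self.name = name

    def handle_request(self, request):
        # The reply is computed from the request alone (plus shared resources).
        if request.method != "GET":
            return 405, ""
        body = RESOURCES.get(request.uri)
        return (200, body) if body is not None else (404, "")


farm = [Server("a"), Server("b"), Server("c")]
round_robin = itertools.cycle(farm)

# Send the same request to three different servers in turn.
request = Request("GET", "/users/1")
replies = [next(round_robin).handle_request(request) for _ in range(3)]
```

Because no server remembers anything about the client, all three replies are identical, and a load balancer is free to route each request wherever it likes.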
Statelessness also improves visibility into web applications. If a request contains everything needed for the server to make a proper reply, the request also contains all the data needed to track and report on that request. There is no need to go to some data source with some key and try to recreate the data that was used as part of a request in order to determine what went right, or what went wrong (this wouldn’t be ideal anyway, since that data may have changed in the meantime). Statelessness increases a web application’s manageability because the entire state of each request is contained in the request itself.
Hyperlinking between resources is also an important part of the Web’s success. The fact that one resource can link to another, enabling the user agent (often through its human driver, but sometimes not) to navigate between related resources, makes the Web interconnected in a very significant way.
The Web is the world’s largest, most scalable, and most interoperable distributed application. The success of the Web and the scalability of its architecture have led many people to want to build applications or services on top of it.
SOAP
Many individuals and organizations have tried to build on the success and scalability of the Web by describing architectures and creating toolkits for building services. Services are endpoints that can be consumed programmatically rather than by a person sitting at a computer driving an application like a web browser. The two main approaches used in these attempts have been either the SOAP protocol or the architectural style of REST.
Note
While a chapter on the subtle differences between approaches such as REST and POX (Plain Old XML over HTTP) might make for an interesting read, this chapter is more specifically focused on the architectural differences between REST and its main competitor, SOAP.
SOAP, which at one point in its history stood for Simple Object Access Protocol (before its acronym status was revoked in the 1.2 version of its specification), is what many developers think of when they hear the term web service. SOAP was born out of a coordinated attempt by many large vendors to create a standard around a programmatic Web.
In many ways, SOAP doesn’t follow the architecture of the Web at all. Although there are bindings for using SOAP over HTTP, many aspects of SOAP are at odds with the architecture of the Web.
Rather than focusing on URIs (which is the way of the Web), SOAP focuses on actions, which are essentially a thin veneer over a method call (although of course a SOAP client can’t assume a one-to-one relation between an action and a method call). In this and many other ways, SOAP is an interoperable cross-platform remote procedure call (RPC) system. SOAP-based services almost always have only one URI and many different actions. In some ways, actions are like the HTTP uniform interface, except that every single SOAP service creates new actions; this is about as un-uniform and variable as you can get.
When used over HTTP, SOAP limits itself to one part of the Web’s uniform interface: POST. This creates a limitation because results, even those that are read-only, can’t be safely cached. In many SOAP services, most actions should really use GET as the verb because they simply return read-only data. Because SOAP doesn’t use GET, and the infrastructure of the Web only supports caching responses to GET requests, SOAP results cannot be cached. To be honest, you can’t really call a SOAP-based service a web service since SOAP intentionally ignores much of the architecture of the Web. The term “SOAP service” is probably a more accurate description.
When confronted with the fact that SOAP doesn’t follow the architecture of the Web, SOAP proponents will often point out that SOAP was designed to be used over many different protocols, not just HTTP. Because it is meant to be generic and used over many different protocols, SOAP can’t take advantage of many of the Web’s features since many of those features are particular to HTTP.
REST
REST is an architectural style for building services. This style is based on the architecture of the Web, a fact that creates a fairly sharp contrast between REST and SOAP. While SOAP goes out of its way to make itself protocol-independent, REST embraces the Web and HTTP. Although it’s certainly possible to use some or all of the principles of REST over other protocols, many of its benefits are greatest when used over HTTP.
Another significant contrast is that SOAP isn’t an architectural style at all. SOAP is a specification that sets out the technical details on how two endpoints can interact in terms of the message representation, and it doesn’t offer any architectural guidance or constraints. In contrast, REST services are built to follow the specific constraints of the REST architectural style.
Note
Services that follow this style are known as RESTful. Note that these architectural constraints are more what you’d call “guidelines” than actual rules. Some services will use all of these constraints, and some will use only some of the constraints.
In their book RESTful Web Services (O’Reilly), Leonard Richardson and Sam Ruby lay out something they call the Resource Oriented Architecture (ROA), which is a stricter set of rules for determining whether a service is really RESTful.
While SOAP services are based on a service-specific set of actions and a single URI, RESTful services model the interaction with user agents based on resources. Each resource is represented by a unique URI, and the user agent uses the uniform interface of HTTP to interact with a resource via that URI. Put another way, REST services are more concerned with nouns (e.g., resources) than verbs (e.g., HTTP methods or SOAP actions) since the design of a service is about the URIs rather than a custom interface.
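To make the contrast concrete, here is a small sketch that writes both designs down as plain data. The service name, action names, and URIs are hypothetical, chosen only for illustration; they are not taken from any real service.

```python
# SOAP style: a single endpoint URI, with a service-specific action per operation.
# The action names below are hypothetical.
soap_service = {
    "uri": "/UserService",
    "actions": ["GetUser", "GetAllUsers", "AddUser", "UpdateUser", "RemoveUser"],
}

# REST style: one URI per resource, and only the verbs of the uniform interface.
rest_service = {
    "/users": ["GET", "POST"],
    "/users/{user_id}": ["GET", "PUT", "DELETE"],
}

UNIFORM_INTERFACE = {"GET", "POST", "PUT", "DELETE"}

# Every verb used by the REST design comes from the fixed, shared vocabulary.
rest_verbs = {verb for verbs in rest_service.values() for verb in verbs}
uses_only_uniform_interface = rest_verbs <= UNIFORM_INTERFACE
```

A client of the SOAP-style design has to learn five new names; a client of the REST-style design has to learn two URIs and already knows the verbs.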
Resources and URIs
The first thing to do when designing a RESTful service is to determine which resources you are going to expose. A resource is any information that you want to make available to others, such as:
All the movies playing in or near your zip code
The current price of a particular stock
All the photos Jon took on June 1, 2008
A list of all the products your company sells
As you can see, some resources are static, like pictures taken on a particular day in the past, and some resources are dynamic, like the movies playing in or near a particular zip code. Many resources are dynamic in nature, so having an addressable set of resources for your service doesn’t mean that you know all the particular resource instances when you sit down to design your service. A resource is a conceptual mapping to a particular entity or entity set that you want your service to be able to work with.
When designing a RESTful service, you will identify the resources that your service will expose and use. Once you’ve identified the resources, you’ll map them to URIs.
URI design
One of the things I like most about RESTful services is the fact that all resources are uniquely identified by a URI. The capability to retrieve a resource via a unique address is one of the big reasons the Web has been so successful.
Additionally, the use of RESTful services builds on our existing experience in using the Web. Nothing is more satisfying than using a website that has nicely designed URIs (yes, websites can be as RESTful as web services can). The utility of well-designed URIs is fairly self-evident. You can appreciate this if, like me, you have “hacked” a URI on a website to find a particular resource, even if the page you started with had no hyperlink to that resource.
An excellent example of a website that employs this resource-URI association is Flickr (http://www.flickr.com). Flickr allows you to store, view, and share photos on the Web. Here are a few of the resources that Flickr exposes for me:
All Jon’s photos
All Jon’s photos from a particular date
All Jon’s photos in a named set
All Jon’s photos with a particular tag
Here are the corresponding URIs for those resources:
I think these are pretty good URIs (although I’d prefer it if I could put in the name of a set rather than using Flickr’s identifier for a named set). This URI design allows me to easily find whichever resources (photos) I want to see. For example, if I wanted to see all of my photos taken on June 5, 2008, I would request the resource at http://www.flickr.com/photos/jonflanders/archives/date-taken/2008/06/05/.
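The date-based URI above is predictable enough that a program can build it. The sketch below assumes the pattern is exactly what that one URI suggests (user name, then `archives/date-taken`, then a zero-padded year/month/day); it is inferred from the example, not from Flickr’s documentation, and the function name `date_taken_uri` is made up.

```python
from datetime import date

BASE = "http://www.flickr.com/photos"


def date_taken_uri(user, taken):
    """Build the archive URI for photos a user took on a given date."""
    # Zero-pad month and day so June 5 becomes 06/05, as in the URI above.
    return f"{BASE}/{user}/archives/date-taken/{taken:%Y/%m/%d}/"


uri = date_taken_uri("jonflanders", date(2008, 6, 5))
```

This is the programmatic version of “hacking” a URI by hand: change the date and you get the address of a different resource.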
I mention Flickr in a book ostensibly about services, even though Flickr is a website, to emphasize two points. First, good URI design is important, as it can greatly increase the usability of a website (and therefore a RESTful service as well). Second, our human experience in using the Web can help us in designing and using RESTful services, which is one of the points in my “Why REST matters to me” list.
Note
The ironic thing about Flickr’s very RESTful URI design is that its programmatic API (which Flickr claims is based on REST) isn’t very RESTful at all from a URI point of view.
Flickr uses a design that is often referred to as a REST-RPC hybrid because it uses GET even when it modifies a resource. Flickr doesn’t rely on the uniform interface to define interactions with resources; it basically adds an action to the query string of GET requests.
The idea behind REST is to design your URIs in a way that makes logical sense based on your resource set. The URIs should, if possible, make sense to any user looking at them. If they make sense to a user looking at the URIs, they will make sense to the program that consumes the URIs programmatically. When designing the associations between resources and URIs, it may be useful to map them as if you were designing a browsable website. Even if the URIs will never be entered into a browser, this type of mapping will be useful for the person or persons writing the code to consume your service. Human-readable URIs are not strictly required for a service to be considered RESTful; they are just generally helpful when testing and debugging.
Uniform Interface
In REST, resources are identified by a unique URI. This is one of the constraints of the REST architectural style. Another constraint limits how a user agent interacts with your resources. User agents only interact with resources using the prescribed HTTP verbs. The main verbs are what we call the uniform interface. The verb that is used in a request to a particular URI indicates to the service what the user agent would like to do. When using the REST architectural style we do not make up our own verbs; we use the verbs prescribed by the HTTP standard.
The four main verbs of the uniform interface are GET, POST, PUT, and DELETE. Recall that GET is the verb that tells the service that the user agent wishes to get a read-only representation of a resource. DELETE indicates that a client wishes to delete a resource. POST indicates the desire to create a new resource. PUT is typically used for modifying an existing resource. If, however, the user agent has the knowledge to specify the URI for the new resource, PUT is used for resource creation. See Figure 1-1.
What is the advantage of the uniform interface of REST over any other service creation architecture? Why is it a useful constraint?
One reason that the uniform interface is so useful is that it frees us from having to create a new interface every time a new service is created. Creating an interface for a service endpoint is the equivalent of creating a new API, and can be hard work. Even when the API has limited scope, it can be hard work. Whole books and research papers are written on the correct approach to creating a reusable API. Doing it properly is not a trivial exercise.
On a related note, when consuming REST-based services, you don’t have to learn a new API every time you want to use a new service. Instead, you have to determine the URIs and the format of the resources (more on this later in this chapter), as well as which parts of the uniform interface the URIs will allow you to use. In some ways, once you learn how to build and use one RESTful service, you’ve learned how to build and use them all.
Another benefit of the uniform interface is the comfort you can take from the fact that GET is always safe, and the knowledge that the rest of the uniform interface’s verbs other than POST are idempotent.
Note
Idempotent means that the effect of doing something more than once will be the same as the effect of doing it only once.
You can call GET on a service or resource as many times as you want with no side effects. You can update a resource over and over with no ill effects. Deleting a resource that has already been deleted is a no-op. The only unsafe verb continues to be POST, and because the effect of POST is undefined by the HTTP specification, you’ll need to decide when implementing a service what the exact effect of POST should be (see Chapter 4 for more information about writing read/write services with REST).
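The difference between the idempotent verbs and POST is easy to see in a toy model. The sketch below uses a hypothetical in-memory store keyed by URI; the functions `put`, `delete`, and `post` stand in for what a service might do for each verb and are assumptions for illustration, not a prescribed implementation.

```python
import itertools

# A hypothetical in-memory resource store, keyed by URI.
store = {}
ids = itertools.count(1)


def put(uri, representation):
    """Create or replace the resource at a URI chosen by the caller."""
    store[uri] = representation


def delete(uri):
    """Remove the resource; deleting a missing resource is a no-op."""
    store.pop(uri, None)


def post(collection_uri, representation):
    """Create a new resource under a URI chosen by the service."""
    uri = f"{collection_uri}/{next(ids)}"
    store[uri] = representation
    return uri


# PUT twice: the store ends up the same as after one PUT.
put("/users/jon", {"name": "Jon"})
after_one_put = dict(store)
put("/users/jon", {"name": "Jon"})
put_is_idempotent = store == after_one_put

# DELETE twice: the second call changes nothing.
delete("/users/jon")
delete("/users/jon")
delete_is_idempotent = store == {}

# POST twice: each call creates another resource, so POST is not idempotent.
first = post("/users", {"name": "Jon"})
second = post("/users", {"name": "Jon"})
post_created_two = first != second and len(store) == 2
```

This is why a client can safely retry a PUT or DELETE after a network failure, but has to think twice before retrying a POST.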
Note
POST is unsafe because there aren’t any rules about what will happen when you do a POST. The service can really do anything when a POST request comes in, and the resource could be radically changed.
As well as being safe, GET also allows caching (see Chapter 11 for more information about caching and its benefits). In order to scale, a service has to be able to cache, and SOAP services, no matter what you do with them, cannot be safely cached, even when the action is one that is essentially read-only. This is because SOAP always uses POST, which can’t be cached at any level.
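A deliberately simplified sketch of that behavior follows. The `CachingClient` class is hypothetical and ignores everything a real HTTP cache has to honor (expiration, validation, cache-control headers); it only shows that a repeated GET can be answered without touching the service, while every POST has to go through.

```python
class CachingClient:
    """A hypothetical client-side cache that stores only GET responses."""

    def __init__(self, origin):
        self.origin = origin   # callable: (method, uri, body) -> response
        self.cache = {}        # uri -> cached response

    def request(self, method, uri, body=None):
        if method == "GET":
            if uri not in self.cache:
                self.cache[uri] = self.origin(method, uri, body)
            return self.cache[uri]
        # Any other verb may change the resource, so it always goes to the
        # origin and invalidates whatever was cached for that URI.
        self.cache.pop(uri, None)
        return self.origin(method, uri, body)


calls = []


def origin(method, uri, body):
    """Stand-in for the real service; records every call that reaches it."""
    calls.append((method, uri))
    return f"response to {method} {uri}"


client = CachingClient(origin)
client.request("GET", "/users/1")
client.request("GET", "/users/1")                  # served from the cache
client.request("POST", "/UserService", "<soap/>")
client.request("POST", "/UserService", "<soap/>")  # reaches the origin again
```

After these four requests, `calls` holds one GET and two POSTs: the second GET never reached the service, while the second POST did, even if the underlying action was read-only.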
Another important point about the uniform interface is that not every single resource has to implement the entire uniform interface. In fact, in many cases the only part of the uniform interface you’ll implement on a resource is GET. If a resource already exists, and will not be created, modified, or deleted by the user agent, the only job of the RESTful service will be to return that resource in response to a GET request.
Hopefully you’re beginning to see the architectural constraints of REST take shape. The constraints comprise a checklist for building a RESTful service. First, you decide what your resources are. Then you map those resources to URIs. For each of those URIs you determine which media type, or representation, you are going to accept and return.
Resource Representations
REST has no architectural constraints on physical representations of resources. This makes sense considering the varied needs of applications and users on the Web. A RESTful service’s resource type is technically known as its media type. The media type is always returned in an HTTP response as one of the HTTP headers (Content-Type).
The media type for your resources is variable, but there are a few pretty popular and commonly used ones.
XML
XML is probably the most popular format for representation of resources. It’s a well-known format, and there are libraries for processing XML on every mainstream platform. The formal media type for XML is application/xml (it used to be text/xml, but that media type has been deprecated).
When choosing XML as your data format, one of the things you’ll decide is whether to use a custom XML schema or one of the XML formats that has been standardized across applications.
RSS/Atom
Feeds are a popular beast on the Web today; they are usually associated with what are called feed readers, and with a particular kind of web application known as a web log (or just blog for short). Blogs (and other types of data exposed as feeds) syndicate (broadcast) their data, and feed readers consume that syndicated data.
The two XML schemas that are used for feed syndication are Really Simple Syndication (RSS) and the Atom Syndication Format. Atom is the more recent standard and seems to be winning the hearts and minds of most developers and companies. It is accompanied by the Atom Publishing Protocol (commonly known as APP or AtomPub), which is more than just a format specification; it is an additional set of constraints built on top of the REST architecture. AtomPub dictates the media types for a service, as well as the required uniform interface implementation for content publishing. AtomPub has grown to be used in many different applications besides classic content publishing like blogs.
See Chapter 6 for more information about feeds, and Chapter 11 for an example of the usage of Atom in a nonblog scenario.
The media type for RSS is application/rss+xml. Atom’s is application/atom+xml.
XHTML
Extensible Hypertext Markup Language (XHTML) is an HTML media type that is also valid XML. HTML is the media type (text/html) that has driven the human-readable Web for many years. HTML can be challenging to parse if you’ve ever tried it, since the rules about tags, closing tags, attributes, and so on are all very loose. XML, on the other hand, has a very strict set of format requirements. XHTML (application/xhtml+xml) is the merger of HTML and XML. It is primarily intended for display by a browser, but is easily parsed by an XML library. It is also fairly commonly used in programmatically accessible services. Some services are written to return XHTML to both browser and programmatic user agents.
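As a quick sketch of what “easily parsed by an XML library” means in practice, the snippet below feeds a small, invented XHTML document to Python’s standard XML parser and pulls out the hyperlinks. The document content and the `/users/...` URIs are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A small, invented XHTML representation of a list of users.
document = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Users</title></head>
  <body>
    <ul>
      <li><a href="/users/1">alice</a></li>
      <li><a href="/users/2">bob</a></li>
    </ul>
  </body>
</html>"""

root = ET.fromstring(document)

# Because XHTML is well-formed XML, an ordinary XML parser can walk it.
links = [
    (a.get("href"), a.text)
    for a in root.iter(f"{{{XHTML_NS}}}a")
]
```

The same document would render as a bulleted list of links in a browser, which is what makes XHTML attractive for services that serve both people and programs.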
JSON
JavaScript Object Notation (JSON) is a media type (application/json) that is a text-based resource format for representing programmatic data types. It’s a very simple and basic network data representation for objects.
Although often associated with the JavaScript language, JSON is actually used as a media type in many different programming languages and environments.
One of JSON’s selling points is its ease of use from JavaScript and Ajax-type browser-based applications. Another selling point is the size of the representation over the network. As a media type, XML tends to be much larger than the compact, terse format of JSON. Many services now return JSON exclusively, regardless of the media type requested by the user agent, even when the user agent isn’t an Ajax application in the browser. Chapter 7 covers more about JSON as a media type.
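The size difference is easy to check for yourself. The sketch below serializes one invented user record both ways using Python’s standard library; the field names and the element-per-field XML layout are assumptions, and a different XML design (attributes instead of elements, for instance) would give different numbers.

```python
import json
import xml.etree.ElementTree as ET

# An invented user record, used only for this comparison.
user = {"id": "1", "name": "alice", "email": "alice@example.com"}

# JSON representation, written without optional whitespace.
as_json = json.dumps(user, separators=(",", ":"))

# The same data as a simple custom XML format: one child element per field.
root = ET.Element("user")
for field, value in user.items():
    ET.SubElement(root, field).text = value
as_xml = ET.tostring(root, encoding="unicode")

json_size = len(as_json)
xml_size = len(as_xml)
```

For this record the JSON form comes out smaller, mostly because XML repeats every field name in a closing tag.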
Other media types
The four media types discussed in this section are not exhaustive. There are many other media types such as binary media types and images. When building a RESTful service, you have great latitude to choose your media type based on the particular application you are building. If you aren’t sure about which media type to use, try viewing some microformats at http://www.microformats.org/. Microformats are standardized media types based on common usage and behaviors. The nice thing about choosing a microformat as your media type is that it will be better known than an XML schema that you create on your own, and tools and libraries may already exist to aid you in working with those formats.
Implementing a Simple RESTful Service Example
To help you understand the concepts introduced in this chapter, let’s walk through an example that employs the basic steps of designing a RESTful service. For this example, we will use an easy-to-understand domain: a membership system that stores information about its users.
Resources
This user system will expose the following set of resources:
All users
A particular user delineated by the user’s unique identifier
This is a fairly simple set of resources, but it actually turns out that many real-life services include only a handful of resources. Of course, because a resource is a conceptual entity, there will likely be a nearly infinite number of URIs based on those resources.
URIs and Uniform Interface
For our example service, I’m going to start with the relative segments of the URIs, and I’m going to use a simple template syntax (curly braces {}) to indicate parts of the URI that will be replaced by context-specific variables (such as user ID). Table 1-1 contains a listing of the different URIs and the parts of the uniform interface we will implement for each URI.
This service has a small surface area, but you can see that it implements all the parts of the uniform interface for the user resource.
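Here is one way the curly-brace template syntax could be put to work, as a sketch only. The two templates and the verbs attached to each are assumptions based on the interactions described in this chapter (`/users` and a per-user URI with a `{user_id}` variable), and the helper names `template_to_regex` and `match` are made up.

```python
import re

# Assumed contents of the service's URI table: template -> allowed verbs.
ROUTES = {
    "/users": {"GET", "POST"},
    "/users/{user_id}": {"GET", "PUT", "DELETE"},
}


def template_to_regex(template):
    """Turn '/users/{user_id}' into a regex with a named group per variable."""
    parts = re.split(r"\{(\w+)\}", template)
    # Even positions are literal text, odd positions are variable names.
    pattern = "".join(
        f"(?P<{part}>[^/]+)" if i % 2 else re.escape(part)
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{pattern}$")


def match(method, path):
    """Return (template, variables) for an allowed request, else None."""
    for template, verbs in ROUTES.items():
        found = template_to_regex(template).match(path)
        if found and method in verbs:
            return template, found.groupdict()
    return None
```

With this table, `match("GET", "/users/1")` resolves to the per-user template with `user_id` bound to `"1"`, while `match("DELETE", "/users")` returns `None` because the collection URI does not implement that part of the uniform interface.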
Representations
If we were working with a hierarchy or linked data for the users, XHTML would be a good choice for resource representation, since it would allow us to link to related data. However, our example domain will not contain these types of links, so we will use a simpler custom XML format.
Notice that I’m using the term custom XML format instead of custom XML schema. XML schemas are another media type altogether. They are XML documents that provide constraints on the format of other XML media type instances. XML schemas are very important in the SOAP world; you might say they are essential, but they are optional in a RESTful service. If you want to create XML schemas for your representations and provide them to your consuming user agents, that’s fine. Nothing in the set of REST architectural constraints mandates it or forbids it.
Having metadata like XML schemas and Web Service Description Language (WSDL) is one of the features of SOAP services that people find very useful. The lack of such metadata in RESTful services is somewhat troubling to people who come from that world. In Chapter 9 we’ll examine the options for building up the client’s API for consuming a service that doesn’t expose a schema.
Interaction
Now that we have the basis for our RESTful service example, let’s examine the interaction that will occur between the user agent and the service.
If the service is deployed at the host example.com, the first interaction (assuming there are no users yet) will be a POST to the /users URI to create a new user (see Figure 1-2).
The user agent will send an HTTP request using POST to the /users URI, passing in the media type, as well as the resource it wishes to create as the HTTP request body. Assuming there are no error conditions, the service will return a 201 Created status code. It’s convention for a service to return the newly created resource as the response to a POST. The service can also return a Location header, which specifies the URI of the new resource. A user agent can make a GET request to the /users URI to get a list of all the resources available, which at this point will be one. This is shown in Figure 1-3.
Since we can GET all the users, we should also be able to GET a specific user. A GET request to the URI that represents user 1 will simply be a GET request to /users/1 (see Figure 1-4).
The last two parts of the uniform interface that this service implements are PUT and DELETE. Figure 1-5 shows a PUT request and Figure 1-6 shows DELETE.
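The whole conversation can be replayed against a toy, in-memory stand-in for the service. This is a sketch under several assumptions: the `UserService` class and its `handle` method are hypothetical, the XML bodies are invented, and the status codes used for PUT and DELETE (200 and 204) are common choices rather than anything the text above specifies. Only the 201 Created status and the Location header for POST come from the description.

```python
import itertools


class UserService:
    """In-memory stand-in for the example service; returns (status, headers, body)."""

    def __init__(self):
        self.users = {}
        self.ids = itertools.count(1)

    def handle(self, method, path, body=None):
        if path == "/users":
            if method == "POST":
                user_id = str(next(self.ids))
                self.users[user_id] = body
                # 201 Created, with the new resource's URI in Location.
                return 201, {"Location": f"/users/{user_id}"}, body
            if method == "GET":
                return 200, {}, list(self.users.values())
            return 405, {}, None
        if path.startswith("/users/"):
            user_id = path[len("/users/"):]
            if method == "PUT":
                created = user_id not in self.users
                self.users[user_id] = body
                return (201 if created else 200), {}, body
            if user_id not in self.users:
                return 404, {}, None
            if method == "GET":
                return 200, {}, self.users[user_id]
            if method == "DELETE":
                del self.users[user_id]
                return 204, {}, None
            return 405, {}, None
        return 404, {}, None


service = UserService()
created = service.handle("POST", "/users", "<user><name>alice</name></user>")
listing = service.handle("GET", "/users")
fetched = service.handle("GET", "/users/1")
updated = service.handle("PUT", "/users/1", "<user><name>alicia</name></user>")
deleted = service.handle("DELETE", "/users/1")
missing = service.handle("GET", "/users/1")
```

Each call maps to one step of the interaction: create a user, list all users, fetch user 1, replace user 1, and delete user 1, after which a GET for the same URI finds nothing.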
Wrap-Up
One of the things I really enjoy about REST as an architecture is the exercise I just went through. When designing a RESTful service, first determine the resources that the service will expose. Next, determine how you will map those resources to URIs, and decide which part of the uniform interface each URI should implement. Finally, choose the resource format.
This set of steps follows the architectural constraints of REST, and can help you determine what the service should look like (URIs) and how it should behave (the uniform interface). The verbs are preset, so you can concentrate solely on the nouns (resources), and you don’t have to create a new API for every service. SOAP, on the other hand, provides no real guidelines for what a service should look like or do. Each of the actions is created out of nothing with no real guidance for what it should be. REST builds on knowledge that you already have about URIs, and tells you exactly what each of those URIs can potentially do by restricting you to the uniform interface. This is one of the design constraints of REST, and, if I can interject a little personal opinion into this chapter, it’s one that I enjoy.
Admittedly, there is still data variability in RESTful services, since REST does not impose constraints on resource media types. However, this lack of data constraints is outweighed by the great utility of the REST interface and addressing constraints.
Another benefit of the REST constraints is that they become easier to apply with each service that you build. Once you learn REST, you can easily identify which parts of the architectural constraints are being used on a service, which makes it increasingly easy to determine which constraints you should use in the future.
Processes
One criticism some people have about REST is its lack of support for the concept of a processing endpoint that models a particular process. Services can sometimes expose functionality that either doesn’t seem to fit well within the concept of a resource or doesn’t seem to fit well within the semantics of the uniform interface. For example, consider a service that is designed to implement bank transfers from one account to another. Clearly, you can create each account as a separate resource and use the uniform interface to specify the operations that users can perform on each account. But what resource represents a transfer between two accounts?
This is really a matter of having the right point of view. If you view this type of operation as a function, it will not fall neatly into the REST model. You can, however, treat it as a temporary resource.
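One way to picture the temporary-resource approach is sketched below. Everything here is hypothetical: the `/accounts` and `/transfers` URIs, the balances, and the `post_transfer` and `get_transfer` functions are invented to show the shape of the idea, not a recommended banking design.

```python
import itertools

accounts = {"/accounts/1": 100, "/accounts/2": 20}   # URI -> balance (invented)
transfers = {}                                        # URI -> transfer resource
transfer_ids = itertools.count(1)


def post_transfer(source, target, amount):
    """Handle a POST to /transfers: create a transfer resource and apply it."""
    uri = f"/transfers/{next(transfer_ids)}"
    if accounts[source] >= amount:
        accounts[source] -= amount
        accounts[target] += amount
        status = "completed"
    else:
        status = "rejected"
    # The transfer is itself a resource that can be fetched later with GET.
    transfers[uri] = {
        "from": source, "to": target, "amount": amount, "status": status,
    }
    return 201, uri


def get_transfer(uri):
    """Handle a GET for a transfer resource."""
    return transfers[uri]


status_code, transfer_uri = post_transfer("/accounts/1", "/accounts/2", 30)
```

The operation that looked like a function call (“transfer 30 from account 1 to account 2”) has become a noun with its own URI, which the user agent can inspect afterward with an ordinary GET.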
Note
In a typical distributed system, this type of operation would generally be wrapped in a transaction. Of course, REST doesn’t use the concept of transactions, but you could also represent transactions as resources.
This idea doesn’t resonate with some people, even when all the other parts of REST as an architecture do. This is a design decision you may encounter and be faced with. It also may be that you never will run into this kind of decision, or that you are completely happy with the idea of a transaction as a resource.
Some people look at this problem and decide to stick with SOAP services. Others look at it and decide simply to overload on POST. And others try to push REST and the concept of resources to their fullest, and will model everything as resources (even processes).
Summary
This chapter discussed the basics of creating RESTful services and using REST as an architecture. There are some core tenets of REST that you’ll want to keep with you as you read through the book.
First, REST uses the same tenets for building services as the Web. Resources are named entities that we’d like to interact with. Resources are addressable using URIs. The interaction between our code and those URIs is done using the uniform interface. The constraints of the REST architectural style are simple, elegant, and easy to remember, and are the foundations with which arguably the world’s largest, most scalable distributed application was built.
REST employs architectural constraints for building services, and you are free to use as many or as few of the constraints as you like (although, if you only use a few, you may have to argue with purists if you want to advertise your service as RESTful).