Chapter 1. Surfing the Web
The World Wide Web became popular because ordinary people can use it to do really useful things with minimal training. But behind the scenes, the Web is also a powerful platform for distributed computing.
The principles that make the Web usable by ordinary people also work when the "user" is an automated software agent. A piece of software designed to transfer money between bank accounts (or carry out any other real-world task) can accomplish the task using the same basic technologies a human being would use.
As far as this book is concerned, the Web is based on three technologies: the URL naming convention, the HTTP protocol, and the HTML document format. URL and HTTP are simple, but to apply them to distributed programming you must understand them in more detail than the average web developer does. The first few chapters of this book are dedicated to giving you this understanding.
The story of HTML is a little more complicated. In the world of web APIs, there are dozens of data formats competing to take the place of HTML. An exploration of these formats will take up several chapters of this book, starting in Chapter 5. For now, I want to focus on URL and HTTP, and use HTML solely as an example.
I'm going to start off by telling a simple story about the World Wide Web, as a way of explaining the principles behind its design and the reasons for its success. The story needs to be simple because although you're certainly familiar with the Web, you might not have heard of the concepts that make it work. I want you to have a simple, concrete example to fall back on if you ever get confused about terminology like "hypermedia as the engine of application state."
Let's get started.
Episode 1: The Billboard
One day Alice is walking around town and she sees a billboard (Figure 1-1).
(By the way, this fictional billboard advertises a real website that I designed for this book. You can try it out yourself.)
Alice is old enough to remember the mid-1990s, so she recalls the public's reaction when URLs started showing up on billboards. At first, people made fun of these weird-looking strings. It wasn't clear what "http://" or "youtypeitwepostit.com" meant. But 20 years later, everyone knows what to do with a URL: you type it into the address bar of your web browser and hit Enter.
And that's what Alice does: she pulls out her mobile phone and puts http://www.youtypeitwepostit.com/ in her browser's address bar. The first episode of our story ends on a cliffhanger: what's at the other end of that URL?
Resources and Representations
Sorry for interrupting the story, but I need to introduce some basic terminology. Alice's web browser is about to send an HTTP request to a web server, specifically to the URL http://www.youtypeitwepostit.com/. One web server may host many different URLs, and each URL grants access to a different bit of the data on the server.
We say that a URL is the URL of some thing: a product, a user, the home page. The technical term for the thing named by a URL is resource.
The URL http://www.youtypeitwepostit.com/ identifies a resource, probably the home page of the website advertised on the billboard. But you won't know for sure until we resume the story and Alice's web browser sends the HTTP request.
When a web browser sends an HTTP request for a resource, the server sends a document in response (usually an HTML document, but sometimes a binary image or something else). Whatever document the server sends, we call that document a representation of the resource.
So each URL identifies a resource. When a client makes an HTTP request to a URL, it gets a representation of the underlying resource. The client never sees a resource directly.
I'll talk a lot more about resources and representations in Chapter 3. Right now I just want to use the terms resource and representation to discuss the principle of addressability, to which I'll now turn.
Addressability
A URL identifies one and only one resource. If a website has two conceptually different things on it, we expect the site to treat them as two resources with different URLs. We get frustrated when a website violates this rule. Websites for restaurants are especially bad about this. Frequently, the whole site is buried inside a Flash interface and there's no URL that points to the menu or to the map that shows where the restaurant is located, things we would like to talk about on their own.
The principle of addressability just says that every resource should have its own URL. If something is important to your application, it should have a unique name, a URL, so that you and your users can refer to it unambiguously.
Episode 2: The Home Page
Back to our story. When Alice enters the URL from the billboard into her browser's address bar, it sends an HTTP request over the Internet to the web server at http://www.youtypeitwepostit.com/:
    GET / HTTP/1.1
    Host: www.youtypeitwepostit.com
The web server handles this request (neither Alice nor her web browser needs to know how) and sends a response:
    HTTP/1.1 200 OK
    Content-type: text/html

    <!DOCTYPE html>
    <html>
      <head>
        <title>Home</title>
      </head>
      <body>
        <div>
          <h1>You type it, we post it!</h1>
          <p>Exciting! Amazing!</p>
          <p class="links">
            <a href="/messages">Get started</a>
            <a href="/about">About this site</a>
          </p>
        </div>
      </body>
    </html>
The 200 at the beginning of the response is a status code, also called a response code. It's a quick way for the server to tell the client approximately what happened to the client's request. There are a lot of HTTP status codes, and I cover them all in Appendix A, but the most common one is the one you see here. 200 (OK) means that the request was fulfilled with no problems.
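As a rough illustration of where the status code sits in a response, here is a minimal Python sketch that splits the first line of the response from the story into its parts. The function name `parse_status_line` is made up for this example, and the raw response text is abbreviated; a real client would rely on an HTTP library rather than string splitting.

```python
from http import HTTPStatus


def parse_status_line(raw_response: str) -> tuple[str, int, str]:
    """Split the first line of an HTTP response into version, code, and reason."""
    status_line = raw_response.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


# The start of the response from the story, with explicit CRLF line endings.
# The HTML body is abbreviated here.
raw = "HTTP/1.1 200 OK\r\nContent-type: text/html\r\n\r\n<!DOCTYPE html>..."

version, code, reason = parse_status_line(raw)

# HTTPStatus is the standard library's table of registered status codes.
print(code, HTTPStatus(code).phrase)
```

The client only needs the three-digit number to decide what happened; the reason phrase ("OK") is there for human readers.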
Alice's web browser decodes the response as an HTML document and displays it graphically (see Figure 1-2).
Now Alice can read the web page and understand what the billboard was talking about. It was advertising a microblogging site, similar to Twitter. Not as exciting as advertised on the billboard, but good enough as an example.
Alice's first real interaction with the web server reveals a couple more important features of the Web.
Short Sessions
At this point in the story, Alice's web browser is displaying the site's home page. From her perspective, she's "landed" on that page, which is her current "location" in cyberspace. But as far as the server is concerned, Alice isn't anywhere. The server has already forgotten about her.
HTTP sessions last for one request. The client sends a request, and the server responds. This means Alice could turn her phone off overnight, and when her browser restored the page from its internal cache, she could click on one of the two links on this page and it would still work. (Compare this to an SSH session, which is terminated if you turn your computer off.)
Alice could leave this web page open in her phone for six months, and when she finally clicks on a link, the web server would respond as if she'd only waited a few seconds. The web server isn't sitting up late at night worrying about Alice. When she's not making an HTTP request, the server doesn't know Alice exists.
This principle is sometimes called statelessness. I think this is a confusing term because the client and the server in this system both keep state; they just keep different kinds of state. The term "statelessness" is getting at the fact that the server doesn't care what state the client is in. (I'll talk more about the different kinds of state in the following sections.)
Self-Descriptive Messages
It's clear from looking at the HTML that this site is more than just a home page. The markup for the home page contains two links: one to the relative URL /about (i.e., to http://www.youtypeitwepostit.com/about) and one to /messages (i.e., http://www.youtypeitwepostit.com/messages). At first Alice only knew one URL (the URL to the home page), but now she knows three. The server is slowly revealing its structure to her.
We can draw a map of the website so far (Figure 1-3), as revealed to Alice by the server.
What's on the other end of the /messages and /about links? The only way to be sure is to follow them and find out. But Alice can look at the HTML markup, or her browser's graphical rendering of the markup, and make an educated guess. The link with the text "About this site" probably goes to a page talking about the site. That's nice, but the link with the text "Get started" is probably the one that gets her closer to actually posting a message.
When you request a web page, the HTML document you receive doesn't just give you the immediate information you asked for. The document also helps you answer the question of what to do next.
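To make the idea concrete, here is a small Python sketch of how a program might pull the "what to do next" information out of the home page. It uses only the standard library. The class name `LinkCollector` is invented for this example, and the sketch assumes the simple, well-formed markup shown in the story.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (absolute URL, link text) pairs from <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None  # URL of the <a> tag we are currently inside, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                # Resolve relative URLs such as /about against the page's URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# The body of the home page representation from the story.
HOME_PAGE = """
<div>
  <h1>You type it, we post it!</h1>
  <p>Exciting! Amazing!</p>
  <p class="links">
    <a href="/messages">Get started</a>
    <a href="/about">About this site</a>
  </p>
</div>
"""

collector = LinkCollector("http://www.youtypeitwepostit.com/")
collector.feed(HOME_PAGE)
for url, text in collector.links:
    print(url, "->", text)
```

The program ends up knowing the same three URLs Alice does: the one it started with, plus the two the server revealed.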
Episode 3: The Link
After reading the home page, Alice decides to give this site a try. She clicks the link that says "Get started." Of course, whenever you click a link in your web browser, you're telling your web browser to make an HTTP request.
The code for the link Alice clicked on looks like this:
<a href="/messages">Get started</a>
So her browser makes this HTTP request to the same server as before:
    GET /messages HTTP/1.1
    Host: www.youtypeitwepostit.com
That GET in the request is an HTTP method, also known as an HTTP verb. The HTTP method is the client's way of telling the server what it wants to do to a resource. "GET" is the most common HTTP method. It means "give me a representation of this resource." For a web browser, GET is the default. When you follow a link or type a URL into the address bar, your browser sends a GET request.
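Here is a minimal sketch of how a URL turns into the GET request shown above. The helper `build_get_request` is hypothetical and produces only the two lines from the story; a real browser sends many more headers.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Build a bare-bones HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    # An empty path means the root resource, "/".
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # The path goes on the request line; the hostname goes in the Host header.
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


print(build_get_request("http://www.youtypeitwepostit.com/messages"))
```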
The server handles this particular GET request by sending a representation of /messages:
    HTTP/1.1 200 OK
    Content-type: text/html
    ...

    <!DOCTYPE html>
    <html>
      <head>
        <title>Messages</title>
      </head>
      <body>
        <div>
          <h1>Messages</h1>
          <p>
            Enter your message below:
          </p>
          <form action="http://youtypeitwepostit.com/messages" method="post">
            <input type="text" name="message" value="" required="true" maxlength="6"/>
            <input type="submit" value="Post" />
          </form>
          <div>
            <p>
              Here are some other messages, too:
            </p>
            <ul>
              <li><a href="/messages/32740753167308867">Later</a></li>
              <li><a href="/messages/7534227794967592">Hello</a></li>
            </ul>
          </div>
          <p class="links">
            <a href="http://youtypeitwepostit.com/">Home</a>
          </p>
        </div>
      </body>
    </html>
As before, Alice's browser renders the HTML graphically (Figure 1-4).
When Alice looks at the graphical rendering, she sees that this page is a list of messages other people have published on the site. Right at the top there's an inviting text box and a Post button.
Now we've revealed a little more about how the server works. Figure 1-5 shows an updated map of the site, as seen by Alice's browser.
Standardized Methods
Both of Alice's HTTP requests used GET as their HTTP method. But there's a bit of HTML in the latest representation that will trigger an HTTP POST request if Alice clicks the Post button:
    <form action="http://youtypeitwepostit.com/messages" method="post">
      <input type="text" name="message" value="" required="true" maxlength="6"/>
      <input type="submit" />
    </form>
The HTTP standard (RFC 2616) defines eight methods a client can apply to a resource. In this book, I'll focus on five of them: GET, HEAD, POST, PUT, and DELETE. In Chapter 3, I'll cover these methods in detail, along with an extension method, PATCH, designed specifically for use in web APIs. Right now the important thing to keep in mind is that there are a small number of standard methods.
It's not impossible to come up with a new HTTP method (it happened with PATCH), but it's a very big deal. This is not like a programming language, where you can name your methods whatever you want. When I built the simple microblogging website for use in this example, I didn't define new HTTP methods like GETHOMEPAGE and HELLOPLEASESHOWMETHEMESSAGELISTTHANKSBYE. I used GET for both "show the home page" and "show the message list," because in both cases GET ("give me a representation of this resource") was the best match between HTTP's interface and what I wanted to do. I distinguished between the home page and the message list not by defining new methods, but by treating those two documents as separate resources, each with its own URL, each accessible through GET.
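The difference between "new methods" and "new resources" can be sketched as a routing table. The Python below is an illustration only, not the code behind the example site: the handler names and the `handle` function are invented, and the choice of 501 for an unknown method and 405 for a known method the resource doesn't support follows the usual reading of the HTTP status codes.

```python
# The eight methods defined by RFC 2616.
STANDARD_METHODS = {"OPTIONS", "GET", "HEAD", "POST", "PUT", "DELETE", "TRACE", "CONNECT"}


def show_home_page():
    return "<h1>You type it, we post it!</h1>"


def show_message_list():
    return "<h1>Messages</h1>"


# Two resources, one method. The URL distinguishes them, not the verb.
ROUTES = {
    ("GET", "/"): show_home_page,
    ("GET", "/messages"): show_message_list,
}


def handle(method, path):
    """Return a (status code, body) pair for a request."""
    if method not in STANDARD_METHODS:
        return 501, ""  # Not Implemented: the server doesn't know this method
    handler = ROUTES.get((method, path))
    if handler is None:
        known_paths = {route_path for _, route_path in ROUTES}
        if path in known_paths:
            return 405, ""  # Method Not Allowed on an existing resource
        return 404, ""  # Not Found: no such resource
    return 200, handler()


print(handle("GET", "/"))
print(handle("GET", "/messages"))
print(handle("GETHOMEPAGE", "/"))
```

Adding a third page to this sketch means adding a row with a new URL, not inventing a new verb.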
Episode 4: The Form and the Redirect
Back to our story. Alice is tempted by the form on the microblogging site. She types in "Test" and clicks the Post button.
Again, Alice's browser makes an HTTP request:
    POST /messages HTTP/1.1
    Host: www.youtypeitwepostit.com
    Content-type: application/x-www-form-urlencoded

    message=Test&submit=Post
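The last line of that request is the form data, serialized in the application/x-www-form-urlencoded format named by the Content-type header. As a quick sketch, Python's standard library can produce and decode the same string. The field list is taken from the request above; the variable names are just for illustration.

```python
from urllib.parse import parse_qs, urlencode

# The form fields, in the order they appear in the request body.
fields = [("message", "Test"), ("submit", "Post")]

# Serialize the fields the way a browser does for a form submission.
body = urlencode(fields)
print(body)

# A server reverses the process to recover the fields.
decoded = parse_qs(body)
print(decoded["message"])
```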
And the server responds with the following:
    HTTP/1.1 303 See Other
    Content-type: text/html
    Location: http://www.youtypeitwepostit.com/messages/5266722824890167
When Alice's browser made its two GET requests, the server sent the HTTP status code 200 ("OK") and provided an HTML document for Alice's browser to render. There's no HTML document here, but the server did provide a link to another URL, in the Location header, and here the status code at the beginning of the response is 303 ("See Other"), not 200 ("OK").
Status code 303 tells Alice's browser to automatically make a fourth HTTP request, to the URL given in the Location header. Without asking Alice's permission, her browser does just that:
GET /messages/5266722824890167 HTTP/1.1
This time, the server responds with 200 ("OK") and an HTML document:
    HTTP/1.1 200 OK
    Content-type: text/html

    <!DOCTYPE html>
    <html>
      <head>
        <title>Message</title>
      </head>
      <body>
        <div>
          <h2>Message</h2>
          <dl>
            <dt>ID</dt><dd>2181852539069950</dd>
            <dt>DATE</dt><dd>2014-03-28T21:51:08Z</dd>
            <dt>MSG</dt><dd>Test</dd>
          </dl>
          <p class="links">
            <a href="http://www.youtypeitwepostit.com/">Home</a>
          </p>
        </div>
      </body>
    </html>
Alice's browser displays this document graphically (Figure 1-6), and, finally, goes back to waiting for Alice's input.
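The browser's handling of the 303 response can be sketched as a small decision function. This Python is a simplified illustration: the name `follow_up_request` is made up, and it handles only the 303 case from the story, not the other redirect codes.

```python
from urllib.parse import urljoin


def follow_up_request(request_url, status, headers):
    """Return the (method, url) a client should request next, or None."""
    # Header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    location = normalized.get("location")
    if status == 303 and location is not None:
        # 303 means: fetch the Location URL with GET, even though the
        # request that triggered the redirect was a POST.
        return ("GET", urljoin(request_url, location))
    return None


# The response to Alice's POST, as status code and headers.
response_headers = {
    "Content-type": "text/html",
    "Location": "http://www.youtypeitwepostit.com/messages/5266722824890167",
}
print(follow_up_request("http://www.youtypeitwepostit.com/messages", 303, response_headers))
```

A 200 response yields no follow-up request: the client renders the document and waits for the user.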
Note
I'm sure you've encountered HTTP redirects before, but HTTP is full of small features like this, and some may be new to you. There are many ways for the server to tell the client to handle a response differently, and ways for the client to attach conditions or extra features to its request. A big part of API design is the proper use of these features. Chapter 11 covers the features of HTTP that are most important to web APIs, and Appendix A and Appendix B provide supplementary information on this topic.
By looking at the graphical rendering, Alice sees that her message ("Test") is now a fully fledged post on YouTypeItWePostIt.com. Our story ends here: Alice has accomplished her goal of trying out the microblogging site. But there's a lot to be learned from these four simple interactions.
Application State
Figure 1-7 is a state diagram that shows Alice's entire adventure from the perspective of her web browser.
When Alice started up the browser on her phone, it didn't have any particular page loaded. It was an empty slate. Then Alice typed in a URL and a GET request took the browser to the site's home page. Alice clicked a link, and a second GET request took the browser to the list of messages. She submitted a form, which caused a third request (a POST request). The response to that was an HTTP redirect, which Alice's browser followed automatically with a fourth request. Alice's browser ended up at a web page describing the message Alice had just created.
Every state in this diagram corresponds to a particular page (or to no page at all) being open in Alice's browser window. In REST terms, we call this bit of information (which page are you on?) the application state.
When you surf the Web, every transition from one application state to another corresponds to a link you decided to follow or a form you decided to fill out. Not all transitions are available from all states. Alice can't make her POST request directly from the home page, because the home page doesn't feature the form that allows her browser to construct the POST request.
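One way to picture application state is as a small state machine whose transitions come from the links and forms on each page. The Python sketch below is a simplification with invented state and action names: it folds the POST and the redirect into a single "submit form" step, and it leaves out the links to individual messages.

```python
# Each state maps the actions available on that page to the resulting state.
TRANSITIONS = {
    "no page": {"type URL": "home page"},
    "home page": {
        "follow 'Get started'": "message list",
        "follow 'About this site'": "about page",
    },
    "message list": {
        "submit form": "message page",  # POST, then the automatic redirect
        "follow 'Home'": "home page",
    },
    "message page": {"follow 'Home'": "home page"},
}


def follow(state, action):
    """Move to the next application state, if the current page allows it."""
    available = TRANSITIONS.get(state, {})
    if action not in available:
        raise ValueError(f"{action!r} is not available from {state!r}")
    return available[action]


# Replay Alice's path through the site.
state = "no page"
for action in ["type URL", "follow 'Get started'", "submit form"]:
    state = follow(state, action)
print(state)
```

Calling `follow("home page", "submit form")` raises an error, for the same reason Alice can't post from the home page: that page doesn't offer the form.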
Resource State
Figure 1-8 is a state diagram showing Alice's adventure from the perspective of the web server.
The server manages two resources: the home page (served from /) and the message list (served from /messages). (The server also manages a resource for each individual message, and one for the "About this site" page. I've omitted those resources from the diagram for the sake of simplicity.) The state of these resources is called, simply enough, resource state.
When the story begins, there are two messages in the message list: "Hello" and "Later." Sending a GET to the home page doesn't change resource state, since the home page is a static document that never changes. Sending a GET to the message list won't change the state either.
But when Alice sends a POST to the message list, it puts the server in a new state. Now the message list contains three messages: "Hello," "Later," and "Test." There's no way back to the old state, but this new state is very similar. As before, sending a GET to the home page or message list won't change anything. But sending another POST to the message list will add a fourth message to the list.
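The server's side of the story can be sketched as a tiny in-memory object. This is a toy stand-in, not the real site's implementation: the class `MessageList` and its `get` and `post` methods are hypothetical names chosen to mirror the HTTP methods.

```python
class MessageList:
    """A toy stand-in for the resource served from /messages."""

    def __init__(self, messages):
        self._messages = list(messages)

    def get(self):
        # GET returns a snapshot of the current state and changes nothing.
        return tuple(self._messages)

    def post(self, text):
        # POST changes resource state: the list gains one message.
        self._messages.append(text)
        return len(self._messages)


# The state when the story begins.
message_list = MessageList(["Hello", "Later"])

before = message_list.get()
message_list.get()  # any number of GETs leaves the state as it was
after_gets = message_list.get()

message_list.post("Test")  # Alice's form submission
after_post = message_list.get()

print(before, after_gets, after_post)
```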
Because HTTP sessions are so short, the server doesn't know anything about a client's application state. The client has no direct control over resource state; all that stuff is kept on the server. And yet, the Web works. It works through REST: representational state transfer.
Application state is kept on the client, but the server can manipulate it by sending representations (HTML documents, in this case) that describe the possible state transitions. Resource state is kept on the server, but the client can manipulate it by sending the server a representation (an HTML form submission, in this case) describing the desired new state.
Connectedness
In the story, Alice made four HTTP requests to YouTypeItWePostIt.com, and she got three HTML documents in return. Although Alice didn't follow every single link in those documents, we can use those links to build a rough map of the website from the client's perspective (Figure 1-9).
This is a web of HTML pages. The strands of the web are the HTML <a> tags and <form> tags, each describing a GET or POST HTTP request Alice might decide to make. I call this the principle of connectedness: each web page tells you how to get to the adjoining pages.
The Web as a whole works on the principle of connectedness, which is better known as "hypermedia as the engine of application state," sometimes abbreviated HATEOAS. I prefer "connectedness" or "the hypermedia constraint," because "hypermedia as the engine of application state" sounds intimidating. But at this point, you should have no reason to find it intimidating. You know what application state is: it's which web page a client is on. Hypermedia is the general term for things like HTML links and forms: the techniques a server uses to explain to a client what it can do next.
To say that hypermedia is the engine of application state is to say that we all navigate the Web by filling out forms and following links.
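A program can build the same kind of rough map from the documents alone. The Python sketch below reads abbreviated copies of the three representations Alice received and records, for each page, the requests its <a> and <form> tags describe. The names `ControlCollector`, `PAGES`, and `site_map` are invented for this example, and the HTML is trimmed down to the links and the form.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ControlCollector(HTMLParser):
    """Record the (method, URL) request described by each <a> and <form> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.controls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            # Following a link is always a GET.
            self.controls.append(("GET", urljoin(self.base_url, attrs["href"])))
        elif tag == "form" and "action" in attrs:
            # A form names its own method; HTML forms default to GET.
            method = attrs.get("method", "get").upper()
            self.controls.append((method, urljoin(self.base_url, attrs["action"])))


SITE = "http://www.youtypeitwepostit.com"

# Abbreviated copies of the three documents from the story.
PAGES = {
    SITE + "/": """
        <a href="/messages">Get started</a>
        <a href="/about">About this site</a>
    """,
    SITE + "/messages": """
        <form action="http://youtypeitwepostit.com/messages" method="post">
          <input type="text" name="message" value="" required="true" maxlength="6"/>
          <input type="submit" value="Post" />
        </form>
        <a href="/messages/32740753167308867">Later</a>
        <a href="/messages/7534227794967592">Hello</a>
        <a href="http://youtypeitwepostit.com/">Home</a>
    """,
    SITE + "/messages/5266722824890167": """
        <a href="http://www.youtypeitwepostit.com/">Home</a>
    """,
}

site_map = {}
for page_url, html in PAGES.items():
    collector = ControlCollector(page_url)
    collector.feed(html)
    site_map[page_url] = collector.controls

for page_url, controls in site_map.items():
    print(page_url)
    for method, target in controls:
        print("   ", method, target)
```

Nothing in the sketch knows the site's URL layout in advance; every edge in the map comes from a document the server sent.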
The Web Is Something Special
Alice's story doesn't seem that exciting, because the World Wide Web has been the dominant Internet application for the past 20 years. But back in the 1990s, this was a very exciting story. If you compare the World Wide Web to its early competitors, you'll see the difference.
The Gopher protocol (defined in RFC 1436) looks a lot like HTTP, but it lacks addressability. There is no succinct way to identify a specific document in Gopherspace. At least there wasn't until the World Wide Web took pity on Gopherspace and released the URL standard (first defined in RFC 1738), which provides a gopher:// URL scheme that works just like http://.
FTP, a popular pre-Web protocol for file transfer (defined in RFC 959), also lacks addressability. Until RFC 1738 came along with its ftp:// URL scheme, there simply was no machine-readable way to point to a file on an FTP server. You had to use English prose to explain where the file was. It took the brainpower of a human being just to locate a file on a server. What a waste!
FTP also featured long-lived sessions. A casual user could log on to an FTP server and tie up one of the server's TCP connections indefinitely. By contrast, even a "persistent" HTTP connection shouldn't tie up a TCP connection for longer than 30 seconds.
The 1990s saw a lot of Internet protocols for searching different kinds of archives and databases: protocols like Archie, Veronica, Jughead, WAIS, and Prospero. But it turns out we don't need all those protocols. We just need to be able to send GET requests to different kinds of websites. All these protocols died out or were replaced by websites. Their complex protocol-specific rules were folded into the uniformity of HTTP GET.
Once the Web took over, it became a lot more difficult to justify creating a new application protocol. Why create a new tool that only techies will understand, when you can put up a website that anyone can use? All successful post-Web protocols do something the Web can't do: peer-to-peer protocols like BitTorrent and real-time protocols like SSH. For most purposes, HTTP is good enough.
The unprecedented flexibility of the Web comes from the principles of REST. In the 1990s, we discovered that the Web works better than its competition. In 2000, Roy T. Fielding's Ph.D. dissertation[6] explained why this is, coining the term "REST" in the process.
Web APIs Lag Behind the Web
The Fielding dissertation also explains a lot about the problems of web APIs in the 2010s. The simple website I just walked you through is much more sophisticated than most currently deployed web APIs, even self-proclaimed REST APIs. If you've ever designed a web API, or written a client for one, you've probably encountered some of these problems:
Web APIs frequently have human-readable documentation that explains how to construct URLs for all the different resources. This is like writing English prose explaining how to find a particular file on an FTP server. If websites did this, no one would bother to use the Web.
Instead of telling you what URLs to type in, websites embed URLs in <a> tags and <form> tags: hypermedia controls that you can activate by clicking a link or a button.
In REST terms, putting information about URL construction in separate human-readable documents violates the principles of connectedness and self-descriptive messages.
Lots of websites have help docs, but when was the last time you used them? Unless there's a serious problem (you bought something and it was never delivered), it's easier to click around and figure out how the site works by exploring the connected, self-descriptive HTML documents it sends you.
Today's APIs present their resources in a big menu of options instead of an interconnected web. This makes it difficult to see what one resource has to do with another.
Integrating with a new API inevitably requires writing custom software, or installing a one-off library written by someone else. But you don't need to write custom software to use a new website. You see a URL on a billboard and plug it into your web browser, the same client you use for every other website in the world.
We'll never get to the point where a single API client can understand every API in the world. But today's clients contain a lot of code that really ought to be refactored out into generic libraries. This will only become possible when APIs serve self-descriptive representations.
When APIs change, custom API clients break and have to be fixed. But when a website undergoes a redesign, the site's users grumble about the redesign and then they adapt. Their browsers don't stop working.
In REST terms, the website redesign is entirely encapsulated in the self-descriptive HTML documents served by the website. A client that could understand the old HTML documents can understand the new ones.
These are the problems I'm trying to solve with this book. The good news is that it used to be a lot worse. A few years ago, it was common to see RESTful APIs that used safe HTTP methods in unsafe ways, or mixed up application and resource state. This doesn't happen much anymore. Designs have gotten better, and they can get better still.
The Semantic Challenge
Now for the bad news. The story I've told you, the story of Alice's trip through a website, went as smoothly as it did thanks to a very slow and expensive piece of hardware: Alice herself. Every time her browser rendered a web page, Alice, a human being, had to look at the rendered page and decide what to do next. The Web works because human beings make all the decisions about which links to click and which forms to fill out.
The whole point of web APIs is to get things done without making a human sit in front of a web browser all day. How can we program a computer to make the decisions about which links to click? A computer can parse the HTML markup <a href="/messages">Get started</a>, but it can't understand the phrase "Get started." Why bother to design APIs that serve self-descriptive messages if those messages won't be understood by their software consumers?
This is the biggest challenge in web API design: bridging the semantic gap between understanding a document's structure and understanding what it means. As a shorthand, I'm going to call it the semantic challenge. Very little progress has been made on the semantic challenge, and we will never solve it completely. The good news is that because so little progress has been made so far, the first bit of progress is really easy. We just have to start working together, instead of duplicating each other's work.
I'll be checking in with the semantic challenge over the next few chapters, as I talk about the technologies of the Web and how you can use them in API designs. By Chapter 8, we'll have the tools necessary to tackle the semantic challenge head-on.
[6] Fielding, Roy Thomas. Architectural Styles and the Design of Network-based Software Architectures. Doctoral dissertation, University of California, Irvine, 2000.