Chapter 9. The Building Blocks of Services
Throughout this book I’ve said that web services are based on three fundamental technologies: HTTP, URIs, and XML. But there are also lots of technologies that build on top of these. You can usually save yourself some work and broaden your audience by adopting these extra technologies: perhaps a domain-specific XML vocabulary, or a standard set of rules for exposing resources through HTTP’s uniform interface. In this chapter I’ll show you several technologies that can improve your web services. Some you’re already familiar with and some will probably be new to you, but they’re all interesting and powerful.
Representation Formats
What representation formats should your service actually send and receive? This is the question of how data should be represented, and it’s an epic question. I have a few suggestions, which I present here in a rough order of precedence. My goal is to help you pick a format that says something about the semantics of your data, so you don’t find yourself devising yet another one-off XML vocabulary that no one else will use.
I assume your clients can accept whatever representation format you serve. The known needs of your clients take priority over anything I can say here. If you know your data is being fed directly into Microsoft Excel, you ought to serve representations in Excel format or a compatible CSV format. My advice also does not extend to document formats that can only be understood by humans. If you’re serving audio files, I’ve got nothing to say about which audio format you should choose. To a first approximation, a programmed client finds all audio files equally unintelligible.
XHTML
Media type: application/xhtml+xml
The common text/html media type is deprecated for XHTML. It’s also the only media type that Internet Explorer handles as HTML. If your service might be serving XHTML data directly to web browsers, you might want to serve it as text/html.
My number-one representation recommendation is the format I’ve been using in my own services throughout this book, and the one you’re probably most familiar with. HTML drives the human web, and XHTML can drive the programmable web. The XHTML standard (http://www.w3.org/TR/xhtml1/) relies on the HTML standard to do most of the heavy lifting (http://www.w3.org/TR/html401/).
XHTML is HTML under a few restrictions that make every XHTML document also valid XML. If you know HTML, you know most of what there is to know about XHTML, but there are some syntactic differences, like how to present self-closing tags. The tag names and attributes are the same: XHTML is expressive in the same ways as HTML. Since the XHTML standard just points to the HTML standard and then adds some restrictions to it, I tend to refer to “HTML tags” and the like except where there really is a difference between XHTML and HTML.
I don’t actually recommend HTML as a representation format, because it can’t be reliably parsed with an XML parser. There are many excellent and liberal HTML parsers, though (I mentioned a few in Chapter 2), so your clients have options if you can’t or don’t want to serve XHTML. Right now, XHTML is a better choice if you expect a wide variety of clients to handle your data.
HTML can represent many common types of data: nested lists (tags like ul and li), key-value pairs (the dl tag and its children), and tabular data (the table tag and its children). It supports many different kinds of hypermedia. HTML does have its shortcomings: its hypermedia forms are limited, and won’t fully support HTTP’s uniform interface until HTML 5 is released.
HTML is also poor in semantic content. Its tag vocabulary is very computer-centric. It has special tags for representing computer code and output, but nothing for the other structured fruits of human endeavor, like poetry. One resource can link to another resource, and there are standard HTML attributes (rel and rev) for expressing the relationship between the linker and the linkee. But the HTML standard defines only 15 possible relationships between resources, including “alternate,” “stylesheet,” “next,” “prev,” and “glossary.” See http://www.w3.org/TR/html401/types.html#type-links for a complete list.
Since HTML pages are representations of resources, and resources can be anything, these 15 relationships barely scratch the surface. HTML might be called upon to represent the relationship between any two things. Of course, I can come up with my own values for rel and rev to supplement the official 15, but if everyone does that confusion will reign: we’ll all pick different values to represent the same relationships. If I link my web page to my wife’s web page, should I specify my relationship to her as husband, spouse, or sweetheart? To a human it doesn’t matter much, but to a computer program (the real client on the programmable web) it matters a lot. Similarly, HTML can easily represent a list, and there’s a standard HTML attribute (class) for expressing what kind of list it is. But HTML doesn’t say what kinds of lists there are.
This isn’t HTML’s fault, of course. HTML is supposed to be used by people who work in any field. But once you’ve chosen a field, everyone who works in that field should be able to agree on what kinds of lists there are, or what kinds of relationships can exist between resources. This is why people have started getting together and adding standard semantics to XHTML with microformats.
XHTML with Microformats
Media type: application/xhtml+xml
Microformats are lightweight standards that extend XHTML to give domain-specific semantics to HTML tags. Instead of reinventing data storage techniques like lists, microformats use existing HTML tags like ol, span, and abbr. The semantic content usually lives in custom values for the attributes of the tags, such as class, rel, and rev. Example 9-1 shows an example: someone’s home telephone number represented in the microformat known as hCard.
<span class="tel"> <span class="type">home</span>: <span class="value">+1.415.555.1212</span> </span>
Microformat adoption is growing, especially as more special-purpose devices get on the web. Any microformat document can be embedded in an XHTML page, because it is XHTML. A web service can serve an XHTML representation that contains microformat documents, along with links to other resources and forms for creating new ones. This document can be automatically parsed for its microformat data, or rendered for human consumption with a standard web browser.
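To make the "automatically parsed" claim concrete, here is a minimal Python sketch that pulls the type and value out of the hCard telephone fragment shown above. The helper names (has_class, extract_tel) are hypothetical, and the sketch assumes a well-formed fragment with no XML namespace. A full XHTML page would put its tags in the XHTML namespace, and the hCard specification allows variations this code ignores.

```python
import xml.etree.ElementTree as ET

# The hCard telephone fragment from the example above.
HCARD_TEL = (
    '<span class="tel"> <span class="type">home</span>: '
    '<span class="value">+1.415.555.1212</span> </span>'
)


def has_class(element, name):
    """True if name is one of the space-separated tokens in class."""
    return name in element.get("class", "").split()


def extract_tel(xhtml_fragment):
    """Return (type, value) for the first "tel" element, or None.

    Hypothetical helper: it handles only the simple shape shown above.
    """
    root = ET.fromstring(xhtml_fragment)
    for tel in root.iter("span"):  # iter() also visits the root itself
        if not has_class(tel, "tel"):
            continue
        fields = {}
        for child in tel.iter("span"):
            for key in ("type", "value"):
                if has_class(child, key):
                    fields[key] = (child.text or "").strip()
        return fields.get("type"), fields.get("value")
    return None


phone = extract_tel(HCARD_TEL)
```

The same document can still be rendered by a browser, because the only thing the parser relies on is the class attribute.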
As of the time of writing there were nine microformat specifications. The best-known is probably rel-nofollow, a standard value for the rel attribute invented by engineers at Google as a way of fighting comment spam on weblogs. Here’s a complete list of official microformats:
- hCalendar
A way of representing events on a calendar or planner. Based on the IETF iCalendar format.
- hCard
A way of representing contact information for people and organizations. Based on the vCard standard defined in RFC 2426.
- rel-license
A new value for the rel attribute, used when linking to the license terms for an XHTML document. For example:
<a href="http://creativecommons.org/licenses/by-nd/" rel="license"> Made available under a Creative Commons Attribution-NoDerivs license. </a>
That’s standard XHTML. The only thing the microformat does is define a meaning for the string license when it shows up in the rel attribute.
- rel-nofollow
A new value for the rel attribute, used when linking to URIs without necessarily endorsing them.
- rel-tag
A new value for the rel attribute, used to label a web page according to some external classification system.
- VoteLinks
A new value for the rev attribute, an extension of the idea behind rel-nofollow. VoteLinks lets you say how you feel about the resource you’re linking to by casting a “vote.” For instance:
<a rev="vote-for" href="http://www.example.com">The best webpage ever.</a>
<a rev="vote-against" href="http://example.com/"> A shameless ripoff of www.example.com</a>
- XFN
Stands for XHTML Friends Network. A new set of values for the rel attribute, for capturing the relationships between people. An XFN value for the rel attribute captures the relationship between this “person” resource and another such resource. To bring back the “Alice” and “Bob” resources from “Relationships Between Resources” in Chapter 8, an XHTML representation of Alice might include this link:
<a rel="spouse" href="Bob">Bob</a>
- XMDP
Stands for XHTML Meta Data Profiles. A way of describing your custom values for XHTML attributes, using the XHTML tags for definition lists: DL, DD, and DT. This is a kind of meta-microformat: a microformat like rel-tag could itself be described with an XMDP document.
- XOXO
Stands (sort of) for Extensible Open XHTML Outlines. Uses XHTML’s list tags to represent outlines. There’s nothing in XOXO that’s not already in the XHTML standard, but declaring a document (or a list in a document) to be XOXO signals that a list is an outline, not just a random list.
Those are the official microformat standards; they should give you an idea of what microformats are for. As of the time of writing there were also about 10 microformat drafts and more than 50 discussions about possible new microformats. Here are some of the more interesting drafts:
- geo
A way of marking up latitude and longitude on Earth. This would be useful in the mapping application I designed in Chapter 5. I didn’t use it there because there’s still a debate about how to represent latitude and longitude on other planetary bodies: extend geo or define different microformats for each body?
- hAtom
A way of representing in XHTML the data Atom represents in XML.
- hResume
A way of representing resumes.
- hReview
A way of representing reviews, such as product reviews or restaurant reviews.
- xFolk
A way of representing bookmarks. This would make an excellent representation format for the social bookmarking application in Chapter 7. I chose to use Atom instead because it was less code to show you.
You get the idea. The power of microformats is that they’re based on HTML, the most widely-deployed markup format in existence. Because they’re HTML, they can be embedded in web pages. Because they’re also XML, they can be embedded in XML documents. They can be understood at various levels by human beings, specialized microformat processors, dumb HTML processors, and even dumber XML processors.
Even if the microformats wiki shows no microformat standard or draft for your problem space, you might find an open discussion on the topic that helps you clarify your data structures. You can also create your own microformat (see “Ad Hoc XHTML” later in this chapter).
Atom
Media type: application/atom+xml
Atom is an XML vocabulary for describing lists of timestamped entries. The entries can be anything, but they usually contain pieces of human-authored text like you’d see on a weblog or a news site. Why should you use an Atom list instead of a regular XHTML list? Because Atom provides special tags for conveying the semantics of publishing: authors, contributors, languages, copyright information, titles, categories, and so on. (Of course, as I mentioned earlier, there’s a microformat called hAtom that brings all of these semantics into XHTML.) Atom is a useful XML vocabulary because so many web services are, in the broad sense, ways of publishing information. What’s more, there are a lot of web service clients that understand the semantics of Atom documents. If your web service is addressable and your resources expose Atom representations, you’ve immediately got a huge audience.
Atom lists are called feeds, and the items in the lists are called entries.
Tip
Some feeds are written in some version of RSS, a different XML vocabulary with similar semantics. All versions of RSS have the same basic structure as Atom: a feed that contains a number of entries. There are a number of variants of RSS but you shouldn’t have to worry about it at all. Today, every major tool for consuming feeds understands Atom.
These days, most weblogs and news sites expose a special resource whose representation is an Atom feed. The entries in the feed describe and link to other resources: weblog entries or news stories published on the site. You, the client, can consume these resources with a feed reader or some other external program. In Chapter 7, I represented lists of bookmarks as Atom feeds. Example 9-2 shows a simple Atom feed document.
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>RESTful News</title>
  <link rel="alternate" href="http://example.com/RestfulNews" />
  <updated>2007-04-14T20:00:39Z</updated>
  <author><name>Leonard Richardson</name></author>
  <contributor><name>Sam Ruby</name></contributor>
  <id>urn:1c6627a0-8e3f-0129-b1a6-003065546f18</id>
  <entry>
    <title>New Resource Will Respond to PUT, City Says</title>
    <link rel="edit" href="http://example.com/RestfulNews/104" />
    <id>urn:239b2f40-8e3f-0129-b1a6-003065546f18</id>
    <updated>2007-04-14T20:00:39Z</updated>
    <summary>
      After long negotiations, city officials say the new resource
      being built in the town square will respond to PUT. Earlier
      criticism of the proposal focused on the city's plan to modify the
      resource through overloaded POST.
    </summary>
    <category scheme="http://www.example.com/categories/RestfulNews"
              term="local" label="Local news" />
  </entry>
</feed>
In that example you can see some of the tags that convey the semantics of publishing: author, title, link, summary, updated, and so on. The feed as a whole is a joint project: it has an author tag and a contributor tag. It’s also got a link tag that points to an alternate URI for the underlying “feed” resource: the news site. The single entry has no author tag, so it inherits author information from the feed. The entry does have its own link tag, which points to http://example.com/RestfulNews/104. That URI identifies the entry as a resource in its own right. The entry also has a textual summary of the story. To get the remainder, the client must presumably GET the entry’s URI.
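The inheritance rule described above (an entry with no author tag takes the feed's author) is easy to get wrong in client code, so here is a minimal Python sketch of it. It uses a trimmed copy of the feed, and the function name entry_summaries is hypothetical. A real client would more likely use a dedicated feed-parsing library, and this sketch ignores most of what Atom can express.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# A trimmed copy of the feed shown above, as bytes.
FEED = b"""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>RESTful News</title>
  <author><name>Leonard Richardson</name></author>
  <entry>
    <title>New Resource Will Respond to PUT, City Says</title>
    <link rel="edit" href="http://example.com/RestfulNews/104" />
  </entry>
</feed>"""


def entry_summaries(feed_xml):
    """Return one dict per entry: title, author, and link href."""
    feed = ET.fromstring(feed_xml)
    feed_author = feed.findtext(f"{ATOM}author/{ATOM}name")
    results = []
    for entry in feed.findall(f"{ATOM}entry"):
        # An entry without its own author inherits the feed's author.
        author = entry.findtext(f"{ATOM}author/{ATOM}name") or feed_author
        link = entry.find(f"{ATOM}link")
        results.append({
            "title": entry.findtext(f"{ATOM}title"),
            "author": author,
            "href": link.get("href") if link is not None else None,
        })
    return results


entries = entry_summaries(FEED)
```

Every tag lookup carries the Atom namespace, because an XML parser sees the tags as namespaced names and a bare "entry" would match nothing.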
An Atom document is basically a directory of published resources. You can use Atom to represent photo galleries, albums of music (maybe a link to the cover art plus one to each track on the album), or lists of search results. Or you can omit the link tags and use Atom as a container for original content like status reports or incoming emails. Remember: the two reasons to use Atom are that it represents the semantics of publishing, and that a lot of existing clients can consume it.
If your application almost fits in with the Atom schema, but needs an extra tag or two, there’s no problem. You can embed XML tags from other namespaces in an Atom feed. You can even define a custom namespace and embed its tags in your Atom feeds. This is the Atom equivalent of XHTML microformats: your Atom feeds can use conventions not defined in Atom, without becoming invalid. Clients that don’t understand your tag will see a normal Atom feed with some extra mysterious data in it.
OpenSearch
OpenSearch is one XML vocabulary that’s commonly embedded in Atom documents. It’s designed for representing lists of search results. The idea is that a service returns the results of a query as an Atom feed, with the individual results represented as Atom entries. But some aspects of a list of search results can’t be represented in a stock Atom feed: the total number of results, for instance. So OpenSearch defines three new elements, in the opensearch namespace:[28]
- totalResults
The total number of results for the query.
- itemsPerPage
How many items are returned in a single “page” of search results.
- startIndex
If all the search results are numbered from zero to totalResults, then the first result in this feed document is entry number startIndex. When combined with itemsPerPage you can use this to figure out what “page” of results you’re on.
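The paging arithmetic can be sketched in a few lines of Python. The function names are hypothetical, and the sketch follows the zero-based numbering described above. A service that numbers its results from one would need to subtract one from the start index first.

```python
def page_number(start_index, items_per_page):
    """Zero-based page that the result at start_index falls on."""
    if items_per_page <= 0:
        raise ValueError("items_per_page must be positive")
    return start_index // items_per_page


def total_pages(total_results, items_per_page):
    """How many pages are needed to show every result."""
    if items_per_page <= 0:
        raise ValueError("items_per_page must be positive")
    # Ceiling division: a partly filled last page still counts.
    return -(-total_results // items_per_page)


current = page_number(start_index=20, items_per_page=10)
pages = total_pages(total_results=95, items_per_page=10)
```

With 95 results at 10 per page, the result numbered 20 opens the third page (page 2, counting from zero) of ten.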
SVG
Media type: image/svg+xml
Most graphic formats are just ways of laying pixels out on the screen. The underlying content is opaque to a computer: it takes a skilled human to modify a graphic or reuse part of one in another. Scalable Vector Graphics is an XML vocabulary that makes it possible for programs to understand and manipulate graphics. It describes graphics in terms of primitives like shapes, text, colors, and effects.
It would be a waste of time to represent a photograph in SVG, but using it to represent a graph, a diagram, or a set of relationships gives a lot of power to the client. SVG images can be scaled to arbitrary size without losing any detail. SVG diagrams can be edited or rearranged, and bits of them can be seamlessly snipped out and incorporated into other graphics. In short, SVG makes graphic documents work like other sorts of documents. Web browsers are starting to get support for SVG: newer versions of Firefox support it natively.
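Because SVG is XML, a program can edit a graphic the same way it edits any other document. The following Python sketch recolors one shape in a tiny, made-up diagram. The diagram, the id values, and the recolor function are all hypothetical illustrations.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Serialize SVG elements without a generated namespace prefix.
ET.register_namespace("", SVG_NS)

# A made-up two-node diagram.
DIAGRAM = f"""<svg xmlns="{SVG_NS}" width="100" height="100">
  <circle id="node-a" cx="30" cy="50" r="20" fill="red" />
  <circle id="node-b" cx="70" cy="50" r="20" fill="red" />
</svg>"""


def recolor(svg_text, element_id, new_fill):
    """Return the SVG text with one circle's fill color changed."""
    root = ET.fromstring(svg_text)
    for shape in root.iter(f"{{{SVG_NS}}}circle"):
        if shape.get("id") == element_id:
            shape.set("fill", new_fill)
    return ET.tostring(root, encoding="unicode")


edited = recolor(DIAGRAM, "node-b", "blue")
```

Doing the same thing to a bitmap would mean guessing which pixels belong to which shape.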
Form-Encoded Key-Value Pairs
Media type: application/x-www-form-urlencoded
I covered this simple format in Chapter 6. This format is mainly used in representations the client sends to the server. A filled-out HTML form is represented in this format by default, and it’s an easy format for an Ajax application to construct. But a service can also use this format in the representations it sends. If you’re thinking of serving comma-separated values or RFC 822-style key-value pairs, try form-encoded values instead. Form-encoding takes care of the tricky cases, and your clients are more likely to have a library that can decode the document.
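As an illustration of the "tricky cases," here is a small Python sketch using the standard library's urllib.parse module. The sample keys and values are made up. They were chosen because they contain an ampersand, a comma, a space, and a non-ASCII character, all of which would need special handling in a homegrown format.

```python
from urllib.parse import urlencode, parse_qs

# Made-up sample data with characters that need escaping.
pairs = {"title": "R&D notes", "tags": "rest, http", "author": "Zo\u00eb"}

# Serialize to application/x-www-form-urlencoded.
encoded = urlencode(pairs)

# Parse it back. parse_qs returns a list of values for each key,
# because a form-encoded document may repeat a key.
decoded = parse_qs(encoded)
```

The encoder escapes the ampersand inside "R&D notes" so that it can't be mistaken for the separator between pairs.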
JSON
Media type: application/json
JavaScript Object Notation is a serialization format for general data structures. It’s much more lightweight and readable than an equivalent XML document, so I recommend it for most cases when you’re transporting a serialized data structure rather than a hypermedia document.
I introduced JSON in “JSON Parsers: Handling Serialized Data” in Chapter 2, and showed a simple JSON document in Example 2-11. Example 9-3 shows a more complex JSON document: a hash of lists.
As I show in Chapter 11, JSON has special advantages when it comes to Ajax applications. It’s useful for any kind of application, though. If your data structures are more complex than key-value pairs, or you’re thinking of defining an ad hoc XML format, you might find it easier to define a JSON structure of nested hashes and arrays.
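For a sense of what a "hash of lists" looks like on the wire, here is a small Python sketch using the standard json module. The data is made up for illustration and is not one of the book's numbered examples.

```python
import json

# A made-up hash of lists.
data = {
    "planets": ["Mercury", "Venus", "Earth"],
    "moons": ["Luna", "Phobos"],
}

# Serialize to a JSON document. sort_keys makes the output stable.
text = json.dumps(data, sort_keys=True)

# Parse it back into native hashes and lists.
round_tripped = json.loads(text)
```

The round trip yields the same structure, with no schema or custom parser required.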
RDF and RDFa
The Resource Description Framework is a way of representing knowledge about resources. Resource here means the same thing as in Resource-Oriented Architecture: a resource is anything important enough to have a URI. In RDF, though, the URIs might not be http: URIs. Abstract URI schemes like isbn: (for books) and urn: (for just about anything) are common.
Example 9-4 is a simple RDF assertion, which claims that the title of this book is RESTful Web Services.
<span about="isbn:9780596529260" property="dc:title"> RESTful Web Services </span>
There are three parts to an RDF assertion, or triple, as they’re called. There’s the subject, a resource identifier: in this case, isbn:9780596529260. There’s the predicate, which identifies a property of the resource: in this case, dc:title. Finally there’s the object, which is the value of the property: in this case, “RESTful Web Services.” The assertion as a whole reads: “The book with ISBN 9780596529260 has a title of ‘RESTful Web Services.’”
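The subject-predicate-object structure maps directly onto a three-element tuple. This Python sketch pulls the triple out of the fragment in Example 9-4. The Triple type and the triple_from_rdfa function are hypothetical names, and the code handles only this one simple shape, not the full RDFa syntax.

```python
import xml.etree.ElementTree as ET
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# The RDFa fragment from Example 9-4.
RDFA = (
    '<span about="isbn:9780596529260" property="dc:title"> '
    'RESTful Web Services </span>'
)


def triple_from_rdfa(fragment):
    """Read one triple from a single element with about and property."""
    element = ET.fromstring(fragment)
    return Triple(
        subject=element.get("about"),
        predicate=element.get("property"),
        object=(element.text or "").strip(),
    )


assertion = triple_from_rdfa(RDFA)
```

A real RDF processor would also resolve the dc: prefix to the Dublin Core namespace. This sketch leaves the predicate as the literal string found in the attribute.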
I didn’t make up the isbn: URI space: it’s a standard way of addressing books as resources. I didn’t make up the dc:title predicate, either. That comes from the Dublin Core Metadata Initiative. DCMI defines a set of useful predicates that apply to published works like books and weblogs. An automated client that understands the Dublin Core can scan RDF documents that use those terms, evaluate the assertions they contain, and even make logical deductions about the data.
Example 9-4 looks a lot like an XHTML snippet, because that’s what it is. There are a couple of ways of representing RDF assertions, and I’ve chosen to show you RDFa, a microformat-like standard for embedding RDF in XHTML. RDF/XML is a more popular RDF representation format, but I think it makes RDF look more complicated than it is, and it’s difficult to integrate RDF/XML documents into the web. RDFa documents can go into XHTML files, just like microformat documents. However, since RDFa takes some ideas from the unreleased XHTML 2 standard, a document that includes it won’t be valid XHTML for a while. A third way of representing RDF assertions is eRDF, which results in valid XHTML.
RDF in its generic form is the basis for the W3C’s Semantic Web project. On the human web, there are no standards for how we talk about the resources we link to. We describe resources in human language that’s difficult or impossible for machines to understand. RDF is a way of constraining human speech so that we talk about resources using a standard vocabulary—not one that machines “understand” natively, but one they can be programmed to understand. A computer program doesn’t understand the Dublin Core’s “dc:title” any more than it understands “title.” But if everyone agrees to use “dc:title,” we can program standard clients to reason about the Dublin Core in consistent ways.
Here’s the thing: I think microformats do a good job of adding semantics to the web we already have, and they add less complexity than RDF’s general subject-predicate-object form. I recommend using RDF only when you want interoperability with existing RDF processors, or are treating RDF as a general-purpose microformat for representing assertions about resources.
One very popular use of RDF is FOAF, a way of representing information about human beings and the relationships between them.
Framework-Specific Serialization Formats
Media type: application/xml
I’m talking here about informal XML vocabularies used by frameworks like Ruby’s ActiveRecord and Python’s Django to serialize database objects as XML. I gave an example back in Example 7-4. It’s a simple data structure: a hash or a list of hashes.
These representation formats are very convenient if you happen to be writing a service that gives you access to one. In Rails, you can just call to_xml on an ActiveRecord object or a list of such objects. The Rails serialization format is also useful if you’re not using Rails, but you want your service to be usable by ActiveResource clients. Otherwise, I don’t really recommend these formats, unless you’re just trying to get something up and running quickly (as I am in Chapters 7 and 12). The major downside of these formats is that they look like documents, but they’re really just serialized data structures. They never contain hypermedia links or forms.
Ad Hoc XHTML
Media type: application/xhtml+xml
If none of the work that’s already been done fits your problem space... well, first, think again. Just as you should think again before deciding you can’t fit your resources into HTTP’s uniform interface. If you think your resources can’t be represented by stock HTML or Atom or RDF or JSON, there’s a good chance you haven’t looked at the problem in the right way.
But it’s quite possible that your resources won’t fit any of the representation formats I’ve mentioned so far. Or maybe you can represent most of your resource state with XHTML plus some well-chosen microformats, but there’s still something missing. The next step is to consider creating your own microformat.
The high-impact way of creating a microformat is to go through the microformat process, hammer it out with other microformat enthusiasts, and get it published as an official microformat. This is most appropriate when lots of people are trying to represent the same kind of data. Ideally, you’re in a situation where the human web is littered with ad hoc HTML representations of the data, and where there are already a couple of big standards that can serve as a model for a more agile microformat. This is how the hCard and hCalendar microformats were developed. There were many people trying to put contact information and upcoming events on the human web, and preexisting standards (vCard and iCalendar) to steal ideas from. The representation of “places on a map” that I devised in Chapter 5 might be a starting point for an official microformat. There are lots of mapping sites on the human web, and lots of heavyweight standards for representing GIS data. If I wanted to build a microformat, I’d have a lot of ideas to work from.
The low-impact way of creating a microformat is to add semantic content to the XHTML you were going to write anyway. This is suitable for representation formats that no one else is likely to use, or as a starting point so you can get a real web service running while you’re going through the microformat process. The representation of the list of planets from Chapter 5 works better as an ad hoc set of semantics than as an official microformat. All it’s doing is saying that one particular list is a list of planets.
The microformat design patterns and naming principles give a set of sensible general rules for adding semantics to HTML. Their advice is useful even if you’re not trying to create an official microformat. The semantics you choose for your “micromicroformat” won’t be standardized, but you can present them in a standard way: the way microformats do it. Here are some of the more useful patterns.
- If there’s an HTML tag that conveys the semantics you want, use it. To represent a set of key-value pairs, use the dl tag. To represent a list, use one of the list tags. If nothing fits, use the span or div tag.
- Give a tag additional semantics by specifying its class attribute. This is especially important for span and div, which have no real meaning on their own.
- Use the rel attribute in a link to specify another resource’s relationship to this one. Use the rev attribute to specify this page’s relationship to another one. If the relationship is symmetric, use rel. See “Hypermedia Technologies” later in this chapter for more on this.
- Consider providing an XMDP file that describes your custom values for class, rel, and rev.
Other XML Standards and Ad Hoc Vocabularies
Media type: application/xml
In addition to XHTML, Atom, and SVG, there are a lot of specialized XML vocabularies I haven’t covered: MathML, OpenDocument, Chemical Markup Language, and so on. There are also specialized vocabularies you can use in RDF assertions, like Dublin Core and FOAF. A web service might serve any of these vocabularies as standalone representations, embed them into Atom feeds, or even wrap them in SOAP envelopes. If none of these work for you, you can define a custom XML vocabulary to represent your resource state, or maybe the parts that Atom doesn’t cover.
Although I’ve presented this as the last resort, that’s certainly not the common view. People come up with custom XML vocabularies all the time: that’s how there got to be so many of them. Almost every real web service mentioned in this book exposes its representations in a custom XML vocabulary. Amazon S3, Yahoo!’s search APIs, and the del.icio.us API all serve representations that use custom XML vocabularies, even though they could easily serve Atom or XHTML and reuse an existing vocabulary.
Part of this is tech culture. The microformats idea is fairly new, and a custom XML vocabulary still looks more “official.” But this is an illusion. Unless you provide a schema definition for your vocabulary, your custom tags have exactly the same status as a custom value for the HTML “class” attribute. Even a definition does nothing but codify the vocabulary you made up: it doesn’t confer any legitimacy. Legitimacy can only come “from the consent of the governed”: from other people adopting your vocabulary.
That said, there is a space for custom XML vocabularies. It’s usually easy to use XHTML instead of creating your own XML tags, but it’s not so easy when you need tags with a lot of custom attributes. In that situation, a custom XML vocabulary makes sense. All I ask is that you seriously think about whether you really need to define a new XML vocabulary for a given problem. It’s possible that in the future, people will err in the opposite direction, and create ad hoc microformats when they shouldn’t. Then I’ll urge caution before creating a microformat. But right now, the problem is too many ad hoc XML vocabularies.
Encoding Issues
It’s a global world (I actually heard someone say that once), and any service you expose must deal with the products of people who speak different languages from you and use different writing systems. You don’t have to understand all of these languages, but to handle multilingual data without mangling it, you do need to know something about character encodings: the conventions that let us represent human-readable text as strings of bytes.
Every text file you’ve ever created has some character encoding, even though you probably never made a decision about which encoding to use (it’s usually a system property). In the United States the encoding is usually UTF-8, US-ASCII, or Windows-1252. In western Europe it might also be ISO 8859-1. The default for HTML on the web is ISO 8859-1, which is almost but not quite the same as Windows-1252. Japanese documents are commonly encoded with EUC-JP, Shift_JIS, or UTF-8. If you’re curious about what character encodings are used in different places, most web browsers list the encodings they understand. My web browser supports five different encodings for simplified Chinese, five for Hebrew, nine for the Cyrillic alphabet, and so on. Most of these encodings are mutually incompatible, even when they encode the same language. It’s insane!
Fortunately there is a way out of this confusion. We as a species have come up with Unicode, a way of representing every human writing system. Unicode isn’t a character encoding, but there are two good encodings for it: UTF-8 (more efficient for alphabetic languages like English) and UTF-16 (more efficient for logographic languages like Japanese). Either of these encodings can handle text written in any combination of human languages. The best single decision you can make when handling multilingual data is to keep all of your data in one of these encodings: probably UTF-8 unless you live or do a lot of business in east Asia, then maybe UTF-16 with a byte-order mark.
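The efficiency trade-off is easy to measure. This Python sketch counts the bytes each encoding needs for a short English word and a short Japanese word. The function name encoded_sizes is hypothetical, and the byte-order mark is left out of the UTF-16 count so that only the characters themselves are compared.

```python
def encoded_sizes(text):
    """Return the number of bytes text occupies in each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" omits the two-byte byte-order mark.
        "utf-16": len(text.encode("utf-16-le")),
    }


english = encoded_sizes("hello")
# Three kanji: the Japanese word for "Japanese language."
japanese = encoded_sizes("\u65e5\u672c\u8a9e")
```

The five-letter English word takes 5 bytes in UTF-8 and 10 in UTF-16. The three-character Japanese word takes 9 bytes in UTF-8 and 6 in UTF-16.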
This might be as simple as making a decision when you start the project, or you may have to convert an existing database. You might have to install an encoding converter to work on incoming data, or write encoding detection code. (The Universal Encoding Detector is an excellent autodetection library for Python. It’s got a Ruby port, available as the chardet gem.) It might be easy or difficult. But once you’re keeping all of this data in one of the Unicode encodings, most of your problems will be over. When your clients send you data in a weird encoding, you’ll be able to convert it to your chosen UTF-* encoding. If they send data that specifies no encoding at all, you’ll be able to guess its encoding and convert it, or reject it as unintelligible.
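The convert-or-reject step might look something like this in Python. This is a minimal sketch with a hypothetical function name, to_utf8, and it assumes the client has told you which encoding it used. Guessing an undeclared encoding is a separate problem that needs a detection library, and is not shown here.

```python
def to_utf8(raw_bytes, declared_encoding):
    """Convert incoming bytes to UTF-8, or raise ValueError.

    Rejects the data if the encoding name is unknown or the bytes
    are not valid in that encoding.
    """
    try:
        text = raw_bytes.decode(declared_encoding)
    except (LookupError, UnicodeDecodeError) as exc:
        raise ValueError(
            f"cannot decode input as {declared_encoding!r}"
        ) from exc
    return text.encode("utf-8")


# "cafe" with an acute accent on the e, as a Windows-1252 client sends it.
converted = to_utf8(b"caf\xe9", "windows-1252")
```

Decoding to a native string first and then encoding again means every stored document passes through one well-defined conversion, whatever the client sent.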
The other half of the equation is communicating with your clients: how do you tell them which encoding you’re using in your outgoing representations? Well, XML lets you specify a character encoding on the very first line:
<?xml version="1.0" encoding="UTF-8"?>
All but one of my recommended representation formats is based on XML, so that solves most of the problem. But there is an encoding problem with that one outlier, and there’s a further problem in the relationship between XML and HTTP.
XML and HTTP: Battle of the encodings
An XML document can and should define a character encoding in its first line, so that the client will know how to interpret the document. An HTTP response can and should specify a value for the Content-Type response header, so that the client knows it’s being given an XML document and not some other kind. But the Content-Type can also specify a document character encoding with “charset,” and this encoding might conflict with what it actually says in the document.
Content-Type: application/xml; charset="ebcdic-fr-297+euro"

<?xml version="1.0" encoding="UTF-8"?>
Who wins? Surprisingly, HTTP’s character encoding takes precedence over the encoding in the document itself.[29] If the document says “UTF-8” and Content-Type says “ebcdic-fr-297+euro,” then extended French EBCDIC it is. Almost no one expects this kind of surprise, and most programmers write code first and check the RFCs later. The result is that the character encoding, as specified in Content-Type, tends to be unreliable. Some servers claim everything they serve is UTF-8, even though the actual documents say otherwise.
When serving XML documents, I don’t recommend going out of your way to send a character encoding as part of Content-Type. You can do it if you’re absolutely sure you’ve got the right encoding, but it won’t do much good. What’s really important is that you specify a document encoding. (Technically you can do without a document encoding if you’re using UTF-8, or UTF-16 with a byte-order mark. But if you have that much control over the data, you should be able to specify a document encoding.) If you’re writing a web service client, be aware that any character encoding specified in Content-Type may be incorrect. Use common sense to decide which encoding declaration to believe, rather than relying on a counterintuitive rule from an RFC a lot of people haven’t read.
Another note: when you serve XML documents, you should serve them with a media type of application/xml, not text/xml. If you serve a document as text/xml with no charset, the correct client behavior is to totally ignore the encoding specified in the XML document and interpret the XML document as US-ASCII.[30] Avoid these complications altogether by always serving XML as application/xml, and always specifying an encoding in the first line of the XML documents you generate.
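To make the precedence rules concrete, here is a small sketch of what a strictly rule-following client would conclude about an XML response. The function name `effective_encoding` is hypothetical, and the code makes simplifying assumptions: it ignores byte-order marks, it only finds an XML declaration written in an ASCII-compatible encoding, and it falls back to UTF-8 when nothing is declared.

```python
import re
from email.message import Message

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the document, e.g. <?xml version="1.0" encoding="UTF-8"?>
_XML_DECL = re.compile(rb"""^<\?xml[^>]*?encoding=["']([A-Za-z][A-Za-z0-9._-]*)["']""")


def effective_encoding(content_type, body):
    """Encoding a by-the-book client would use for an XML response."""
    header = Message()
    header["Content-Type"] = content_type
    charset = header.get_param("charset")
    if charset:
        return str(charset).lower()   # HTTP's charset beats the document
    if header.get_content_type() == "text/xml":
        return "us-ascii"             # the document's declaration is ignored
    match = _XML_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"                    # simplification: no BOM sniffing
```

Feeding it the conflicting response shown earlier yields "ebcdic-fr-297+euro", not "utf-8". A pragmatic client might instead treat a disagreement between the two declarations as a signal to inspect the bytes before trusting either one.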
The character encoding of a JSON document
I didn’t mention plain text in my list of recommended representation formats, mostly because plain text is not a structured format, but also because the lack of structure means there’s no way to specify the character encoding of “plain text.” JSON is a way of structuring plain text, but it doesn’t solve the character encoding problem. Fortunately, you don’t have to solve it yourself: just follow the standard convention.
RFC 4627 states that a JSON file must contain Unicode characters, encoded in one of the UTF-* encodings. Practically, this means either UTF-8, or UTF-16 with a byte-order mark. Plain US-ASCII will also work, since ASCII text happens to be valid UTF-8. Given this restriction, a client can determine the character encoding of a JSON document by looking at the first four bytes (the details are in RFC 4627), and there’s no need to specify an explicit encoding. You should follow this convention whenever you serve plain text, not just JSON.
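The four-byte check is short enough to sketch. RFC 4627 relies on the fact that the first two characters of a JSON text are always ASCII, so the positions of the zero bytes reveal the encoding. The function name `json_encoding` is illustrative, and the sketch assumes the document does not begin with a byte-order mark.

```python
def json_encoding(data):
    """Guess the Unicode encoding of a JSON document from its first four bytes."""
    head = data[:4]
    if len(head) < 4:
        return "utf-8"                 # too short to be UTF-16 or UTF-32
    nulls = tuple(byte == 0 for byte in head)
    patterns = {
        (True, True, True, False): "utf-32-be",    # 00 00 00 xx
        (True, False, True, False): "utf-16-be",   # 00 xx 00 xx
        (False, True, True, True): "utf-32-le",    # xx 00 00 00
        (False, True, False, True): "utf-16-le",   # xx 00 xx 00
    }
    return patterns.get(nulls, "utf-8")            # anything else is UTF-8
```

A client can then decode with `data.decode(json_encoding(data))` before handing the text to a JSON parser.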
Prepackaged Control Flows
Not only does HTTP have a uniform interface, it has a standard set of response codes—possible ways a request can turn out. Though resources can be anything at all, they usually fall into a few broad categories: database tables and their rows, publications and the articles they publish, and so on. When you know what sort of resource a service exposes, you can often anticipate the possible responses to an HTTP request without knowing too much about the resource.
In one sense the standard HTTP response codes (see Appendix B) are just a suggested control flow: a set of instructions about what to do when you get certain kinds of requests. But that’s pretty vague advice, and we can do better. Here I present several prepackaged control flows: patterns that bring together advice about resource design, representation formats, and response codes to help you design real-world services.
General Rules
These snippets of control flow can be applied to almost any service. I can make very general statements about them because they have nothing to do with the actual nature of your resources. All I’m doing here is picking out a few important HTTP status codes and telling you when to use them.
You should be able to implement these rules as common code that runs before your normal request handling. In Example 7-11 I implemented most of them as Rails filters that run before certain actions, or as Ruby methods that short-circuit a request unless a certain condition is met.
If the client tries to do something without providing the correct authorization, send a response code of 401 (“Unauthorized”) along with instructions for correctly formatting the Authorization header.
If the client tries to access a URI that doesn’t correspond to any existing resource, send a response code of 404 (“Not Found”). The only possible exception is when the client is trying to PUT a new resource to that URI.
If the client tries to use a part of the uniform interface that a resource doesn’t support, send a response code of 405 (“Method Not Allowed”). This is the proper response when the client tries to DELETE a read-only resource.
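As a rough sketch, the three rules above can be collected into one check that runs before normal request handling. Everything here is hypothetical scaffolding: `general_rules` is not a framework hook, and `resources` is just a dictionary mapping each known URI to the set of methods it supports.

```python
def general_rules(method, uri, authorized, resources):
    """Return a short-circuit status code, or None to continue normal handling.

    `resources` maps each known URI to the set of HTTP methods it supports.
    """
    if not authorized:
        return 401   # the response should also explain the Authorization header
    if uri not in resources:
        if method == "PUT":
            return None   # possible exception: PUT may create the resource
        return 404
    if method not in resources[uri]:
        return 405
    return None
```

With `resources = {"/reports/2007": {"GET", "HEAD"}}`, a DELETE to `/reports/2007` is answered with 405 and a GET to `/reports/2008` with 404, before any resource-specific code runs.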
Database-Backed Control Flow
In many web services there’s a strong connection between a resource and something in a SQL database: a row in the database, a table, or the database as a whole. These services are so common that entire frameworks like Rails are oriented to making them easy to write. Since these services are similar in design, it makes sense that their control flows should also be similar.
For instance, if an incoming request contains a nonsensical representation, the proper response is almost certainly 415 (“Unsupported Media Type”) or 400 (“Bad Request”). It’s up to the application to decide which representations make sense, but the HTTP standard is pretty strict about the possible responses to “nonsensical representation.”
With this in mind, I’ve devised a standard control flow for the uniform interface in a database-backed application. It runs on top of the general rules I mentioned in the previous section. I used this control flow in the controller code throughout Chapter 7. Indeed, if you look at the code in that chapter you’ll see that I implemented the same ideas multiple times. There’s space in the REST ecosystem for a higher-level framework that implements this control flow, or some improved version of it.
GET
If the resource can be identified, send a representation along with a response code of 200 (“OK”). Be sure to support conditional GET!
PUT
If the resource already exists, parse the representation and turn it into a series of changes to the state of this resource. If the changes would leave the resource in an incomplete or inconsistent state, send a response code of 400 (“Bad Request”).
If the changes would cause the resource state to conflict with some other resource, send a response code of 409 (“Conflict”). My social bookmarking service sends a response code of 409 if you try to change your username to a name that’s already taken.
If there are no problems with the proposed changes, apply them to the existing resource. If the changes in resource state mean that the resource is now available at a different URI, send a response code of 301 (“Moved Permanently”) and include the new URI in the Location header. Otherwise, send a response code of 200 (“OK”). Requests to the old URI should now result in a response code of 301 (“Moved Permanently”), 404 (“Not Found”), or 410 (“Gone”).
There are two ways to handle a PUT request to a URI that doesn’t correspond to any resource. You can return a status code of 404 (“Not Found”), or you can create a resource at that URI. If you want to create a new resource, parse the representation and use it to form the initial resource state. Send a response code of 201 (“Created”). If there’s not enough information to create a new resource, send a response code of 400 (“Bad Request”).
POST for creating a new resource
Parse the representation, pick an appropriate URI, and create a new resource there. Send a response code of 201 (“Created”) and include the URI of the new resource in the Location header. If there’s not enough information provided to create the resource, send a response code of 400 (“Bad Request”). If the provided resource state would conflict with some existing resource, send a response code of 409 (“Conflict”), and include a Location header that points to the problematic resource.
POST for appending to a resource
Parse the representation. If it doesn’t make sense, send a response code of 400 (“Bad Request”). Otherwise, modify the resource state so that it incorporates the information in the representation. Send a response code of 200 (“OK”).
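The PUT and POST-to-create flows above can be sketched with a dictionary standing in for the database table. This is an illustration under stated assumptions, not the controller code from Chapter 7: the `users` store, the `/users/{name}` URI scheme, and the two required fields are all made up for the example, and the sketch chooses to create a resource on PUT to an unknown URI.

```python
users = {}  # URI -> stored state; stands in for a database table


def uri_for(name):
    return "/users/" + name


def put_user(uri, representation):
    """Apply the PUT rules; return (status, headers)."""
    name = representation.get("name")
    email = representation.get("email")
    if not name or not email:
        return 400, {}                      # incomplete resource state
    new_uri = uri_for(name)
    if new_uri != uri and new_uri in users:
        return 409, {}                      # the new name is already taken
    if uri not in users:
        if new_uri != uri:
            return 400, {}                  # representation describes another URI
        users[uri] = {"name": name, "email": email}
        return 201, {}                      # PUT created a new resource
    del users[uri]
    users[new_uri] = {"name": name, "email": email}
    if new_uri != uri:
        return 301, {"Location": new_uri}   # the resource moved
    return 200, {}


def post_user(representation):
    """Apply the 'POST for creating a new resource' rules."""
    name = representation.get("name")
    email = representation.get("email")
    if not name or not email:
        return 400, {}
    uri = uri_for(name)
    if uri in users:
        return 409, {"Location": uri}       # point at the conflicting resource
    users[uri] = {"name": name, "email": email}
    return 201, {"Location": uri}
```

After a rename, the old URI is simply gone from the store, so in this sketch later requests to it would fall through to the general 404 rule.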
The Atom Publishing Protocol
Earlier I described Atom as an XML vocabulary that describes the semantics of publishing: authors, summaries, categories, and so on. The Atom Publishing Protocol (APP) defines a set of resources that capture the process of publishing: posting a story to a site, editing it, assigning it to a category, deleting it, and so on.
The obvious applications for the APP are those for Atom and online publishing in general: weblogs, photo albums, content management systems, and the like. The APP defines four kinds of resources, specifies some of their behavior under the uniform interface, and defines the representation documents they should accept and serve. It says nothing about URI design or what data should go into the documents: that’s up to the individual application.
The APP takes HTTP’s uniform interface and puts a higher-level uniform interface on top of it. Many kinds of applications can conform to the APP, and a generic APP client should be able to access all of them. Specific applications can extend the APP by exposing additional resources, or making the APP resources expose more of HTTP’s uniform interface, but they should all support the minimal features mentioned in the APP standard.
The ultimate end of the APP is to serve Atom documents to the end user. Of course, the Atom documents are just the representations of underlying resources. The APP defines what those resources are. It defines two resources that correspond to Atom documents, and two that help the client find and modify APP resources.
Collections
An APP collection is a resource whose representation is an Atom feed. The document in Example 9-2 has everything it takes to be a representation of an Atom collection. There’s no necessary difference between an Atom feed you subscribe to in your feed reader, and an Atom feed that you manipulate with an APP client. A collection is just a list or grouping of pieces of data: what the APP calls members. The APP is heavily oriented toward manipulating “collection” type resources.
The APP defines a collection’s response to GET and POST requests. GET returns a representation: the Atom feed. POST adds a new member to the collection, which (usually) shows up as a new entry in the feed. Maybe you can also DELETE a collection, or modify its settings with a PUT request. The APP doesn’t cover that part: it’s up to your application.
Members
An APP collection is a collection of members. A member corresponds roughly to an entry in an Atom feed: a weblog entry, a news article, or a bookmark. But a member can also be a picture, song, movie, or Word document: a binary format that can’t be represented in XML as part of an Atom document.
A client creates a member inside a collection by POSTing a representation of the member to the collection URI. This pattern should be familiar to you by now: the member is created as a subordinate resource of the collection. The server assigns the new member a URI. The response to the POST request has a response code of 201 (“Created”), and a Location header that lets the client know where to find the new resource.
Example 9-5 shows an Atom entry document: a representation of a member. This is the same sort of entry tag I showed you in Example 9-2, presented as a standalone XML document. POSTing this document to a collection creates a new member, which starts showing up as a child of the collection’s feed tag. A document like this one might be how the entry tag in Example 9-2 got where it is today.
<?xml version="1.0" encoding="utf-8"?>
<entry>
  <title>New Resource Will Respond to PUT, City Says</title>
  <summary>
    After long negotiations, city officials say the new resource
    being built in the town square will respond to PUT. Earlier
    criticism of the proposal focused on the city's plan to modify the
    resource through overloaded POST.
  </summary>
  <category scheme="http://www.example.com/categories/RestfulNews"
            term="local" label="Local news" />
</entry>
Service document
This vaguely-named type of resource is just a grouping of collections. A typical move is to serve a single service document, listing all of your collections, as your service’s “home page.” A service document is an XML document written using a particular vocabulary, and its media type is application/atomserv+xml (see Example 9-6).
Example 9-6 shows a representation of a typical service document. It describes three collections. One of them is a weblog called “RESTful News,” which accepts a POST request if the representation is an Atom entry document like the one in Example 9-5. The other two are personal photo albums, which accept a POST request if the representation is an image file.
<?xml version="1.0" encoding='utf-8'?>
<service xmlns="http://purl.org/atom/app#"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>Weblogs</atom:title>
    <collection href="http://www.example.com/RestfulNews">
      <atom:title>RESTful News</atom:title>
      <categories href="http://www.example.com/categories/RestfulNews" />
    </collection>
  </workspace>
  <workspace>
    <atom:title>Photo galleries</atom:title>
    <collection href="http://www.example.com/samruby/photos" >
      <atom:title>Sam's photos</atom:title>
      <accept>image/*</accept>
      <categories href="http://www.example.com/categories/samruby-photo" />
    </collection>
    <collection href="http://www.example.com/leonardr/photos" >
      <atom:title>Leonard's photos</atom:title>
      <accept>image/*</accept>
      <categories href="http://www.example.com/categories/leonardr-photo" />
    </collection>
  </workspace>
</service>
How do I know what kind of POST requests a collection will accept? From the accept tags. The accept tag works something like the HTTP Accept header, only in reverse. The Accept header is usually sent by the client with a GET request, to tell the server which representation formats the client understands. The accept tag is the APP server’s way of telling the client which incoming representations a collection will accept as part of a POST request that creates a new member.
My two photo gallery collections specify an accept of image/*. Those collections will only accept POST requests where the representation is an image. On the other hand, the RESTful News weblog doesn’t specify an accept tag at all. The APP default is to assume that a collection only accepts POST requests when the representation is an Atom entry document (like the one in Example 9-5). The accept tag defines what the collections are for: the weblog is for textual data, and the photo collections are for images.
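A generic client could read the accept tags with a few lines of standard-library XML parsing. The sketch below uses an abbreviated copy of the service document and the namespace it declares. The function name `accepted_types` is illustrative, and the `ATOM_ENTRY` string is simply a placeholder for the "Atom entry document" default.

```python
import xml.etree.ElementTree as ET

APP = "{http://purl.org/atom/app#}"             # namespace used in Example 9-6
ATOM_ENTRY = "application/atom+xml;type=entry"  # placeholder for the default


def accepted_types(service_xml):
    """Map each collection URI to the media types it accepts via POST."""
    root = ET.fromstring(service_xml)
    result = {}
    for collection in root.iter(APP + "collection"):
        accepts = [a.text.strip()
                   for a in collection.findall(APP + "accept") if a.text]
        # No accept tag at all means "Atom entry documents only".
        result[collection.get("href")] = accepts or [ATOM_ENTRY]
    return result


SERVICE = """<service xmlns="http://purl.org/atom/app#"
         xmlns:atom="http://www.w3.org/2005/Atom">
 <workspace>
  <atom:title>Weblogs</atom:title>
  <collection href="http://www.example.com/RestfulNews">
   <atom:title>RESTful News</atom:title>
  </collection>
 </workspace>
 <workspace>
  <atom:title>Photo galleries</atom:title>
  <collection href="http://www.example.com/leonardr/photos">
   <atom:title>Leonard's photos</atom:title>
   <accept>image/*</accept>
  </collection>
 </workspace>
</service>"""
```

Calling `accepted_types(SERVICE)` maps the weblog's URI to the Atom entry default and the photo gallery's URI to `["image/*"]`.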
The other important thing about a service document is the categories tag, which links to a “category document” resource. The category document says what categories are allowed.
The APP doesn’t say much about service documents. It specifies their representation format, and says that they must serve a representation in response to GET. It doesn’t specify how service documents get on the server in the first place. If you write an APP application you can hardcode your service documents in advance, or you can make it possible to create new ones by POSTing to some new resource not covered by the APP. You can expose them as static files, or you can make them respond to PUT and DELETE. It’s up to you.
Tip
As you can see from Example 9-6, a service document’s representation doesn’t just describe collections: it groups collections into workspaces. When I wrote that representation I put the weblog in a workspace of its own, and grouped the photo galleries into a second workspace. The APP standard devotes some time to workspaces, but I’m going to pass over them, because the APP doesn’t define workspaces as resources. They don’t have their own URIs, and they only exist as elements in the representation of a service document. You can expose workspaces as resources if you want. The APP doesn’t prohibit it, but it doesn’t tell you how to do it, either.
Category documents
APP members (which correspond to Atom entries) can be put into categories. In Chapter 7, I represented a bookmark’s tags with Atom categories. The Atom entry described in Example 9-5 put the entry into a category called “local.” Where did that category come from? Who says which categories exist for a given collection? This is the last big question the APP answers.
The Atom entry document in Example 9-5 gave its category a “scheme” of http://www.example.com/categories/RestfulNews. The representation of the RESTful News collection, in the service document, gave that same URI in its categories tag. That URI points to the final APP resource: a category document (see Example 9-7). A category document lists the category vocabulary for a particular APP collection. Its media type is application/atomcat+xml.
Example 9-7 shows a representation of the category document for the collection “RESTful News.” This category document defines three categories: “local,” “international,” and “lighterside,” which can be referenced in Atom entry elements like the one in Example 9-5.
<?xml version="1.0" ?>
<app:categories xmlns:app="http://purl.org/atom/app#"
                xmlns="http://www.w3.org/2005/Atom"
                scheme="http://www.example.com/categories/RestfulNews"
                fixed="no">
  <category term="local" label="Local news"/>
  <category term="international" label="International news"/>
  <category term="lighterside" label="The lighter side of REST"/>
</app:categories>
The scheme is not fixed, meaning that it’s OK to publish members to the collection even if they belong to categories not listed in this document. This document might be used in an end-user application to show a selectable list of categories for a new “RESTful News” story.
As with service documents, the APP defines the representation format for a category document, but says nothing about how category documents are created, modified, or destroyed. It only defines GET on the category document resource. Any other operations (like automatically modifying the category document when someone files an entry under a new category) are up to you to define.
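Here is a small sketch of how a client might use a category document to decide whether a term is acceptable. The function name `category_allowed` is hypothetical, and treating a missing `fixed` attribute as "no" is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

CATEGORIES = """<app:categories xmlns:app="http://purl.org/atom/app#"
    xmlns="http://www.w3.org/2005/Atom"
    scheme="http://www.example.com/categories/RestfulNews"
    fixed="no">
 <category term="local" label="Local news"/>
 <category term="international" label="International news"/>
 <category term="lighterside" label="The lighter side of REST"/>
</app:categories>"""


def category_allowed(category_xml, term):
    """True if a member may be filed under `term` according to the document."""
    root = ET.fromstring(category_xml)
    listed = {c.get("term") for c in root.findall("{%s}category" % ATOM_NS)}
    if term in listed:
        return True
    # An unlisted term is fine unless the vocabulary is declared fixed.
    return root.get("fixed", "no") != "yes"
```

With the document as written, any term is allowed. Change `fixed` to "yes" and only "local," "international," and "lighterside" pass the check.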
Binary documents as APP members
There’s one important wrinkle I’ve glossed over. It has to do with the “photo gallery” collections I described in Example 9-6. I said earlier that a client can create a new member in a photo gallery by POSTing an image file to the collection. But an image file can’t go into an Atom feed: it’s a binary document. What exactly happens when a client POSTs a binary document to an APP collection? What’s in those photo galleries, really?
Remember that a resource can have more than one representation. Each photo I upload to a photo collection has two representations. One representation is the binary photo, and the other is an XML document containing metadata. The XML document is an Atom entry, the same as the news item in Example 9-5, and that’s the data that shows up in the Atom feed.
Here’s an example. I POST a JPEG file to my “photo gallery” collection, like so:
POST /leonardr/photos HTTP/1.1
Host: www.example.com
Content-type: image/jpeg
Content-length: 62811
Slug: A picture of my guinea pig

[JPEG file goes here]
The Slug is a custom HTTP header defined by the APP, which lets me specify a title for the picture while uploading it. The slug can show up in several pieces of resource state, as you’ll see in a bit.
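For illustration, a client could assemble that upload with Python's standard `urllib.request` module. This sketch only builds the request object; the helper name `media_post_request` is made up, and actually sending the request would require a real APP server at the collection URI.

```python
import urllib.request


def media_post_request(collection_uri, data, media_type, slug):
    """Build (but do not send) a POST that creates a binary member."""
    return urllib.request.Request(
        collection_uri,
        data=data,                       # the raw bytes of the image
        method="POST",
        headers={"Content-Type": media_type, "Slug": slug},
    )
```

Passing the result to `urllib.request.urlopen` would send it, with the Content-Length header computed from the data.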
The HTTP response comes back as I described it in “Members” earlier in this chapter. The response code is 201 and the Location header gives me the URI of the newly created APP member.
201 Created
Location: http://www.example.com/leonardr/photos/my-guinea-pig.atom
But what’s at the other end of the URI? Not the JPEG file I uploaded, but an Atom entry document describing and linking to that file:
<?xml version="1.0" encoding="utf-8"?>
<entry>
  <title>A picture of my guinea pig</title>
  <updated>2007-01-24T11:52:29Z</updated>
  <id>urn:f1ef2e50-8ec8-0129-b1a7-003065546f18</id>
  <summary></summary>
  <link rel="edit-media" type="image/jpeg"
        href="http://www.example.com/leonardr/photos/my-guinea-pig.jpg" />
</entry>
The actual JPEG I uploaded is at the other end of that link. I can GET it, of course, and I can PUT to it to overwrite it with another image. My POST created a new “member” resource, and my JPEG is a representation of some of its resource state. But there’s also this other representation of resource state: the metadata. These other elements of resource state include:
The title, which I chose (the server decided to use my Slug as the title) and can change later.
The summary, which starts out blank but I can change.
The “last update” time, which I sort of chose but can’t change arbitrarily.
The URI to the image representation, which the server chose for me based on my Slug.
The unique ID, which the server chose without consulting me at all.
This metadata document can be included in an Atom feed: I’ll see it in the representation of the “photo gallery” collection. I can also modify this document and PUT it back to http://www.example.com/leonardr/photos/my-guinea-pig.atom to change the resource state. I can specify myself as the author, add categories, change the title, and so on. If I get tired of having this member in the collection, I can delete it by sending a DELETE request to either of its URIs.
That’s how the APP handles photos and other binary data as collection members. It splits the representation of the resource into two parts: the binary part that can’t go into an Atom feed and the metadata part that can. This works because the metadata of publishing (categories, summary, and so on) applies to photos and movies just as easily as to news articles and weblog entries.
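A client that receives the metadata entry needs to find its way to the binary half. A minimal sketch, assuming the entry looks like the guinea pig example: scan for the link whose rel is "edit-media". The helper names are illustrative, and matching on local tag names is a shortcut so the code works whether or not the entry declares the Atom namespace.

```python
import xml.etree.ElementTree as ET

ENTRY = """<entry>
 <title>A picture of my guinea pig</title>
 <updated>2007-01-24T11:52:29Z</updated>
 <id>urn:f1ef2e50-8ec8-0129-b1a7-003065546f18</id>
 <summary></summary>
 <link rel="edit-media" type="image/jpeg"
       href="http://www.example.com/leonardr/photos/my-guinea-pig.jpg" />
</entry>"""


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def media_uri(entry_xml):
    """Return the URI of the binary representation, or None if there is none."""
    root = ET.fromstring(entry_xml)
    for element in root.iter():
        if local_name(element.tag) == "link" and element.get("rel") == "edit-media":
            return element.get("href")
    return None
```

Here `media_uri(ENTRY)` returns the `.jpg` URI, which is where a client would send a GET to fetch the image or a PUT to replace it.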
Tip
If you read the APP standard (which you should, since this section doesn’t cover everything), you’ll see that it describes this behavior in terms of two different resources: a “Media Link Entry,” whose representation is an Atom document, and a “Media Resource,” whose representation is a binary file. I’ve described one resource that has two representations. The difference is purely philosophical and has no effect on the actual HTTP requests and responses.
Summary
That’s a fairly involved workflow, and I haven’t even covered everything that the APP specifies, but the APP is just a well-thought-out way of handling a common web service problem: the list/feed/collection that keeps having items/elements/members added to it. If your problem fits this domain, it’s easier to use the APP design—and get the benefits of existing client support—than to reinvent something similar (see Table 9-1).
GData
I said earlier that the Atom Publishing Protocol defines only a few resources and only a few operations on those resources. It leaves a lot of space open for extension. One extension is Google’s GData, which adds a new kind of resource and some extras like an authorization mechanism. As of the time of writing, the Google properties Blogger, Google Calendar, Google Code Search, and Google Spreadsheets all expose RESTful web service interfaces. In fact, all four expose the same interface: the Atom Publishing Protocol with the GData extensions.
Unless you work for Google, you probably won’t create any services that expose the precise GData interface, but you may encounter GData from the client side. It’s also useful to see how the APP can be extended to handle common cases. See how Google used the APP as a building block, and you’ll see how you can do the same thing.
Querying collections
The biggest change GData makes is to expose a new kind of resource: the list of search results. The APP says what happens when you send a GET request to a collection’s URI. You get a representation of some of the members in the collection. The APP doesn’t say anything about finding specific subsets of the collection: finding members older than a certain date, written by a certain author, or filed under a certain category. It doesn’t specify how to do full-text search of a member’s text fields. GData fills in these blanks.
GData takes every APP collection and exposes an infinite number of additional resources that slice it in various ways. Think back to the “RESTful News” APP collection I showed in Example 9-2. The URI to that collection was http://www.example.com/RestfulNews. If that collection were exposed through a GData interface, rather than just an APP interface, the following URIs would also work:
http://www.example.com/RestfulNews?q=stadium: A subcollection of the members where the content contains the word “stadium.”
http://www.example.com/RestfulNews/-/local: A subcollection of the members categorized as “local.”
http://www.example.com/RestfulNews?author=Tom%20Servo&max-results=50: At most 50 of the members where the author is “Tom Servo.”
Those are just three of the search possibilities GData exposes. (For a complete list, see the GData developer’s guide. Note that not all GData applications implement all query mechanisms.) Search results are usually represented as Atom feeds. The feed contains an entry element for every member of the collection that matched the query. It also contains OpenSearch elements (q.v.) that specify how many members matched the query, and how many members fit on a page of search results.
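The three example URIs listed above follow simple construction rules, which a client can capture in a helper like the one sketched here. The function name `gdata_query_uri` and its keyword-argument convention are inventions for this example. Only the query parameters shown above (`q`, `author`, `max-results`) and the `/-/` category path come from the text.

```python
from urllib.parse import quote, urlencode


def gdata_query_uri(collection_uri, category=None, **params):
    """Build a GData-style query URI for a collection."""
    uri = collection_uri
    if category:
        uri += "/-/" + quote(category, safe="")   # category goes in the path
    # Python identifiers cannot contain hyphens, so max_results -> max-results.
    query = {key.replace("_", "-"): value
             for key, value in params.items() if value is not None}
    if query:
        # quote_via=quote encodes spaces as %20 rather than '+'.
        uri += "?" + urlencode(query, quote_via=quote)
    return uri
```

For example, `gdata_query_uri("http://www.example.com/RestfulNews", author="Tom Servo", max_results=50)` reproduces the third URI in the list.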
Data extensions
I mentioned earlier that an Atom feed can contain markup from arbitrary other XML namespaces. In fact, I just said that GData search results include elements from the OpenSearch namespace. GData also defines a number of new XML entities in its own “gd” namespace, for representing domain-specific data from the Google web services.
Consider an event in the Google Calendar service. The collection is someone’s calendar and the member is the event itself. This member probably has the typical Atom fields: an author, a summary, a “last updated” date. But it’s also going to have calendar-specific data. When does the event take place? Where will it happen? Is it a one-time event or does it recur?
Google Calendar’s GData API puts this data in its Atom feeds, using tags like gd:when, gd:who, and gd:recurrence. If the client understands Google Calendar’s extensions it can act as a calendar client. If it only understands the APP, it can act as a general APP client. If it only understands the basic Atom feed format, it can treat the list of events as an Atom feed.
POST Once Exactly
POST requests are the fly in the ointment that is reliable HTTP. GET, PUT, and DELETE requests can be resent if they didn’t go through the first time, because of the restrictions HTTP places on those methods. GET requests have no serious side effects, and PUT and DELETE have the same effect on resource state whether they’re sent once or many times. But a POST request can do anything at all, and sending a POST request twice will probably have a different effect from sending it once. Of course, if a service committed to accepting only POST requests whose actions were safe or idempotent, it would be easy to make reliable HTTP requests to that service.
POST Once Exactly (POE) is a way of making HTTP POST idempotent, like PUT and DELETE. If a resource supports POST Once Exactly, then it will only respond successfully to POST once over its entire lifetime. All subsequent POST requests will give a response code of 405 (“Method Not Allowed”). A POE resource is a one-off resource exposed for the purpose of handling a single POST request.
Tip
POE was defined by Mark Nottingham in an IETF draft that expired in 2005. I think POE was a little ahead of its time, and if real services start implementing it, there could be another draft.
You can see the original standard at http://www.mnot.net/drafts/draft-nottingham-http-poe-00.txt.
Think of a “weblog” resource that responds to POST by creating a new weblog entry. How would we change this design so that no resource responds to POST more than once? Clearly the weblog can’t expose POST anymore, or there could only ever be one weblog entry. Here’s how POE does it. The client sends a GET or HEAD request to the “weblog” resource, and the request includes the special POE header:
HEAD /weblogs/myweblog HTTP/1.1
Host: www.example.com
POE: 1
The response contains the URI to a POE resource that hasn’t yet been POSTed to. This URI is nothing more than a unique ID for a future POST request. It probably doesn’t even exist on the server. Remember that GET is a safe operation, so the original GET request couldn’t have changed any server state.
200 OK
POE-Links: /weblogs/myweblog/entry-factory-104a4ed
POE and POE-Links are custom HTTP headers defined by the POE draft. POE just tells the server that the client is expecting a link to a POE resource. POE-Links gives one or more links to POE resources. At this point the client can POST a representation of its new weblog entry to /weblogs/myweblog/entry-factory-104a4ed. After the POST goes through, that URI will start responding to POST with a response code of 405 (“Method Not Allowed”). If the client isn’t sure whether or not the POST request went through, it can safely resend. There’s no possibility that the second POST will create a second weblog entry. POST has been rendered idempotent.
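The server side of this exchange can be sketched as a toy in-memory class. This is an illustration of the idea rather than an implementation of the POE draft: the `Weblog` class is hypothetical, the 201 success code is an assumption, and the sketch accepts any URI it has not seen before instead of validating that it issued the link.

```python
import uuid


class Weblog:
    """Toy in-memory weblog that hands out one-shot POE URIs."""

    def __init__(self, uri):
        self.uri = uri
        self.entries = []
        self.used = set()   # POE URIs that have already accepted a POST

    def head(self, poe=False):
        """Return (status, headers). Generating a link changes no server state."""
        headers = {}
        if poe:
            token = uuid.uuid4().hex[:7]
            headers["POE-Links"] = "%s/entry-factory-%s" % (self.uri, token)
        return 200, headers

    def post(self, poe_uri, entry):
        """Accept the first POST to a POE URI; refuse every later one."""
        if poe_uri in self.used:
            return 405
        self.used.add(poe_uri)
        self.entries.append(entry)
        return 201
```

A client that never saw the first response can resend the same POST to the same POE URI. It gets a 405, and the weblog still holds exactly one copy of the entry.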
The nice thing about POST Once Exactly is that it works with overloaded POST. Even if you’re using POST in a way that totally violates the Resource-Oriented Architecture, your clients can use HTTP as a reliable protocol if you expose the overloaded POST operations through POE.
An alternative to making POST idempotent is to get rid of POST altogether. Remember, POST is only necessary when the client doesn’t know which URI it should PUT to. POE works by generating a unique ID for each of the client’s POST operations. If you allow clients to generate their own unique IDs, they can use PUT instead. You can get the benefits of POE without exposing POST at all. You just need to make sure that two clients will never generate the same ID.
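A sketch of the PUT alternative, assuming clients are allowed to pick their own URIs: the client generates a random UUID, which makes collisions between clients extremely unlikely, and PUTs to a URI built from it. The URI layout and the `entries` dictionary are made up for the example.

```python
import uuid

entries = {}  # URI -> representation; stands in for server-side storage


def new_entry_uri(weblog_uri):
    """Client side: choose a URI that no other client is likely to choose."""
    return "%s/entries/%s" % (weblog_uri, uuid.uuid4())


def put_entry(uri, representation):
    """Server side: PUT is idempotent, so a resend just overwrites."""
    created = uri not in entries
    entries[uri] = representation
    return 201 if created else 200
```

Sending the same PUT twice leaves one entry on the server. The second response is 200 instead of 201, which also tells the client that its first attempt got through.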
Hypermedia Technologies
There are two kinds of hypermedia: links and forms. A link is a connection between the current resource and some target resource, identified by its URI. Less formally, a link is any URI found in the body of a representation. Even JSON and plain text are hypermedia formats of a sort, since they can contain URIs in their text. But throughout this book when I say “hypermedia format,” I mean a format with some kind of structured support for links and forms.
There are two kinds of forms. The simplest kind I’ll call application forms, because they show the client how to manipulate application state. An application form is a way of handling resources whose names follow a pattern: it basically acts as a link with more than one destination. A search engine doesn’t link to every search you might possibly make: it gives you a form with a space for you to type in your search query. When you submit the form, your browser constructs a URI from what you typed into the form (say, http://www.google.com/search?q=jellyfish), and makes a GET request to that URI. The application form lets one resource link to an infinite number of others, without requiring an infinitely large representation.
The second kind of form I’ll call resource forms, because they show the client how to format a representation that modifies the state of a resource. GET and DELETE requests don’t need representations, of course, but POST and PUT requests often do. Resource forms say what the client’s POST and PUT representations should look like.
Links and application forms implement what I call connectedness, and what the Fielding thesis calls “hypermedia as the engine of application state.” The client is in charge of the application state, but the server can send links and forms that suggest possible next states. By contrast, a resource form is a guide to changing the resource state, which is ultimately kept on the server.
I cover four hypermedia technologies in this section. As of the time of writing, XHTML 4 is the only hypermedia technology in active use. But this is a time of rapid change, thanks in part to growing awareness of RESTful web services. XHTML 5 is certain to be widely used once it’s finally released. My guess is that URI Templates will also catch on, whether or not they’re incorporated into XHTML 5. WADL may catch on, or it may be supplanted by a combination of XHTML 5 and microformats.
URI Templates
URI Templates (currently an Internet Draft) are a technology that makes simple application forms look like links. I’ve used URI Template syntax whenever I want to show you an infinite variety of similar URIs. There was this example from Chapter 3, when I was showing you the resources exposed by Amazon’s S3 service:
https://s3.amazonaws.com/{name-of-bucket}/{name-of-object}
That string is not a valid URI, because curly brackets aren’t valid in URIs, but it is a valid URI Template. The substring {name-of-bucket} is a blank to be filled in, a placeholder to be replaced with the value of the variable name-of-bucket. There are an infinite number of URIs lurking in that one template, including https://s3.amazonaws.com/bucket1/object1, https://s3.amazonaws.com/my-other-bucket/subdir/SomeObject.avi, and so on.
URI templating gives us a precise way to play fill-in-the-blanks with URIs. Without URI Templates, a client must rely on preprogrammed URI construction rules based on English descriptions like “https://s3.amazonaws.com/, and then the bucket name.”
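To make the fill-in-the-blanks idea concrete, here is a minimal Ruby sketch of template expansion. It handles only plain {name} placeholders and substitutes values verbatim, so it assumes the values are already safe to put in a URI. The `expand_template` helper is my own name for illustration, not part of the URI Templates draft or of any library.

```ruby
# Minimal URI Template expansion: handles only simple {name} placeholders.
# Values are substituted verbatim, so they are assumed to be URI-safe already.
def expand_template(template, variables)
  template.gsub(/\{([^{}]+)\}/) do
    name = Regexp.last_match(1)
    variables.fetch(name) { raise KeyError, "no value for {#{name}}" }
  end
end

S3_TEMPLATE = 'https://s3.amazonaws.com/{name-of-bucket}/{name-of-object}'

puts expand_template(S3_TEMPLATE,
                     'name-of-bucket' => 'bucket1',
                     'name-of-object' => 'object1')
```

A fuller implementation would also percent-encode values and support the draft's other features; this sketch only shows the substitution step.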
URI Templates are not a data format, but any data format can improve its hypermedia capabilities by allowing them. There is currently a proposal to support URI Templates in XHTML 5, and WADL supports them already.
XHTML 4
HTML is the most successful hypermedia format of all time, but its success on the human web has typecast it as sloppy, and sent practitioners running for the more structured XML. The compromise standard is XHTML, an XML vocabulary for describing documents which uses the same tags and attributes found in HTML. Since it’s basically the same as HTML, XHTML has a powerful set of hypermedia features, though its forms are somewhat anemic.
XHTML 4 links
A number of HTML tags can be used to make hypertext links (consider img, for example), but the two main ones are link and a. A link tag shows up in the document’s head, and connects the document to some resource. The link tag contains no text or other tags: it applies to the entire document. An a tag shows up in the document’s body. It can contain text and other tags, and it links its contents (not the document as a whole) to another resource (see Example 9-8).
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <link rel="alternate" type="application/atom+xml" href="atom.xml" />
    <link rel="stylesheet" href="display.css" />
  </head>
  <body>
    <p>
      Have you read
      <a href="Great-Expectations.html"><i>Great Expectations</i></a>?
    </p>
  </body>
</html>
Example 9-8 shows a simple HTML document that contains both sorts of hyperlinks. There are two links that use link to relate the document as a whole to other URIs, and there’s one link that uses a to relate part of the document (the italicized phrase “Great Expectations”) to another URI.
The three important attributes of link and a tags are href, rel, and rev. The href attribute is the most important: it gives the URI of the resource that’s being linked to. If you don’t have an href attribute, you don’t have a hyperlink.
The rel attribute adds semantics that explain the foreign URI’s relationship to this document. I mentioned this attribute earlier when I was talking about microformats. In Example 9-8, the relationship of the URI atom.xml to this document is “alternate”. The relationship of the URI display.css to this document is “stylesheet”. These particular values for rel are among the 15 defined in the HTML 4 standard. The value “alternate” means that the linked URI is an alternate representation of the resource this document represents. The value “stylesheet” means that the linked URI contains instructions on how to format this document for display.
Microformats often define additional values for rel. The rel-nofollow microformat defines the relationship “nofollow”, to show that a document doesn’t trust the resource it’s linking to.
The rev attribute is the exact opposite of rel: it explains the relationship of this document to the foreign URI. The VoteLinks microformat lets you express your opinion of a URI by setting rev to “vote-for” or “vote-against”. In this case, the foreign URI probably has no relationship to you, but you have a relationship to it.
A simple example illustrates the difference between rel and rev. Here’s an HTML snippet of a user’s home page, which contains two links to his father’s home page.
<a rel="parent" href="/Dad">My father</a>
<a rev="child" href="/Dad">My father</a>
XHTML 4 forms
These are the forms that drive the human web. You might not have known about the rel and rev attributes, but if you’ve done any web programming, you should be familiar with the hypermedia capabilities of XHTML forms.
To recap what you might already know: HTML forms are described with the form tag. A form tag has a method attribute, which names the HTTP method the client should use when submitting the form. It has an action attribute, which gives the (base) URI of the resource the form is accessing. It also has an enctype attribute, which gives the media type of any representation the client is supposed to send along with the request.
A form tag can contain form elements: children like input and select tags. These show up in web browsers as GUI elements: text inputs, checkboxes, buttons, and the like. In application forms, the values entered into the form elements are used to construct the ultimate destination of a GET request. Here’s an application form I just made up: an interface to a search engine.
<form method="GET" action="http://search.example.com/search">
  <input name="q" type="text" />
  <input type="submit" />
</form>
Since this is an application form, it’s not designed to operate on any particular resource. The point of the form is to use the URI in the action as a jumping-off point to an infinity of resources with user-generated URIs: http://search.example.com/search?q=jellyfish, http://search.example.com/search?q=chocolate, and so on.
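What a browser does when it submits such a form can be sketched in a few lines of Ruby. The `application_form_uri` helper is a name I made up for this illustration; it takes the form’s action and the values typed into its fields, and returns the URI the GET request would go to.

```ruby
require 'uri'

# Turn an application form's action URI and field values into a GET URI,
# roughly the way a browser does when the form is submitted.
def application_form_uri(action, fields)
  uri = URI.parse(action)
  uri.query = URI.encode_www_form(fields)
  uri.to_s
end

puts application_form_uri('http://search.example.com/search', 'q' => 'jellyfish')
```

The field values are form-encoded on the way in, so a query containing spaces or punctuation still produces a valid URI.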
A resource form in HTML 4 identifies one particular resource, and it specifies a method of POST. The form elements are used to build up a representation to be sent along with the POST request. Here’s a resource form I just made up: an interface to a file upload script.
<form method="POST" action="http://files.example.com/dir/subdir/"
      enctype="multipart/form-data">
  <input type="text" name="description" />
  <input type="file" name="newfile" />
</form>
This form is designed to manipulate resource state, to create a new “file” resource as a subordinate resource of the “directory” resource at http://files.example.com/dir/subdir/. The representation format is a “multipart/form-data” document that contains a textual description and a (possibly binary) file.
Shortcomings of XHTML 4
HTML 4’s hypermedia features are obviously good enough to give us the human web we enjoy today, but they’re not good enough for web services. I have five major problems with HTML’s forms.
Application forms are limited in the URIs they can express. You’re limited to URIs that take a base URI and then tack on some key-value pairs. With an HTML application form you can “link” to http://search.example.com/search?q=jellyfish, but not http://search.example.com/search/jellyfish. The variables must go into the URI’s query string as key-value pairs.
Resource forms in HTML 4 are limited to using HTTP POST. There’s no way to use a form to tell a client to send a DELETE request, or to show a client what the representation of a PUT request should look like. The human web, which runs on HTML forms, has a different uniform interface from web services as a whole. It uses GET for safe operations, and overloaded POST for everything else. If you want to get HTTP’s uniform interface with HTML 4 forms, you’ll need to simulate PUT and DELETE with overloaded POST (see “Faking PUT and DELETE” in Chapter 8 for the standard way).
There’s no way to use an HTML form to describe the HTTP headers a client should send along with its request. You can’t define a form entity and say “the value of this entity goes into the HTTP request header X-My-Header.” I generally don’t think services should require this of their clients, but sometimes it’s necessary. The Atom Publishing Protocol defines a special request header (Slug, mentioned above) for POST requests that create a new member in a collection. The APP designers defined a new header, instead of requiring that this data go into the entity-body, because the entity-body might be a binary file.
You can’t use an HTML form to specify a representation more complicated than a set of key-value pairs. All the form elements are designed to be turned into key-value pairs, except for the “file” element, which doesn’t help much. The HTML standard defines two content types for form representations: application/x-www-form-urlencoded, which is for key-value pairs (I covered it in “Form-encoding” in Chapter 6); and multipart/form-data, which is for a combination of key-value pairs and uploaded files.
You can specify any content type you want in enctype, just as you can put anything you want in a tag’s class and rel attributes. So you can tell the client it should POST an XML file by setting a form’s enctype to application/xml. But there’s no way of conveying what should go into that XML file, unless it happens to be an XML representation of a bunch of key-value pairs. You can’t nest form elements, or define new ones that represent data structures more complex than key-value pairs. (You can do a little better if the XML vocabulary you’re using has its own media type, like application/atom+xml or application/rdf+xml.)
As I mentioned in “Link the Resources to Each Other” in Chapter 5, you can’t define a repeating field in an HTML form. You can define the same field twice, or ten times, but eventually you’ll have to stop. There’s no way to tell the client: “you can specify as many values as you want for this key-value pair.”
XHTML 5
HTML 5 solves many of the problems that turn up when you try to use HTML on the programmable web. The main problem with HTML 5 is the timetable. The official estimate has HTML 5 being adopted as a W3C Proposed Recommendation in late 2008. More conservative estimates push that date all the way to 2022. Either way, HTML 5 won’t be a standard by the time this book is published. That’s not really the issue, though. The issue is when real clients will start supporting the HTML 5 features I describe below. Until they do, if you use the features of HTML 5, your clients will have to write custom code to interpret them.
HTML 5 forms support all four basic methods of HTTP’s uniform interface: GET, POST, PUT, and DELETE. I took advantage of this when designing my map application, if you’ll recall Example 6-3. This is the easiest HTML 5 feature to support today, especially since (as I’ll show in Chapter 11) most web browsers can already make PUT and DELETE requests.
There’s a proposal (not yet incorporated into HTML 5; see http://blog.welldesignedurls.org/2007/01/11/proposing-uri-templates-for-webforms-2/) that would allow forms to use URI Templates. Under this proposal, an application form can have its template attribute (not its action attribute) be a URI Template like http://search.example.com/search/{q}. It could then define q as a text field within the form. This would let you use an application form to “link” to http://search.example.com/search/jellyfish.
HTML 4 forms can specify more than one form element with the same name. This lets clients know they can submit the same key with 2 or 10 values: as many values as there are form elements. HTML 5 forms support the “repetition model,” a way of telling the client it’s allowed to submit the same key as many times as it wants. I used a simple repetition block in Example 5-11.
Finally, HTML 5 defines two new ways of serializing key-value pairs into representations: as plain text, or using a newly defined XML vocabulary. The content type for the latter is application/x-www-form+xml. This is not as big an advance as you might think. Form entities like input are still ways of getting data in the form of key-value pairs. These new serialization formats are just new ways of representing those key-value pairs. There’s still no way to show the client how to format a more complicated representation, unless the client can figure out the format from just the content type.
WADL
The Web Application Description Language is an XML vocabulary for expressing the behavior of HTTP resources (see the development site for the Java client). It was named by analogy with the Web Service Description Language, a different XML vocabulary used to describe the SOAP-based RPC-style services that characterize Big Web Services.
Look back to “Service document” earlier in this chapter where I describe the Atom Publishing Protocol’s service documents. The representation of a service document is an XML document, written in a certain vocabulary, which describes a set of resources (APP collections) and the operations you’re allowed to perform on those resources. WADL is a standard vocabulary that can do for any resource at all what APP service documents do for APP collection resources.
You can provide a WADL file that describes every resource exposed by your service. This corresponds roughly to a WSDL file in a SOAP/WSDL service, and to the “site map” pages you see on the human web. Alternatively, you can embed a snippet of WADL in an XML representation of a particular resource, the way you might embed an HTML form in an HTML representation. The WADL snippet tells you how to manipulate the state of the resource.
As I said way back in Chapter 2, WADL makes it easy to write clients for web services. A WADL description of a resource can stand in for any number of programming-language interfaces to that resource: all you need is a WADL client written in the appropriate language. WADL abstracts away the details of HTTP requests, and the building and parsing of representations, without hiding HTTP’s uniform interface.
As of the time of writing, WADL is more talked about than used. There’s a Java client implementation, a rudimentary Ruby client, and that’s about it. Most existing WADL files are bootleg descriptions of other people’s RESTful and REST-RPC services.
WADL does better than HTML 5 as a hypermedia format. It supports URI Templates and every HTTP method there is. A WADL file can also tell the client to populate certain HTTP headers when it makes a request. More importantly, WADL can describe representation formats that aren’t just key-value pairs. You can specify the format of an XML representation by pointing to a schema definition. Then you can point out which parts of the document are most important by specifying key-value pairs where the “keys” are XPath statements. This is a small step, but an important one. With HTML you can only specify the format of an XML representation by giving it a different content type.
Of course, the “small step” only applies to XML. You can use WADL to say that a certain resource serves or accepts a JSON document, but unless that JSON document happens to be a hash (key-value pairs again!), there’s no way to specify what the JSON document ought to look like. This is a general problem which was solved in the XML world with schema definitions. It hasn’t been solved for other formats.
Describing a del.icio.us resource
Example 9-9 shows a Ruby client for the del.icio.us web service based on Ruby’s WADL library. It’s a reprint of the code from “Clients Made Easy with WADL” in Chapter 2.
#!/usr/bin/ruby
# delicious-wadl-ruby.rb
require 'wadl'

if ARGV.size != 2
  puts "Usage: #{$0} [username] [password]"
  exit
end
username, password = ARGV

# Load an application from the WADL file
delicious = WADL::Application.from_wadl(open("delicious.wadl"))

# Give authentication information to the application
service = delicious.v1.with_basic_auth(username, password)

begin
  # Find the "recent posts" functionality
  recent_posts = service.posts.recent

  # For every recent post...
  recent_posts.get.representation.each_by_param('post') do |post|
    # Print its description and URI.
    puts "#{post.attributes['description']}: #{post.attributes['href']}"
  end
rescue WADL::Faults::AuthorizationRequired
  puts "Invalid authentication information!"
end
The code’s very short but you can see what’s happening, especially now that we’re past Chapter 2 and I’ve shown you how resource-oriented services work. The del.icio.us web service exposes a resource that the WADL library identifies with v1. That resource has a subresource identified by posts.recent. If you recall the inner workings of del.icio.us from Chapter 2, you’ll recognize this as corresponding to the URI https://api.del.icio.us/v1/posts/recent. When you tell the WADL library to make a GET request to that resource, you get back some kind of response object which includes an XML representation. Certain parts of this representation, the posts, are especially interesting, and I process them as XML elements, extracting their descriptions and hrefs.
Let’s look at the WADL file that makes this code possible. I’ve split it into three sections: resource definition, method definition, and representation definition. Example 9-10 shows the resource definition. I’ve defined a nested set of WADL resources: recent inside posts inside v1. The recent WADL resource corresponds to the HTTP resource the del.icio.us API exposes at https://api.del.icio.us/v1/posts/recent.
<?xml version="1.0"?>
<!-- This is a partial bootleg WADL file for the del.icio.us API. -->
<application xmlns="http://research.sun.com/wadl/2006/07">

  <!-- The resource -->
  <resources base="https://api.del.icio.us/">
    <doc xml:lang="en" title="The del.icio.us API v1">
      Post or retrieve your bookmarks from the social networking website.
      Limit requests to one per second.
    </doc>

    <resource path="v1">
      <param name="Authorization" style="header" required="true">
        <doc xml:lang="en">All del.icio.us API calls must be authenticated
        using Basic HTTP auth.</doc>
      </param>

      <resource path="posts">
        <resource path="recent">
          <method href="#getRecentPosts" />
        </resource>
      </resource>
    </resource>
  </resources>
That HTTP resource exposes a single method of the uniform interface (GET), so I define a single WADL method inside the WADL resource. Rather than define the method inside the resource tag and clutter up Example 9-10, I’ve defined it by reference. I’ll get to it next.
Every del.icio.us API request must include an Authorization header that encodes your del.icio.us username and password using HTTP Basic Auth. I’ve represented this with a param tag that tells the client it must provide an Authorization header. The param tag is the equivalent of an HTML form element: it tells the client about a blank to be filled in.[31]
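For reference, here is roughly what filling in that blank looks like without a WADL library, using Ruby’s standard net/http. The `authorized_get` helper and the credentials are placeholders I made up; the request is built but never sent.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) a GET request that carries an HTTP Basic Auth
# Authorization header for the given credentials.
def authorized_get(uri, username, password)
  request = Net::HTTP::Get.new(URI(uri))
  request.basic_auth(username, password)
  request
end

# Placeholder credentials, not a real account.
request = authorized_get('https://api.del.icio.us/v1/posts/recent',
                         'username', 'password')
puts request['Authorization']
```

The header value is the word “Basic” followed by the Base64 encoding of the username, a colon, and the password.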
Example 9-11 shows the definition of the method getRecentPosts. A WADL method corresponds to a request you might make using HTTP’s uniform interface. The id of the method can be anything, but its name is always the name of an HTTP method: here, “GET”. The method definition models both the HTTP request and response.
  <!-- The method -->
  <method id="getRecentPosts" name="GET">
    <doc xml:lang="en" title="Returns a list of the most recent posts." />

    <request>
      <param name="tag" style="form">
        <doc xml:lang="en" title="Filter by this tag." />
      </param>

      <param name="count" style="form" default="15">
        <doc xml:lang="en" title="Number of items to retrieve.">
          Maximum: 100
        </doc>
      </param>
    </request>

    <response>
      <representation href="#postList" />
      <fault id="AuthorizationRequired" status="401" />
    </response>
  </method>
This particular request defines two more params: two more blanks to be filled in by the client. These are “query” params, which in a GET request means they’ll be tacked onto the query string, just like elements in an HTML form would be. These param definitions make it possible for the WADL client to access URIs like https://api.del.icio.us/v1/posts/recent?count=100 and https://api.del.icio.us/v1/posts/recent?tag=rest&count=20.
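A client that has read the method definition knows which blanks exist, so it can refuse to fill in any others. The sketch below hardcodes the two declared names rather than parsing the WADL file, and the `recent_posts_uri` helper is a made-up name; a real WADL client would derive both from the description.

```ruby
require 'uri'

RECENT_POSTS = 'https://api.del.icio.us/v1/posts/recent'
# The two blanks declared by the getRecentPosts method definition
# (hardcoded here; a real client would read them from the WADL file).
DECLARED_PARAMS = %w[tag count].freeze

# Build the URI for a GET request, allowing only the declared params.
def recent_posts_uri(values = {})
  unknown = values.keys.map(&:to_s) - DECLARED_PARAMS
  raise ArgumentError, "undeclared param(s): #{unknown.join(', ')}" unless unknown.empty?

  uri = URI.parse(RECENT_POSTS)
  uri.query = URI.encode_www_form(values) unless values.empty?
  uri.to_s
end

puts recent_posts_uri('count' => 100)
puts recent_posts_uri('tag' => 'rest', 'count' => 20)
```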
This WADL method defines an application form: not a way of manipulating resource state, but a pointer to possible new application states. This method tag tells the client about an infinite number of GET requests they can make to a set of related resources, without having to list infinitely many URIs. If this method corresponded to a PUT or POST request, its request might be a resource form, a way of manipulating resource state. Then it might describe a representation for you to send along with your request.
The response does describe a representation: the response document you get back from del.icio.us when you make one of these GET requests. It also describes a possible fault condition: if you submit a bad Authorization header, you’ll get a response code of 401 (“Unauthorized”) instead of a representation.
Take a look at Example 9-12, which defines the representation. This is WADL’s description of the XML document you receive when you GET https://api.del.icio.us/v1/posts/recent: a document like the one in Example 2-3.
  <!-- The representation -->
  <representation id="postList" mediaType="text/xml" element="posts">
    <param name="post" path="/posts/post" repeating="true" />
  </representation>

</application>
The WADL description gives the most important points about this document: its content type is text/xml, and it’s rooted at the posts tag. The param tag points out that the posts tag has a number of interesting children: the post tags. The param’s path attribute gives an XPath expression which the client can use on the XML document to fetch all the del.icio.us posts. My client’s call to each_by_param('post') runs that XPath expression against the document, and lets me operate on each matching element without having to know anything about XPath or the structure of the representation.
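To show what the library is sparing you, here is a sketch that runs the same XPath expression by hand with REXML. The sample document is made up for this illustration: it only mimics the posts and post tags and the href and description attributes the client code relies on. The `posts_in` helper is likewise hypothetical.

```ruby
require 'rexml/document'

# Made-up sample data in the shape the client code expects.
SAMPLE = <<~XML
  <posts user="username">
    <post href="http://www.example.com/" description="An example site" />
    <post href="http://www.example.org/" description="Another example" />
  </posts>
XML

# Return every element matched by the param's XPath expression.
def posts_in(xml, path = '/posts/post')
  REXML::XPath.match(REXML::Document.new(xml), path)
end

posts_in(SAMPLE).each do |post|
  puts "#{post.attributes['description']}: #{post.attributes['href']}"
end
```

With the WADL description in hand, the path lives in the WADL file instead of the client code, so a change to the document structure means editing the description rather than every client.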
There’s no schema definition for this kind of XML representation: it’s a very simple document and del.icio.us just assumes you can figure out the format. But for the sake of demonstration, let’s pretend this representation has an XML Schema Definition (XSD) file. The URI of this imaginary definition is https://api.del.icio.us/v1/posts.xsd, and it defines the schema for the posts and post tags. In that fantasy situation, Example 9-13 shows how I might define the representation in terms of the schema file.
<?xml version="1.0"?>
<!-- This is a partial bootleg WADL file for the del.icio.us API. -->
<application xmlns="http://research.sun.com/wadl/2006/07"
             xmlns:delicious="https://api.del.icio.us/v1/posts.xsd">

  <grammars>
    <include href="https://api.del.icio.us/v1/posts.xsd" />
  </grammars>

  ...

  <representation id="postList" mediaType="text/xml"
                  element="delicious:posts" />

  ...

</application>
I no longer need a param to say that this document is full of post tags. That information’s in the XSD file. I just have to define the representation in terms of that file. I do this by referencing the XSD file in this WADL file’s grammars, assigning it to the delicious: namespace, and scoping the representation’s element attribute to that namespace. If the client is curious about what a delicious:posts tag might contain, it can check the XSD. Even though the XSD completely describes the representation format, I might define some param tags anyway to point out especially important parts of the document.
Describing an APP collection
That was a pretty simple example. I used an application form to describe an infinite set of related resources, each of which responds to GET by sending a simple XML document. But I can use WADL to describe the behavior of any resource that responds to the uniform interface. If a resource serves an XML representation, I can reach into that representation with param tags: show where the interesting bits of data are, and where the links to other resources can be found.
Earlier I compared WADL files to the Atom Publishing Protocol’s service documents. Both are XML vocabularies for describing resources. Service documents describe APP collections, and WADL documents describe any resource at all. You’ve seen how a service document describes a collection (Example 9-6). What would a WADL description of the same resources look like?
As it happens, the WADL standard gives just this example. Section A.2 of the standard shows an APP service document and then a WADL description of the same resources. I’ll present a simplified version of this idea here.
The service document in Example 9-6 describes three Atom collections. One accepts new Atom entries via POST, and the other two accept image files. These collections are pretty similar. In an object-oriented system I might factor out the differences by defining a class hierarchy. I can do something similar in WADL. Instead of defining all three resources from scratch, I’m going to define two resource types. Then it’ll be simple to define individual resources in terms of the types (see Example 9-14).
<?xml version="1.0"?>
<!-- This is a description of two common types of resources that respond
     to the Atom Publishing Protocol. -->
<application xmlns="http://research.sun.com/wadl/2006/07"
             xmlns:app="http://purl.org/atom/app">

  <!-- An Atom collection accepts Atom entries via POST. -->
  <resource_type id="atom_collection">
    <method href="#getCollection" />
    <method href="#postNewAtomMember" />
  </resource_type>

  <!-- An image collection accepts image files via POST. -->
  <resource_type id="image_collection">
    <method href="#getCollection" />
    <method href="#postNewImageMember" />
  </resource_type>
There are my two resource types: the Atom collection and the image collection. These don’t correspond to any specific resources: they’re equivalent to classes in an object-oriented design. Both “classes” support a method identified as getCollection, but the Atom collection supports a method postNewAtomMember where the image collection supports postNewImageMember. Example 9-15 shows those three methods:
  <!-- Three possible operations on resources. -->
  <method name="GET" id="getCollection">
    <response>
      <representation href="#feed" />
    </response>
  </method>

  <method name="POST" id="postNewAtomMember">
    <request>
      <representation href="#entry" />
    </request>
  </method>

  <method name="POST" id="postNewImageMember">
    <request>
      <representation id="image" mediaType="image/*" />
      <param name="Slug" style="header" />
    </request>
  </method>
The getCollection WADL method is revealed as a GET operation that expects an Atom feed (to be described) as its representation. The postNewAtomMember method is a POST operation that sends an Atom entry (again, to be described) as its representation. The postNewImageMember method is also a POST operation, but the representation it sends is an image file, and it knows how to specify a value for the HTTP header Slug.
Finally, Example 9-16 describes the two representations: Atom feeds and Atom entries. I don’t need to describe these representations in great detail because they’re already described in the XML Schema Document for Atom: I can just reference the XSD file. But I’m free to annotate the XSD by defining param elements that tell a WADL client about the links between resources.
  <!-- Two possible XML representations. -->
  <representation id="feed" mediaType="application/atom+xml"
                  element="atom:feed" />

  <representation id="entry" mediaType="application/atom+xml"
                  element="atom:entry" />

</application>
I can make the file I just defined available on the Web: say, at http://www.example.com/app-resource-types.wadl. Now it’s a resource. I can use it in my services by referencing its URI. So can anyone else. It’s now possible to define certain APP collections in terms of these resource types. My three collections are defined in just a few lines in Example 9-17.
<?xml version="1.0"?>
<!-- This is a description of three "collection" resources that respond
     to the Atom Publishing Protocol. -->
<application xmlns="http://research.sun.com/wadl/2006/07"
             xmlns:app="http://purl.org/atom/app">

  <resources base="http://www.example.com/">
    <resource path="RESTfulNews"
      type="http://www.example.com/app-resource-types.wadl#atom_collection" />
    <resource path="samruby/photos"
      type="http://www.example.com/app-resource-types.wadl#image_collection" />
    <resource path="leonardr/photos"
      type="http://www.example.com/app-resource-types.wadl#image_collection" />
  </resources>

</application>
The Atom Publishing Protocol is popular because it’s such a general interface. The major differences between two APP services are described in the respective service documents. A generic APP client can read these documents and reprogram itself to act as a client for many different services. But there’s an even more general interface: the uniform interface of HTTP. An APP service document uses a domain-specific XML vocabulary, but hypermedia formats like HTML and WADL can be used to describe any web service at all. Their clients can be even more general than APP clients.
Hypermedia is how one service communicates the ways it differs from other services. If that intelligence is embedded in hypermedia, the programmer needs to hardwire less of it in code. More importantly, hypermedia gives you access to the link: the second most important web technology after the URI. The potential of REST will not be fully exploited until web services start serving their representations as link-rich hypermedia instead of plain media.
Is WADL evil?
In Chapter 10 I’ll talk about how WSDL turned SOAP from a simple XML envelope format to a name synonymous with the RPC style of web services. WSDL abstracts away the details of HTTP requests and responses, and replaces them with a model based on method calls in a programming language. Doesn’t WADL do the exact same thing? Should we worry that WADL will do to plain-HTTP web services what WSDL did to SOAP web services: tie them to the RPC style in the name of client convenience?
I think we’re safe. WADL abstracts away the details of HTTP requests and responses, but—this is the key point—it doesn’t add any new abstraction on top. Remember, REST isn’t tied to HTTP. When you abstract HTTP away from a RESTful service, you’ve still got REST. A resource-oriented web service exposes resources that respond to a uniform interface: that’s REST. A WADL document describes resources that respond to a uniform interface: that’s REST. A program that uses WADL creates objects that correspond to resources, and accesses them with method calls that embody a uniform interface: that’s REST. RESTfulness doesn’t live in the protocol. It lives in the interface.
About the worst you can do with WADL is hide the fact that a service responds to the uniform interface. I’ve deliberately not shown you how to do this, but you should be able to figure it out. You may need to do this if you’re writing a WADL file for a web application or REST-RPC hybrid service that doesn’t respect the uniform interface.
I’m fairly sure that WADL itself won’t tie HTTP to an RPC model, the way WSDL did to SOAP. But what about those push-button code generators, the ones that take your procedure-call-oriented code and turn it into a “web service” that only exposes one URI? WADL makes you define your resources, but what if tomorrow’s generator creates a WADL file that only exposes a single “resource”, the way an autogenerated WSDL file exposes a single “endpoint”?
This is a real worry. Fortunately, WADL’s history is different from WSDL’s. WSDL was introduced at a time when SOAP was still officially associated with the RPC style. But WADL is being introduced as people are becoming aware of the advantages of REST, and it’s marketed as a way to hide the details while keeping the RESTful interface. Hopefully, any tool developers who want to make their tools support WADL will also be interested in making their tools support RESTful design.
[28] OpenSearch also defines a simple control flow: a special kind of resource called a “description document.” I’m not covering OpenSearch description documents in this book, mainly for space reasons.
[29] This is specified, and argued for, in RFC 3023.
[30] Again, according to RFC 3023, which few developers have read. For a lucid explanation of these problems, see Mark Pilgrim’s article “XML on the Web Has Failed” (http://www.xml.com/pub/a/2004/07/21/dive.html).
[31] Marc Hadley, the primary author of the WADL standard, is working on more elegant ways of representing the need to authenticate.