Appendix C. The HTTP Header Top Infinity
There are already two excellent guides to the standard HTTP headers. One’s in the HTTP standard itself, and the other’s in print, in Appendix C of HTTP: The Definitive Guide by Brian Totty and David Gourley (O’Reilly). In this appendix I give a somewhat perfunctory description of the standard HTTP headers. For each header, I’ll say whether it’s found in HTTP requests, responses, or both. I’ll give my opinion as to how useful the header is when building resource-oriented web services, as opposed to other HTTP-based software like web applications and HTTP proxies. I’ll give a short description of the header, which will get a little longer for tricky or especially important headers. I won’t go into detail on what the header values should look like. I figure you’re smart and you can look up more detailed information as needed.
In Chapter 1 I compared an HTTP request or response to an envelope that contains a document (an entity-body). I compared HTTP headers to informational stickers on the envelope. It’s considered very bad form to come up with your own HTTP methods or response codes, but it’s fine to come up with your own stickers. After covering the standard HTTP headers I’ll mention a few custom headers that have become de facto parts of HTTP, like Cookie; or that are used in important technologies, like WSSE’s X-WSSE and the Atom Publishing Protocol’s Slug.
Custom headers are the most common way of extending HTTP. So long as client and server agree on what the headers mean, you can send any information you like along with a request or response. The guidelines are: don’t reinvent an existing header, don’t put things in headers that belong in the entity-body, and follow the naming convention. The names of custom headers should start with the string “X-,” meaning “extension.” The convention makes it clear that your headers are extension headers, and avoids any conflict with future official HTTP headers.
Amazon’s S3, covered in Chapter 3, is a good example of a service that defines custom headers. Not only does it define headers like X-amz-acl and X-amz-date, it specifies that S3 clients can send any header whose name begins with “X-amz-meta-.” The header name and value are associated with an object as a key-value pair, letting you store arbitrary metadata with your buckets and objects. This is a naming convention inside a naming convention.
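As a rough sketch of the nested convention, here is how a client might pull the metadata pairs out of a set of headers. The function name and the sample header values are made up for illustration; only the “X-amz-meta-” prefix comes from the text above.

```python
META_PREFIX = "x-amz-meta-"  # header names are case-insensitive, so compare in lowercase


def extract_metadata(headers):
    """Return the key-value pairs carried by X-amz-meta-* headers."""
    metadata = {}
    for name, value in headers.items():
        lowered = name.lower()
        if lowered.startswith(META_PREFIX):
            # Whatever follows the prefix is the metadata key.
            metadata[lowered[len(META_PREFIX):]] = value
    return metadata


# Hypothetical headers describing a stored object.
sample_headers = {
    "Content-Type": "image/jpeg",
    "X-amz-meta-photographer": "Alice",
    "X-amz-meta-camera": "Pinhole",
}
print(extract_metadata(sample_headers))
```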
Standard Headers
These are the 46 headers listed in the HTTP standard.
Accept
Type: Request header.
Importance: Medium.
The client sends an Accept
header to
tell the server what data formats it would prefer the server use in
its representations. One client might want a JSON representation;
another might want an RDF representation of the same data.
Hiding this information inside the HTTP headers is a good idea
for web browsers, but it shouldn’t be the only solution for web
service clients. I recommend exposing different representations using
different URIs. This doesn’t mean you have to impose crude rules like
appending .html to the URI for an
HTML representation (though that’s what Rails does). But I think the
information should be in the URI somehow. If you want to support
Accept
on top of this, that’s great
(Rails does this too).
Accept-Charset
Type: Request header.
Importance: Low.
The client sends an Accept-Charset
header to tell the server
what character set it would like the server to use in its
representations. One client might want the representation of a
resource containing Japanese text to be encoded in UTF-8; another
might want a Shift-JIS encoding of the same data.
As I said in Chapter 8, your headaches will be fewer if you pick a Unicode encoding (either UTF-8 or UTF-16) and stick with it. Any modern client should be able to handle these encodings.
Accept-Encoding
Type: Request header.
Importance: Medium to high.
The client sends an Accept-Encoding
header to tell the server that it can save some bandwidth by
compressing the response entity-body with a well-known algorithm like
compress or gzip. Despite the name, this has nothing to do with
character set encoding; that’s Accept-Charset
.
Technically, Accept-Encoding
could be used to apply some other kind of transform to the
entity-body: applying rot13 encryption to all of its text, maybe. In
practice, it’s only used to compress data.
Accept-Language
Type: Request header.
Importance: Low.
The client sends an Accept-Language
header to tell the server what human language it would like the server
to use in its representations. For an example, see Chapter 4 and its discussion of a press release that’s
available in both English and Spanish.
As with media types, I think that a web service should expose
different-language representations of a given resource with different
URIs. Supporting Accept-Language
on
top of this is a bonus.
Accept-Ranges
Type: Response header.
Importance: Low to medium.
The server sends this header to indicate that it supports partial HTTP
GET (see Chapter 8) for the requested URI. A
client can make a HEAD request to a URI, parse the value of this
response header, and then send a GET request to the same URI,
providing an appropriate Range
header.
Age
Type: Response header.
Importance: Low.
If the response entity-body does not come fresh from the server, the
Age
header is a measure of how long
ago it left the server. This header is usually set by HTTP caches, so
that the client knows it might be getting an old copy of a
representation.
Allow
Type: Response header.
Importance: Potentially high, currently low.
I discuss this header in “HEAD and OPTIONS” in Chapter 4. It’s sent in response to an OPTIONS request and tells the client which subset of the uniform interface a particular URI exposes. This header will become much more important if people ever start using OPTIONS.
Authorization
Type: Request header.
Importance: Very high.
This request header contains authorization credentials, such as a username and password, which the client has encoded according to some agreed-upon scheme. The server decodes the credentials and decides whether or not to carry out the request.
In theory, this is the only authorization header anyone should ever need (except for Proxy-Authorization, which works on a different level), because it’s extensible. The most common schemes are HTTP Basic and HTTP Digest, but the scheme can be anything, so long as both client and server understand it. In practice, HTTP itself has been extended, with unofficial request headers like X-WSSE that work on top of Authorization. See the X-WSSE entry below for the reason why.
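To make “encoded according to some agreed-upon scheme” concrete, here is a minimal sketch of the HTTP Basic scheme, in which the client joins the username and password with a colon and base64-encodes the result. The function names are my own, and the sample credentials are the well-known ones from the HTTP authentication RFC.

```python
import base64


def basic_authorization(username, password):
    """Client side: build the value of an Authorization header for HTTP Basic."""
    pair = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(pair).decode("ascii")


def parse_basic_authorization(value):
    """Server side: recover (username, password), or None for another scheme."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic":
        return None
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header_value = basic_authorization("Aladdin", "open sesame")
print(header_value)
print(parse_basic_authorization(header_value))
```

Base64 is an encoding, not encryption, so anyone who sees the header can read the password. That is one reason schemes like Digest exist.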
Cache-Control
Type: Request and response header.
Importance: Medium.
This header contains a directive to any caches between the client and the server (including any caches on the client or server themselves). It spells out the rules for how the data should be cached and when it should be dumped. I cover some simple caching rules and recipes in “Caching” in Chapter 8.
Connection
Type: Response header.
Importance: Low.
Most of an HTTP response is a communication from the server to the client.
Intermediaries like proxies can look at the response, but nothing in
there is aimed at them. But a server can insert extra headers that are
aimed at a proxy, and one proxy can insert headers that are aimed at
the next proxy in a chain. When this happens, the special headers are
named in the Connection
header.
These headers apply to the TCP connection between one machine and
another, not to the HTTP connection between server and client. Before
passing on the response, the proxy is supposed to remove the special
headers and the Connection
header
itself. Of course, it may add its own special communications, and a
new Connection
header, if it
wants.
Here’s a quick example, since this isn’t terribly relevant to this book. The server might send these three HTTP headers in a response that goes through a proxy:
Content-Type: text/plain
X-Proxy-Directive: Deliver this as fast as you can!
Connection: X-Proxy-Directive
The proxy would remove X-Proxy-Directive
and Connection
, and send the one remaining
header to the client:
Content-Type: text/plain
If you’re writing a client and not using proxies, the only value
you’re likely to see for Connection
is “close.” That just says that the server will close the TCP
connection after completing this request, which is probably what you
expected anyway.
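The proxy’s side of this bargain can be sketched in a few lines. This is only an illustration, assuming the headers arrive as a list of name-value pairs; the function name is made up, and X-Proxy-Directive is the invented header from the example above.

```python
def strip_hop_by_hop(headers):
    """Drop the Connection header and every header it names.

    `headers` is a list of (name, value) pairs as received by the proxy.
    """
    named = set()
    for name, value in headers:
        if name.lower() == "connection":
            # The value is a comma-separated list of header names (or "close").
            named.update(token.strip().lower() for token in value.split(","))
    named.discard("")
    return [
        (name, value)
        for name, value in headers
        if name.lower() != "connection" and name.lower() not in named
    ]


received = [
    ("Content-Type", "text/plain"),
    ("X-Proxy-Directive", "Deliver this as fast as you can!"),
    ("Connection", "X-Proxy-Directive"),
]
print(strip_hop_by_hop(received))
```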
Content-Encoding
Type: Response header.
Importance: Medium to high.
This response header is the counterpart to the request header Accept-Encoding
. The request header asks the
server to compress the entity-body using a certain algorithm. This
header tells the client which algorithm, if any, the server actually
used.
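A client that sends Accept-Encoding has to be ready to undo whatever the server reports here. The following is a minimal sketch using Python’s standard library; the function name is hypothetical, and it handles only gzip and zlib-wrapped deflate, not the older compress algorithm.

```python
import gzip
import zlib


def decode_body(body, content_encoding=None):
    """Undo the compression named by a Content-Encoding value."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "identity":
        return body  # the server did not compress the entity-body
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "deflate":
        # Assumes the zlib-wrapped form; some servers send raw deflate instead.
        return zlib.decompress(body)
    raise ValueError("unsupported Content-Encoding: " + encoding)


compressed = gzip.compress(b"hello, web service")
print(decode_body(compressed, "gzip"))
```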
Content-Language
Type: Response header.
Importance: Medium.
This response header is the counterpart to the Accept-Language
request header, or to a
corresponding variable set in a resource’s URI. It specifies the
natural language a human must understand to get meaning out of the
entity-body.
There may be multiple languages listed here. If the entity-body is a movie in Mandarin with Japanese subtitles, the value for Content-Language might be “zh-guoyu,ja.” If one English phrase shows up in the movie, “en” would probably not show up in the Content-Language header.
Content-Length
Type: Response header.
Importance: High.
This response header gives the size of the entity-body in bytes. This is important for two
reasons: first, a client can read this and prepare for a small
entity-body or a large one. Second, a client can make a HEAD request
to find out how large the entity-body is, without actually requesting
it. The value of Content-Length
might affect the client’s decision to fetch the entire entity-body,
fetch part of it with Range
, or not
fetch it at all.
Content-Location
Type: Response header.
Importance: Low.
This header tells the client the canonical URI of the resource it requested. Unlike
with the value of the Location
header, this is purely informative. The client is not expected to
start using the new URI.
This is mainly useful for services that assign different URIs to different representations of the same resource. If the client wants to link to the specific representation obtained through content negotiation, it can use the URI given in Content-Location. So if you request /releases/104, and use the Accept and Accept-Language headers to specify an HTML representation written in English, you might get back a response that specifies /releases/104.html.en as the value for Content-Location.
Content-MD5
Type: Response header.
Importance: Low to medium.
This is a cryptographic checksum of the entity-body. The client can use this to check whether
or not the entity-body was corrupted in transit. An attacker (such as
a man-in-the-middle) can change the entity-body and change the
Content-MD5
header to match, so
it’s no good for security, just error detection.
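Checking the header takes only a few lines. The value of Content-MD5 is the base64 encoding of the 128-bit MD5 digest of the entity-body; the function names below are my own.

```python
import base64
import hashlib


def content_md5(body):
    """Compute a Content-MD5 value: the base64 form of the body's MD5 digest."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def body_is_intact(body, header_value):
    """Return True if the received body matches the Content-MD5 header."""
    return content_md5(body) == header_value.strip()


body = b"Hello, world"
header = content_md5(body)
print(body_is_intact(body, header))         # the body arrived unchanged
print(body_is_intact(body + b"!", header))  # the body was altered in transit
```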
Content-Range
Type: Response header.
Importance: Low to medium.
When the client makes a partial GET request with the Range
request header, this response header
says what part of the representation the client is getting.
Content-Type
Type: Response header.
Importance: Very high.
Definitely the most famous response header. This header tells the client what kind of thing the entity-body is. On the human web, a web browser uses this to decide if it can display the entity-body inline, and which external program it must run if not. On the programmable web, a web service client usually uses this to decide which parser to apply to the entity-body.
Date
Type: Request and response header.
Importance: High for request, required for response.
As a request header, this represents the time on the client at the time the
request was sent. As a response header, it represents the time on the
server at the time the request was fulfilled. As a response header,
Date
is used by caches.
ETag
Type: Response header.
Importance: Very high.
The value of ETag
is an
opaque string designating a specific version of a
representation. Whenever the representation changes, the ETag
should also change.
Whenever possible, this header ought to be sent in response to
GET requests. Clients can use the value of ETag
in future conditional GET requests, as
the value of If-None-Match
. If the
representation hasn’t changed, the ETag hasn’t changed either, and the
server can save time and bandwidth by not sending the representation
again.
The main driver of conditional GET requests is the simpler Last-Modified response header, and its request counterpart If-Modified-Since. The main purpose of ETag is to provide a second line of defense. If a representation changes twice in one second, it will take on only one value for Last-Modified, but two different values for ETag.
Expect
Type: Request header.
Importance: Medium, but rarely used (as of time of writing).
This header is used to signal a look-before-you-leap (LBYL) request (covered in Chapter 8). The server will send the response code 100 (“Continue”) if the client should “leap” ahead and make the real request. It will send the response code 417 (“Expectation Failed”) if the client should not “leap.”
Expires
Type: Response header.
Importance: Medium.
This header tells the client, or a proxy between the server and client, that it may
cache the response (not just the entity-body!) until a certain time.
Even a conditional HTTP GET makes an HTTP connection and takes time
and resources. By paying attention to Expires
, a client can avoid the need to make
any HTTP requests at all—at least for a while. I cover caching briefly
in Chapter 8.
The client should take the value of Expires
as a rough guide, not as a promise
that the entity-body won’t change until that time.
From
Type: Request header.
Importance: Very low.
This header works just like the From
header in an email message. It gives an email address associated with
the person making the request. This is never used on the human web
because of privacy concerns, and it’s used even less on the
programmable web, where the clients aren’t under the control of human
beings. You might want to use it as an extension to User-Agent
.
Host
Type: Request header.
Importance: Required.
This header contains the domain name part of the URI. If a client makes a GET
request for http://www.example.com/page.html,
then the URI path is /page.html
and
the value of the Host
header is
“www.example.com” or “www.example.com:80.”
From the client’s point of view, this may seem like a strange
header to require. It’s required because an HTTP 1.1 server can host
any number of domains on a single IP address. This feature is called
“name-based virtual hosting,” and it saves someone who owns multiple
domain names from having to buy a separate computer and/or network
card for each one. The problem is that an HTTP client sends requests
to an IP address, not to a domain name. Without the Host
header, the server has no idea which of
its virtual hosts is the target of the client’s request.
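Here is a toy version of the lookup a server performs for name-based virtual hosting. The function name, the domain names, and the directory paths are all hypothetical, and the sketch ignores IPv6 literals, which contain colons of their own.

```python
def pick_virtual_host(host_header, sites):
    """Return the site configured for the hostname in a Host header, or None."""
    # Drop an optional ":port" suffix; hostnames are case-insensitive.
    hostname = host_header.strip().lower().partition(":")[0]
    return sites.get(hostname)


# Hypothetical mapping from domain name to document root, all on one IP address.
sites = {
    "www.example.com": "/var/www/example",
    "www.example.org": "/var/www/other",
}
print(pick_virtual_host("www.example.com:80", sites))
print(pick_virtual_host("WWW.EXAMPLE.ORG", sites))
```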
If-Match
Type: Request header.
Importance: Medium.
This header is best described in terms of other headers. It’s used like If-Unmodified-Since (described later), to make HTTP actions other than GET conditional. But where If-Unmodified-Since takes a time as its value, this header takes an ETag as its value.
Tersely, this header is to If-None-Match and ETag as If-Unmodified-Since is to If-Modified-Since and Last-Modified.
If-Modified-Since
Type: Request header.
Importance: Very high.
This request header is the backbone of conditional HTTP GET. Its value is a previous value of the Last-Modified response header, obtained from a previous request to this URI. If the resource has changed since that last request, its new Last-Modified date is more recent than the one the client sent. That means that the condition If-Modified-Since is met, and the server sends the new entity-body. If the resource has not changed, the Last-Modified date is the same as it was, and the condition If-Modified-Since fails. The server sends a response code of 304 (“Not Modified”) and no entity-body. That is, conditional HTTP GET succeeds if this condition fails.
Since Last-Modified
is only
accurate to within one second, conditional HTTP GET can occasionally
give the wrong result if it relies only on If-Modified-Since
. This is the main reason
why we also use ETag
and If-None-Match
.
If-None-Match
Type: Request header.
Importance: Very high.
This header is also used in conditional HTTP GET. Its value is a previous value
of the ETag
response header,
obtained from a previous request to this URI. If the ETag has changed
since that last request, the condition If-None-Match
succeeds and the server sends
the new entity-body. If the ETag is the same as before, the condition
fails, and the server sends a response code of 304 (“Not Modified”)
with no entity-body.
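The server’s half of conditional GET, using both this header and If-Modified-Since, might look something like the sketch below. The function name is made up, and two simplifications are assumptions of mine: ETags are compared as plain strings, and If-None-Match wins when the client sends both headers.

```python
from email.utils import parsedate_to_datetime


def conditional_get_status(request_headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still good, else 200.

    `last_modified` is an HTTP-date string, like the one the server
    would send in its Last-Modified header.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        unchanged = current_etag in candidates or "*" in candidates
        return 304 if unchanged else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        changed = parsedate_to_datetime(last_modified)
        seen = parsedate_to_datetime(if_modified_since)
        return 200 if changed > seen else 304

    return 200  # an unconditional request always gets the entity-body


etag = '"v2"'
modified = "Tue, 15 Nov 1994 12:45:26 GMT"
print(conditional_get_status({"If-None-Match": '"v2"'}, etag, modified))
print(conditional_get_status({"If-None-Match": '"v1"'}, etag, modified))
print(conditional_get_status({"If-Modified-Since": modified}, etag, modified))
```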
If-Range
Type: Request header.
Importance: Low.
This header is used to make a conditional partial GET request. The value of the header comes from the ETag or Last-Modified response header from a previous request. If the representation has not changed since then, the server sends only the range the client asked for, with a response code of 206 (“Partial Content”). If the representation has changed, the server ignores the Range header and sends the entire new representation, with a response code of 200 (“OK”).
Conditional partial GET is not used very often. It mainly helps a client that already has part of a large representation and wants the rest, without making a separate request to check whether its partial copy is still current.
If-Unmodified-Since
Type: Request header.
Importance: Medium.
Normally a client uses the value of the response header Last-Modified
as the value of the request header If-Modified-Since
to perform a conditional
GET request. This header also takes the value of Last-Modified
, but it’s usually used for
making HTTP actions other than GET into conditional actions.
Let’s say you and many other people are interested in modifying a particular resource. You fetch a representation, modify it, and send it back with a PUT request. But someone else has modified it in the meantime, and you either get a response code of 409 (“Conflict”), or you put the resource into a state you didn’t intend.
If you make your PUT request conditional on If-Unmodified-Since
, then if someone else
has changed the resource your request will always get a response code
of 412 (“Precondition Failed”). You can refetch the representation and
decide what to do with the new version that someone else
modified.
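On the server side, the check behind that 412 could be sketched like this. The function name and the dates are invented for illustration, and a real service would go on to apply the PUT where this sketch returns 200.

```python
from email.utils import parsedate_to_datetime


def conditional_put_status(request_headers, last_modified):
    """Return 412 if the resource changed after the client's date, else 200."""
    condition = request_headers.get("If-Unmodified-Since")
    if condition is None:
        return 200  # unconditional PUT: carry it out
    if parsedate_to_datetime(last_modified) > parsedate_to_datetime(condition):
        return 412  # someone else modified the resource in the meantime
    return 200


fetched = "Tue, 15 Nov 1994 12:45:26 GMT"   # Last-Modified when you fetched it
changed = "Tue, 15 Nov 1994 13:00:00 GMT"   # Last-Modified after someone's edit
print(conditional_put_status({"If-Unmodified-Since": fetched}, fetched))
print(conditional_put_status({"If-Unmodified-Since": fetched}, changed))
```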
This header can be used with GET, too; see the Range
header for an example.
Last-Modified
Type: Response header.
Importance: Very high.
This header makes conditional HTTP GET possible. It tells the client the
last time the representation changed. The client can keep track of
this date and use it in the If-Modified-Since
header of a future
request.
In web applications, Last-Modified
is usually the current time,
which makes conditional HTTP GET useless. Web services should try to
do a little better, since web service clients often besiege their
servers with requests for the same URIs over and over again. See “Conditional GET” in Chapter 8 for
ideas.
Location
Type: Response header.
Importance: Very high.
This is a versatile header with many related functions. It’s heavily associated with the 3xx (“Redirection”) response codes, and much of the confusion surrounding HTTP redirects has to do with how this header should be interpreted.
This header usually tells the client which URI it should be using to access a resource; presumably the client doesn’t already know. This might be because the client’s request created the resource—response code 201 (“Created”)—or caused the resource to change URIs—301 (“Moved Permanently”). It may also be because the client used a URI that’s not quite right, though not so wrong that the server didn’t recognize it. In that case the response code might be 301 again, or 307 (“Temporary Redirect”) or 302 (“Found”).
Sometimes the value of Location
is just a default URI: one of many
possible resolutions to an ambiguous request, e.g., 300 (“Multiple
Choices”). Sometimes the value of Location
points not to the resource the
client tried to access, but to some other resource that provides
supplemental information, e.g., 303 (“See Other”).
As you can see, this header can only be understood in the context of a particular HTTP response code. Refer to the appropriate section of Appendix B for more details.
Max-Forwards
Type: Request header.
Importance: Very low.
This header is mainly used with the TRACE method, which is used to track the proxies
that handle a client’s HTTP request. I don’t cover TRACE in this book,
but as part of a TRACE request, Max-Forwards
is used to limit how many
proxies the request can be sent through.
Pragma
Type: Request or response.
Importance: Very low.
The Pragma
header
is a spot for special directives between the client,
server, and intermediaries such as proxies. The only official pragma
is “no-cache,” which is obsolete in HTTP 1.1: it’s the same as sending
a value of “no-cache” for the Cache-Control
header. You may define your
own HTTP pragmas, but it’s better to define your own HTTP headers
instead. See, for instance, the X-Proxy-Directive
header I made up while
explaining the Connection
header.
Proxy-Authenticate
Type: Response header.
Importance: Low to medium.
Some clients (especially in corporate environments) can only get HTTP access through a proxy
server. Some proxy servers require authentication. This header is a
proxy’s way of demanding authentication. It’s sent along with a
response code of 407 (“Proxy Authentication Required”), and it works
just like WWW-Authenticate
, except
it tells the client how to authenticate with the proxy, not with the
web server on the other end. While the response to a WWW-Authenticate
challenge goes into
Authorization
, the response to a
Proxy-Authenticate
challenge goes
into Proxy-Authorization
(see
below). A single request may need to include both Authorization
and Proxy-Authorization
headers: one to
authenticate with the web service, the other to authenticate with the
proxy.
Since most web services don’t include proxies in their architecture, this header is not terribly relevant to the kinds of services covered in this book. But it may be relevant to a client, if there’s a proxy between the client and the rest of the web.
Proxy-Authorization
Type: Request header.
Importance: Low to medium.
This header is an attempt to get a request through a proxy that demands authentication. It works similarly to Authorization. Its format depends on the scheme defined in Proxy-Authenticate, just as the format of Authorization depends on the scheme defined in WWW-Authenticate.
Range
Type: Request.
Importance: Medium.
This header signifies the client’s attempt to request only part of a resource’s representation (see “Partial GET” in Chapter 8). A client typically sends this header because it tried earlier to download a large representation and got cut off. Now it’s back for the rest of the representation. Because of this, this header is usually coupled with If-Unmodified-Since. If the representation has changed since your last request, you probably need to GET it from the beginning.
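A server that supports partial GET has to turn the value of Range into a slice of the representation. Here is a sketch under some assumptions of mine: it handles a single range in the “bytes” unit, returns None for anything else (where a real server might send the whole representation or an error), and its function name is hypothetical.

```python
def parse_byte_range(value, total_length):
    """Parse one 'bytes=first-last' range into inclusive (start, end) offsets."""
    unit, _, spec = value.partition("=")
    if unit.strip().lower() != "bytes" or "," in spec:
        return None  # this sketch handles a single byte range only
    first, _, last = spec.strip().partition("-")
    try:
        if first == "":
            # Suffix form, "bytes=-N": the final N bytes.
            length = int(last)
            if length <= 0:
                return None
            start = max(total_length - length, 0)
            end = total_length - 1
        else:
            start = int(first)
            end = int(last) if last else total_length - 1
    except ValueError:
        return None
    end = min(end, total_length - 1)
    if start > end:
        return None  # the range starts past the end of the representation
    return start, end


body = b"0123456789"
start, end = parse_byte_range("bytes=4-", len(body))
print(body[start:end + 1])  # the part the client missed the first time
```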
Referer
Type: Request header.
Importance: Low.
When you click a link in your web browser, the browser sends an HTTP request in
which the value of the Referer
header is the URI of the page you were just on. That’s the URI that
“refered” your client to the URI you’re now requesting. Yes, it’s
misspelled.
Though common on the human web, this header is rarely found on the programmable web. It can be used to convey a bit of application state (the client’s recent path through the service) to the server.
Retry-After
Type: Response header.
Importance: Low to medium.
This header usually comes with a response code that denotes failure: either 413 (“Request Entity Too Large”), or one of the 5xx series (“Server-side error”). It tells the client that while the server couldn’t fulfill the request right now, it might be able to fulfill the same request at a later time. The value of the header is the time when the client should try again, or the number of seconds it should wait.
If a server chooses every client’s Retry-After
value using the same rules, that
just guarantees the same clients will make the same requests in the
same order a little later, possibly causing the problem all over
again. The server should use some randomization technique to vary
Retry-After
, similar to Ethernet’s
backoff period.
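Both halves of this exchange are easy to sketch. The function names are made up, and the jitter here is a plain uniform random offset, which is simpler than Ethernet’s exponential backoff but serves the same purpose of spreading clients out.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_with_jitter(base_seconds, spread_seconds, rng=random):
    """Server side: pick a Retry-After value that differs from client to client."""
    return str(base_seconds + rng.randint(0, spread_seconds))


def seconds_to_wait(header_value, now=None):
    """Client side: turn a Retry-After value into a delay in seconds."""
    value = header_value.strip()
    if value.isdigit():
        return int(value)  # the header gave a number of seconds
    # Otherwise the header gave an HTTP-date to wait for.
    now = now or datetime.now(timezone.utc)
    delay = (parsedate_to_datetime(value) - now).total_seconds()
    return max(0, int(delay))


print(retry_after_with_jitter(30, 10))  # somewhere between 30 and 40
print(seconds_to_wait("120"))
```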
TE
Type: Request header.
Importance: Low.
This is another “Accept”-type header, one that lets the client specify which transfer
encodings it will accept (see Transfer-Encoding
below for an explanation
of transfer encodings). HTTP: The Definitive
Guide by Brian Totty and David Gourley (O’Reilly) points
out that a better name would have been
“Accept-Transfer-Encoding.”
In practice, the value of TE
only conveys whether or not the client understands chunked encoding
and HTTP trailers, two topics I don’t really cover in this
book.
Trailer
Type: Response header.
Importance: Low.
When a server sends an entity-body using chunked transfer encoding, it may choose to put certain HTTP headers at the end of the entity-body rather than before it (see below for details). This turns them from headers into trailers. The server signals that it’s going to send a header as a trailer by putting its name as the value of the header called Trailer. Here’s one possible value for Trailer:
Trailer: Content-MD5
The server will be providing a value for Content-MD5 once it’s served the entity-body and can compute a checksum of the bytes it served.
Transfer-Encoding
Type: Response.
Importance: Low.
Sometimes a server needs to send an entity-body without knowing important facts like how large it is. Rather than omitting HTTP headers like Content-MD5, the server may decide to send the entity-body in chunks, and put those headers at the end of the entity-body rather than before. The idea is that by the time all the chunks have been sent, the server knows the things it didn’t know before, and it can send Content-MD5 and the like as “trailers” instead of “headers.” Content-Length is the exception: the HTTP standard does not allow it as a trailer, because each chunk announces its own size.
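The chunked format itself is simple: each chunk is preceded by its size in hexadecimal, a zero-sized chunk marks the end, and any trailers follow. Here is a minimal sketch of an encoder; the function name and the trailer value are placeholders of mine.

```python
def encode_chunked(chunks, trailers=()):
    """Frame an iterable of byte strings using chunked transfer encoding."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would end the body early
        out += b"%x\r\n" % len(chunk)  # chunk size in hexadecimal
        out += chunk + b"\r\n"
    out += b"0\r\n"  # the last chunk has size zero
    for name, value in trailers:
        out += ("%s: %s\r\n" % (name, value)).encode("ascii")
    out += b"\r\n"  # a blank line ends the trailers and the message
    return bytes(out)


print(encode_chunked([b"Hello, ", b"world"]))
print(encode_chunked([b"Hello, ", b"world"], [("Content-MD5", "placeholder")]))
```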
It’s an HTTP 1.1 requirement that clients support chunked transfer-encoding, but I don’t know of any programmable clients (as opposed to web browsers) that do.
Upgrade
Type: Request header.
Importance: Very low.
If you’d rather be using some protocol other than HTTP, you can tell the server that by sending an Upgrade header. If the server happens to speak the protocol you’d rather be using, it will send back a response code of 101 (“Switching Protocols”) and immediately begin speaking the new protocol.
The header’s value is a list of the protocols you’d rather be using. There is no standard format for this list, but the sample Upgrade header from RFC 2616 shows what the designers of HTTP had in mind:
Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11
User-Agent
Type: Request header.
Importance: High.
This header lets the server know what kind of software is making the HTTP request. On the human web this is a string that identifies the brand of web browser. On the programmable web it usually identifies the HTTP library or client library that was used to write the client. It may identify a specific client program instead.
Soon after the human web became popular, servers started
sniffing User-Agent
to determine
what kind of browser was on the other end. They then sent different
representations based on the value of User-Agent
. Elsewhere in this book I’ve
voiced my opinion that it’s not a great idea to have request headers
like Accept-Language
be the only
way a client can distinguish between different representations of the
same resource. Sending different representations based on the value of
User-Agent
is an even worse idea.
Not only has User-Agent
sniffing
perpetuated incompatibilities between web browsers, it’s led to an
arms race inside the User-Agent
header itself.
Almost every browser these days pretends to be Mozilla, because
that was the internal code-name of the first web browser to become
popular (Netscape Navigator). A browser that doesn’t pretend to be
Mozilla may not get the representation it needs. Some pretend to be
both Mozilla and MSIE, so they can trigger code for the current most
popular web browser (Internet Explorer). A few browsers even allow the
user to select the User-Agent
for
every request, to trick servers into sending the right
representations.
Don’t let this happen to the programmable web. A web service
should only use User-Agent
to
gather statistics and to deny access to poorly-programmed clients. It
should not use User-Agent
to tailor
its representations to specific clients.
Vary
Type: Response header.
Importance: Low to medium.
The Vary
header tells
the client which request headers it can vary to get
different representations of a resource. Here’s a sample value:
Vary: Accept, Accept-Language
That value tells the client that it can ask for the
representation in a different file format, by setting or changing the
Accept
header. It can ask for the
representation in a different language, by setting or changing
Accept-Language
.
That value also tells a cache to cache (say) the Japanese
representation of the resource separately from the English
representation. The Japanese representation isn’t a brand new byte
stream that invalidates the cached English version. The two requests
sent different values for a header that varies (Accept-Language
), so the responses should be
cached separately. If the value of Vary
is “*”, that means that the response
should not be cached.
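One way a cache might act on this is to fold the varying request headers into its cache key. The sketch below is only an illustration: the function name is hypothetical, and it assumes the value of Vary is a comma-separated list of header names.

```python
def cache_key(uri, request_headers, vary_value):
    """Build a cache key from the URI plus the request headers named in Vary.

    Returns None when Vary is "*", meaning the response should not be cached.
    """
    if vary_value.strip() == "*":
        return None
    lowered = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary_value.split(",") if n.strip())
    return (uri,) + tuple((n, lowered.get(n, "")) for n in names)


vary = "Accept, Accept-Language"
english = cache_key("/releases/104", {"Accept-Language": "en"}, vary)
japanese = cache_key("/releases/104", {"Accept-Language": "ja"}, vary)
print(english != japanese)  # the two representations get separate cache entries
print(cache_key("/releases/104", {}, "*"))
```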
Via
Type: Request and response header.
Importance: Low.
When an HTTP request goes directly from the client to the server, or a response goes directly from server to client, there is no Via header. When there are intermediaries (like proxies) in the way, each one slaps a Via header on the request or response message. The recipient of the message can look at the Via headers to see the path the HTTP message took through the intermediaries.
Warning
Type: Response header (can technically be used with requests).
Importance: Low.
The Warning
header is a
supplement to the HTTP response code. It’s usually inserted by an
intermediary like a caching proxy, to tell the user about possible
problems that aren’t obvious from looking at the response.
Like response codes, each HTTP warning has a three-digit numeric
value: a “warn-code.” Most warnings have to do with cache behavior.
This Warning
says that the caching
proxy at localhost:9090
sent a
cached response even though it knew the response to be stale:
Warning: 110 localhost:9090 Response is stale
The warn-code 110 means “Response is stale” as surely as the HTTP response code 404 means “Not Found.” The HTTP standard defines seven warn-codes, which I won’t go into here.
WWW-Authenticate
Type: Response header.
Importance: Very high.
This header accompanies a response code of 401 (“Unauthorized”). It’s the server’s demand that the client send some authentication next time it requests the URI. It also tells the client what kind of authentication the server expects. This may be HTTP Basic auth, HTTP Digest auth, or something more exotic like WSSE.
Nonstandard Headers
Many, many new HTTP headers have been created over the years, most using
the X-
extension. These have not gone
through the process to be made official parts of HTTP, but in many cases
they have gone through other standardization processes. I’m going to
present just a few of the nonstandard headers that are most important to
web services.
Cookie
Type: Request header.
Importance: High on the human web, low on the programmable web.
This is probably the second-most-famous HTTP header, after Content-Type
, but it’s not in the HTTP
standard; it’s a Netscape extension.
A cookie is an agreement between the client and the server where
the server gets to store some semipersistent state on the client side
using the Set-Cookie
header (see
below). Once the client gets a cookie, it’s expected to return it with
every subsequent HTTP request to that server, by setting the Cookie
header once for each of its cookies.
Since the data is sent invisibly in the HTTP headers with every
request, it looks like the client and server are sharing state.
Cookies have a bad reputation in REST circles for two reasons. First, the “state” they contain is often just a session ID: a short alphanumeric key that ties into a much larger data structure on the server. This destroys the principle of statelessness. More subtly, once a client accepts a cookie it’s supposed to submit it with all subsequent requests for a certain time. The server is telling the client that it can no longer make the requests it made precookie. This also violates the principle of statelessness.
If you must use cookies, make sure you store all the state on the client side. Otherwise you’ll lose a lot of the scalability benefits of REST.
POE
Type: Request header.
Importance: Medium.
The POE
header is sent by a
client who wants a URI they can use in a Post Once
Exactly request. I covered POE in Chapter 9.
POE-Links
Type: Response header.
Importance: Medium.
The POE-Links
header
is sent in response to a request that included a
POE
header. It gives one or more
URIs the client can POST to. Each listed URI will respond to POST
exactly once.
Set-Cookie
Type: Response header.
Importance: High on the human web, low on the programmable web.
This is an attempt on the server’s part to set some
semipersistent state in a cookie on the client side. The client is
supposed to send an appropriate Cookie
header with all future requests,
until the cookie’s expiration date. The client may ignore this header
(and on the human web, that’s often a good idea), but there’s no
guarantee that future requests will get a good response unless they
provide the Cookie
header. This
violates the principle of statelessness.
Slug
Type: Request header.
Importance: Fairly high, but only in APP applications.
The Slug header is defined by the Atom Publishing Protocol as a way for a client to specify a title for a binary document when it POSTs that document to a collection. See “Binary documents as APP members” in Chapter 9 for an example.
X-HTTP-Method-Override
Type: Request header.
Importance: Low to medium.
Some web services support this header as a way of making PUT, DELETE, and other requests using overloaded POST. The idea is to accommodate clients that don’t support or can’t use the real HTTP methods. Such a client would use POST and put the “real” HTTP method in this header. If you’re designing a service and want to support this feature, I recommend putting the “real” HTTP method in the URI’s query string. See “Faking PUT and DELETE” in Chapter 8 for more details.
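A service that honors this header needs a small piece of dispatch logic along these lines. The function name and the whitelist of overridable methods are assumptions of mine; the important design choice is that only POST can be overloaded, so a safe GET can never be turned into a DELETE.

```python
ALLOWED_OVERRIDES = {"PUT", "DELETE"}  # hypothetical whitelist for this sketch


def effective_method(method, headers):
    """Return the HTTP method the client really meant."""
    if method.upper() != "POST":
        return method.upper()  # only POST may be overloaded
    lowered = {name.lower(): value for name, value in headers.items()}
    override = lowered.get("x-http-method-override", "").strip().upper()
    return override if override in ALLOWED_OVERRIDES else "POST"


print(effective_method("POST", {"X-HTTP-Method-Override": "DELETE"}))
print(effective_method("GET", {"X-HTTP-Method-Override": "DELETE"}))
```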
X-WSSE
Type: Request header.
Importance: Medium.
This is a custom header defined by the WSSE Username Token standard I described in Chapter 8. It’s sent in conjunction with the Authorization header, and it contains the actual WSSE credentials. Why did the WSSE designers create a separate header that goes along with Authorization, instead of just using Authorization? Because WSSE was designed to be processed by CGI programs rather than by web servers.
When a web server invokes a CGI program, it doesn’t pass in the contents of the Authorization header. Web servers think they’re in charge of HTTP authentication. They don’t understand Authorization: WSSE profile="UsernameToken", so they ignore it, and assume there’s no authentication required. The Authorization header never makes it into the CGI program. But the CGI standard requires that web servers pass on the values of any X- headers. The X-WSSE header is a way of smuggling authentication credentials through a web server that doesn’t understand what they mean.