The Basic Structure
The top level of an RSS 2.0 document is the rss version="2.0" element. This is followed by a single channel element. The channel element contains the entire feed contents and all associated metadata.
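As a minimal sketch of the structure just described, the following Python uses the standard library's xml.etree.ElementTree to confirm that a document has an rss root with version="2.0" and exactly one channel. The function name find_channel and the SAMPLE document are illustrative assumptions, not part of the specification.

```python
import xml.etree.ElementTree as ET


def find_channel(feed_xml: str) -> ET.Element:
    """Return the single channel element of an RSS 2.0 document.

    Raises ValueError if the top-level structure is not the one described
    in the text (an rss version="2.0" root holding one channel).
    """
    root = ET.fromstring(feed_xml)
    if root.tag != "rss" or root.get("version") != "2.0":
        raise ValueError('top-level element must be <rss version="2.0">')
    channels = root.findall("channel")
    if len(channels) != 1:
        raise ValueError(f"expected exactly one channel, found {len(channels)}")
    return channels[0]


# Hypothetical sample document, not taken from a real feed.
SAMPLE = '<rss version="2.0"><channel><title>Example</title></channel></rss>'
channel = find_channel(SAMPLE)
print(channel.findtext("title"))
```

Everything else a feed application reads (metadata and items alike) can then be looked up relative to the returned channel element.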
Required Channel Subelements
There are 3 required and 16 optional subelements of channel within RSS 2.0. Here are the required subelements:
-
link
A URL pointing to the associated resource, usually a web site. The link must use an IANA-registered URI scheme, such as http://, https://, news://, or ftp://, though it isn’t necessary for an application developer to support all of these by default. The most common by a large margin is http://. For example:
<link>http://www.benhammersley.com</link>
Although it isn’t explicitly stated in the specification, it is highly recommended that you do not put anything other than plain text in the channel/title or channel/description elements. There are some existing feeds with HTML within those elements, but these cause a considerable amount of wailing, and at least a small amount of gnashing of teeth. Do not do it. Use plain text only in these elements. The following sidebar, “Including HTML Within title or description,” gives a fuller account of this, but in my opinion it’s a bad idea.
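The two recommendations above can be checked mechanically. The sketch below assumes the three required channel subelements are title, link, and description, as in the RSS 2.0 specification, and uses a deliberately crude regular expression to spot markup in an element's text. The names missing_required and contains_markup, and the sample channel, are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Required channel subelements, per the RSS 2.0 specification.
REQUIRED = ("title", "link", "description")

# Heuristic only: anything that looks like an opening or closing tag.
TAG_PATTERN = re.compile(r"<[A-Za-z/][^>]*>")


def missing_required(channel: ET.Element) -> list:
    """Names of required subelements that are absent or empty."""
    return [name for name in REQUIRED
            if not (channel.findtext(name) or "").strip()]


def contains_markup(element: ET.Element) -> bool:
    """True if the element has child elements, or if its text still
    looks like a tag after the XML parser has decoded entities."""
    return len(element) > 0 or bool(TAG_PATTERN.search(element.text or ""))


# Hypothetical channel with entity-encoded HTML in the title
# and no description at all.
channel = ET.fromstring(
    "<channel>"
    "<title>My &lt;b&gt;Bold&lt;/b&gt; Feed</title>"
    "<link>http://example.org/</link>"
    "</channel>"
)
print(missing_required(channel))
print(contains_markup(channel.find("title")))
```

A check like this is best treated as a warning for feed authors rather than a reason for an aggregator to reject a feed.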
Optional Channel Subelements
There are 16 optional channel subelements of RSS 2.0. Technically speaking, you can leave these out altogether. However, I encourage you to add as many as you can. Much of this stuff is static; the content of the element never changes. Placing it into your RSS template or adding another line to a script is little work for the additional value of your feed’s metadata. This is especially true for the first three subelements listed here:
-
pubDate
The publication date of the content within the feed. For example, a daily morning newspaper publishes at a certain time early every morning. Technically, any information in the feed should not be displayed until after the publication date, so you can set pubDate to a time in the future and expect that the feed won’t be displayed until after that time. Few existing RSS readers take any notice of this element in this way, however. Nevertheless, it should be in the format outlined in RFC 822:
<pubDate>Sun, 12 Sep 2004 19:00:40 GMT</pubDate>
-
lastBuildDate
The date and time, RFC 822-style, when the feed last changed. Note the difference between this and channel/pubDate. lastBuildDate must be in the past. It is this element that feed applications should take as the “last time updated” value, and not channel/pubDate.
<lastBuildDate>Sun, 12 Sep 2004 19:01:55 GMT</lastBuildDate>
-
category
Identical in syntax to the item/category element you’ll see later. This takes one optional attribute, domain. The value of category should be a forward-slash-separated string that identifies a hierarchical location in a taxonomy represented by the domain attribute. Sadly, there is no consensus either within the specification or in the real world as to any standard format for the domain attribute. It would seem most sensible to restrict it to a URL; however, it needn’t necessarily be so.
<category domain="Syndic8">1765</category>
-
docs
A URL that points to an explanation of the standard for future reference. This should point to http://blogs.law.harvard.edu/tech/rss:
<docs>http://blogs.law.harvard.edu/tech/rss</docs>
-
cloud
The <cloud/> element enables a rarely used feature known as “Publish and Subscribe,” which we shall investigate fully in Chapter 9. It takes no value itself, but it has five mandatory attributes, themselves also explained in Chapter 9: domain, path, port, registerProcedure, and protocol.
<cloud domain="rpc.sys.com" port="80" path="/RPC2" registerProcedure="pingMe" protocol="soap"/>
-
ttl
ttl, short for Time-to-Live, should contain a number, which is the minimum number of minutes the reader should wait before refreshing the feed from its source. Feed authors should adjust this figure to reflect the time between updates and the number of times they wish their feed to be requested, versus how up to date they need their consumers to be.
<ttl>60</ttl>
-
image
This describes a feed’s accompanying image. It’s optional, but many aggregators look prettier if you include one. It has three required and two optional subelements of its own:
-
url
The URL of a GIF, JPG, or PNG image that corresponds to the feed. It is, quite obviously, required.
-
title
A description of the image, normally used within the ALT attribute of HTML’s <img> tag. It is required.
-
link
The URL to which the image should be linked. This is usually the same as the channel/link.
-
width and height
The width and height of the icon, in pixels. The icons should be a maximum of 144 pixels wide by 400 pixels high. The emergent standard is 88 pixels wide by 31 pixels high. Both elements are optional.
<image>
  <title>RSS2.0 Example</title>
  <url>http://www.exampleurl.com/example/images/logo.gif</url>
  <link>http://www.exampleurl.com/example/index.html</link>
  <width>88</width>
  <height>31</height>
  <description>The World's Leading Technical Publisher</description>
</image>
-
rating
The PICS rating for the feed; it helps parents and teachers control what children access on the Internet. More information on PICS can be found at http://www.w3.org/PICS/. This labeling scheme is little used at present, but an example of a PICS rating would be:
<rating>(PICS-1.1 "http://www.gcf.org/v2.5" labels on "1994.11.05T08:15-0500" until "1995.12.31T23:59-0000" for "http://w3.org/PICS/Overview.html" ratings (suds 0.5 density 0 color/hue 1))</rating>
-
textInput
An element that lets RSS feeds display a small text box and Submit button, and associates them with a CGI application. Many RSS parsers support this feature, and many sites use it to offer archive searching or email newsletter sign-ups, for example. textInput has four required subelements:
-
title
The label for the Submit button. It can have a maximum of 100 characters.
-
description
Text to explain what the textInput actually does. It can have a maximum of 500 characters.
-
name
The name of the text object that is passed to the CGI script. It can have a maximum of 20 characters.
-
link
The URL of the CGI script.
<textInput>
  <title>Search</title>
  <description>Search the Archives</description>
  <name>query</name>
  <link>http://www.exampleurl.com/example/search.cgi</link>
</textInput>
-
skipDays and skipHours
A set of elements that can control when a client retrieves the feed. skipDays can contain up to seven day subelements: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, or Sunday. skipHours contains up to 24 hour subelements, the numbers 0-23, representing the hour in Greenwich Mean Time (GMT). The client should not retrieve the feed during any day or hour listed within these two elements. The elements are ORed not ANDed: in the example here, the application is instructed not to request the feed during 8 p.m. on any day, and never on a Monday:
<skipDays><day>Monday</day></skipDays>
<skipHours><hour>20</hour></skipHours>
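The ORed behavior of skipDays and skipHours is easy to get wrong, so here is a minimal Python sketch of how a client might apply it. The function name may_fetch and the sample channel are hypothetical. The code assumes hour values are GMT hours, and it folds a value of 24 to 0 so that feeds numbering hours either way are tolerated; that folding is an assumption, not something the text prescribes.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Index positions match datetime.weekday(), where Monday is 0.
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def may_fetch(channel: ET.Element, now: datetime) -> bool:
    """Return False if `now` falls on a skipped day OR in a skipped hour.

    `now` should be a timezone-aware datetime; it is converted to GMT/UTC
    before comparison.
    """
    now = now.astimezone(timezone.utc)
    skip_days = {(d.text or "").strip()
                 for d in channel.findall("skipDays/day")}
    # Assumption: an hour of 24 is treated as midnight (0).
    skip_hours = {int(h.text) % 24
                  for h in channel.findall("skipHours/hour")
                  if h.text and h.text.strip().isdigit()}
    skipped = DAY_NAMES[now.weekday()] in skip_days or now.hour in skip_hours
    return not skipped


# The same values as the example in the text.
CHANNEL = ET.fromstring(
    "<channel>"
    "<skipDays><day>Monday</day></skipDays>"
    "<skipHours><hour>20</hour></skipHours>"
    "</channel>"
)

# 13 Sep 2004 was a Monday; 14 Sep 2004 was a Tuesday.
print(may_fetch(CHANNEL, datetime(2004, 9, 13, 10, 0, tzinfo=timezone.utc)))
print(may_fetch(CHANNEL, datetime(2004, 9, 14, 20, 30, tzinfo=timezone.utc)))
print(may_fetch(CHANNEL, datetime(2004, 9, 14, 19, 59, tzinfo=timezone.utc)))
```

A real client would probably combine this with ttl, waiting at least that many minutes since the last successful fetch before asking again.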
item Elements
RSS 2.0 can have any number of item elements. The item element is at the heart of RSS; it contains the primary content of the feed. Technically, item elements are optional, but a syndication feed with no items is just a glorified link. Not having any items doesn’t mean the feed is invalid, just extremely boring.
All item subelements are optional, with the proviso that at least one of item/title or item/description is present. You can use this feature to build lists (more on that later).
There are 10 standard item subelements available:
-
title
Usually, this is the title of the story linked to by the item, but it can also be seen as a one-line list item. There is controversy over whether HTML is allowed within this element; for more information, see Sidebar 4-1.
-
link
The URL of the story the item is describing.
-
description
A synopsis of the story. The description can contain entity-encoded HTML. Again, as with item/title, see the pertinent Sidebar 4-1.
-
author
This should contain the email address of the author of the resource referred to within the item. The specification’s example is in the format user@example.com (firstname lastname) but isn’t explained further:
<author>ben@benhammersley.com (Ben Hammersley)</author>
-
enclosure
This describes a file associated with an item. It has no content, but it takes three attributes: url is the URL of the enclosure, length is its size in bytes, and type is the standard MIME type for the enclosure. Some feed applications can download these files automatically. The original idea was for configuring a feed aggregator to automatically download large media files overnight, thereby deferring the extra bandwidth required. This is an underused feature of RSS 2.0 because most aggregators don’t support it, but in 2004, it became the focus of a lot of development around the idea of podcasting. See Sidebar 4-1 for details.
<enclosure url="http://www.example.com/hotxxxpron.mpg" length="34657834" type="video/mpeg"/>
-
guid
Standing for Globally Unique Identifier, this element should contain a string that uniquely identifies the item. It must never change, and it must be unique to the object it is describing. If that content changes in any way, it must gain a new guid. This element also has the optional attribute isPermaLink, which, if true, denotes that the value of the element can be taken as a URL to the object referred to by the item. Therefore, if no item/link element is present, but the isPermaLink attribute is set to true, the application can take the value of guid in its place. The specification doesn’t say what to do if both are present and aren’t the same, but it seems sensible to give preference within any application to the item/link element.
<guid isPermaLink="true">http://www.example.com/example.html</guid>
-
pubDate
The publication date of the item. Again, as with channel/pubDate, any information in the item shouldn’t be displayed until after the publication date, but few existing RSS readers take any notice of this element in this way. The date is in RFC 822 format.
<pubDate>Mon, 13 Sep 2004 00:23:05 GMT</pubDate>
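The relationship between item/link and guid described in the list above can be sketched in a few lines of Python. The function name item_url is hypothetical. The attribute is spelled isPermaLink, as in Example 4-1, and the code assumes that a guid without the attribute is treated as a permalink, which is the default given in the RSS 2.0 specification. Preferring item/link when both are present follows the suggestion in the text rather than any rule in the specification.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def item_url(item: ET.Element) -> Optional[str]:
    """Pick the URL an application should use for an item.

    item/link wins when present; otherwise a guid marked as a permalink
    is used; otherwise there is no URL to offer.
    """
    link = (item.findtext("link") or "").strip()
    if link:
        return link
    guid = item.find("guid")
    if guid is None or not (guid.text or "").strip():
        return None
    # Assumption: a missing isPermaLink attribute means "true".
    if guid.get("isPermaLink", "true").strip().lower() == "true":
        return guid.text.strip()
    return None


# Hypothetical items for illustration.
only_guid = ET.fromstring(
    '<item><guid isPermaLink="true">http://www.example.com/a.html</guid></item>'
)
both = ET.fromstring(
    "<item><link>http://www.example.com/b.html</link>"
    '<guid isPermaLink="true">http://www.example.com/other.html</guid></item>'
)
opaque = ET.fromstring('<item><guid isPermaLink="false">tag:example.com,2004:1</guid></item>')

print(item_url(only_guid))
print(item_url(both))
print(item_url(opaque))
```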
Example 4-1 shows these parts assembled into an RSS 2.0 XML document.
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>RSS2.0Example</title>
    <link>http://www.exampleurl.com/example/index.html</link>
    <description>This is an example RSS 2.0 feed</description>
    <language>en-gb</language>
    <copyright>Copyright 2002, O'Reilly and Associates.</copyright>
    <managingEditor>example@exampleurl.com</managingEditor>
    <webMaster>webmaster@exampleurl.com</webMaster>
    <rating> </rating>
    <pubDate>Wed, 03 Apr 2002 15:00:00 GMT</pubDate>
    <lastBuildDate>Wed, 03 Apr 2002 15:00:00 GMT</lastBuildDate>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <skipDays><day>Monday</day></skipDays>
    <skipHours><hour>20</hour></skipHours>
    <category domain="http://www.dmoz.org">Business/Industries/Publishing/Publishers/Nonfiction/Business/O'Reilly_and_Associates/</category>
    <generator>NewsAggregator'o'Matic</generator>
    <ttl>30</ttl>
    <cloud domain="http://www.exampleurl.com" port="80" path="/RPC2" registerProcedure="pleaseNotify" protocol="XML-RPC" />
    <image>
      <title>RSS2.0 Example</title>
      <url>http://www.exampleurl.com/example/images/logo.gif</url>
      <link>http://www.exampleurl.com/example/index.html</link>
      <width>88</width>
      <height>31</height>
      <description>The World's Leading Technical Publisher</description>
    </image>
    <textInput>
      <title>Search</title>
      <description>Search the Archives</description>
      <name>query</name>
      <link>http://www.exampleurl.com/example/search.cgi</link>
    </textInput>
    <item>
      <title>The First Item</title>
      <link>http://www.exampleurl.com/example/001.html</link>
      <description>This is the first item.</description>
      <source url="http://www.anothersite.com/index.xml">Another Site</source>
      <enclosure url="http://www.exampleurl.com/example/001.mp3" length="543210" type="audio/mpeg"/>
      <category domain="http://www.dmoz.org">Business/Industries/Publishing/Publishers/Nonfiction/Business/O'Reilly_and_Associates/</category>
      <comments>http://www.exampleurl.com/comments/001.html</comments>
      <author>Ben Hammersley</author>
      <pubDate>Tue, 01 Jan 2002 00:00:01 GMT</pubDate>
      <guid isPermaLink="true">http://www.exampleurl.com/example/001.html</guid>
    </item>
    <item>
      <title>The Second Item</title>
      <link>http://www.exampleurl.com/example/002.html</link>
      <description>This is the second item.</description>
      <source url="http://www.anothersite.com/index.xml">Another Site</source>
      <enclosure url="http://www.exampleurl.com/example/002.mp3" length="543210" type="audio/mpeg"/>
      <category domain="http://www.dmoz.org">Business/Industries/Publishing/Publishers/Nonfiction/Business/O'Reilly_and_Associates/</category>
      <comments>http://www.exampleurl.com/comments/002.html</comments>
      <author>Ben Hammersley</author>
      <pubDate>Wed, 02 Jan 2002 00:00:01 GMT</pubDate>
      <guid isPermaLink="true">http://www.exampleurl.com/example/002.html</guid>
    </item>
    <item>
      <title>The Third Item</title>
      <link>http://www.exampleurl.com/example/003.html</link>
      <description>This is the third item.</description>
      <source url="http://www.anothersite.com/index.xml">Another Site</source>
      <enclosure url="http://www.exampleurl.com/example/003.mp3" length="543210" type="audio/mpeg"/>
      <category domain="http://www.dmoz.org">Business/Industries/Publishing/Publishers/Nonfiction/Business/O'Reilly_and_Associates/</category>
      <comments>http://www.exampleurl.com/comments/003.html</comments>
      <author>Ben Hammersley</author>
      <pubDate>Thu, 03 Jan 2002 00:00:01 GMT</pubDate>
      <guid isPermaLink="true">http://www.exampleurl.com/example/003.html</guid>
    </item>
  </channel>
</rss>
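The pubDate and lastBuildDate values in a feed are RFC 822 dates, and the text says content should not be displayed before its publication date. As a hedged sketch, Python's standard email.utils.parsedate_to_datetime can parse this format, which makes that comparison straightforward. The helper names parse_rss_date and is_published are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_rss_date(value: str) -> datetime:
    """Parse an RFC 822 style date such as a pubDate value.

    For a date ending in GMT or a numeric offset, the result is a
    timezone-aware datetime.
    """
    return parsedate_to_datetime(value.strip())


def is_published(pub_date: str, now: datetime) -> bool:
    """True once `now` has reached the publication date.

    `now` must be timezone-aware so it can be compared with the
    parsed value.
    """
    return parse_rss_date(pub_date) <= now


published = parse_rss_date("Sun, 12 Sep 2004 19:00:40 GMT")
print(published.isoformat())

an_hour_before = datetime(2004, 9, 12, 18, 0, 40, tzinfo=timezone.utc)
an_hour_after = datetime(2004, 9, 12, 20, 0, 40, tzinfo=timezone.utc)
print(is_published("Sun, 12 Sep 2004 19:00:40 GMT", an_hour_before))
print(is_published("Sun, 12 Sep 2004 19:00:40 GMT", an_hour_after))
```

Feeds in the wild often carry malformed dates, so a real application would want to catch the parser's exceptions and fall back to treating the item as already published.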
The Simplest Possible RSS 2.0 Feed
This, really, is the key to the success of RSS 2.0. The simplest thing you need to do to make the feed validate is very uncomplicated indeed (see Example 4-2). While this isn’t any help when you are trying to convey complex information, as with RSS 1.0, or if you’re trying to build a complete document-centric system, as with Atom, it is very useful for many other applications.
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>The Simplest Feed</title>
    <link>http://example.org/index.html</link>
    <description>The Simplest Possible RSS 2.0 Feed</description>
    <item>
      <description>Simple Simple Simple</description>
    </item>
  </channel>
</rss>
Chapter 10 describes many useful applications that take this minimalist approach to using RSS 2.0-compliant feeds.
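Because so little is required, a minimal feed like Example 4-2 can be generated with a few lines of code. The sketch below uses Python's standard xml.etree.ElementTree; the function name simplest_feed and its parameters are illustrative assumptions, and the output omits the XML declaration that the example carries.

```python
import xml.etree.ElementTree as ET


def simplest_feed(title: str, link: str, description: str, item_text: str) -> str:
    """Build a minimal RSS 2.0 document: the three channel elements
    plus a single item that carries only a description."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    item = ET.SubElement(channel, "item")
    ET.SubElement(item, "description").text = item_text
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(rss, encoding="unicode")


feed = simplest_feed(
    "The Simplest Feed",
    "http://example.org/index.html",
    "The Simplest Possible RSS 2.0 Feed",
    "Simple Simple Simple",
)
print(feed)
```

Building the tree through an XML library, rather than concatenating strings, means characters such as & and < in titles or descriptions are escaped automatically.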