Introducing Modules
Modules are additional sets of elements, giving the feed a greater range of expression: they allow the specification to be extended without actually being changed, which is a very clever trick. You can make your own module match any data you might wish to syndicate. Admittedly, most aggregators will ignore it, but your own applications can take advantage of it. And, happily, the most popular modules are increasingly being supported by the latest aggregators as a matter of course.
Modules in RSS, both Versions 2.0 and 1.0, are created with a system known as XML Namespaces. Namespaces are the XML solution to the classic language problem of one word meaning two things in different contexts. Take “Windows,” for example. In the context of houses, “windows” are holes in the wall through which we can look. In the context of computers, “Windows” is a trademark of the Microsoft Corporation and refers to its range of operating systems. The context within which the name has a particular meaning is called its namespace.
In XML, you can distinguish between the two meanings by assigning a namespace and placing the namespace’s name in front of the element name, separated by a colon, like this:
<computing:windows>This is an operating system</computing:windows>
<building:windows>This is a hole in a wall</building:windows>
Namespaces solve two problems. First, they allow you to distinguish between different meanings for words that are spelled the same way, which means you can use words more than once for different meanings. Second, they allow you to group together words that are related to each other; for example, using a computer to look through an XML document for all elements with a certain namespace is easy.
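As a rough illustration of that second point, the following Python sketch collects every element belonging to one namespace. The two namespace URIs, the glossary wrapper element, and the extra computing:mouse entry are all hypothetical, invented for this example; the only library used is the standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for this illustration.
COMPUTING = "http://example.com/ns/computing"
BUILDING = "http://example.com/ns/building"

# The <glossary> wrapper and the <computing:mouse> entry are assumptions too.
DOC = f"""<glossary xmlns:computing="{COMPUTING}"
          xmlns:building="{BUILDING}">
  <computing:windows>This is an operating system</computing:windows>
  <building:windows>This is a hole in a wall</building:windows>
  <computing:mouse>This is a pointing device</computing:mouse>
</glossary>"""


def elements_in_namespace(root, uri):
    """Return every element under root whose name belongs to the namespace uri."""
    # ElementTree stores qualified names in the form "{uri}localname".
    marker = "{" + uri + "}"
    return [el for el in root.iter() if el.tag.startswith(marker)]


root = ET.fromstring(DOC)
computing_terms = [el.tag.split("}")[1] for el in elements_in_namespace(root, COMPUTING)]
building_texts = [el.text for el in elements_in_namespace(root, BUILDING)]
print(computing_terms)
print(building_texts)
```

Both windows elements share a local name, yet the parser keeps them apart because each carries its own namespace URI.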
Both RSS 1.0 and 2.0 use namespaces to allow for modularization. This modularization means that developers can add new features to RSS documents without changing the core specification.
Modularization has great advantages over the older RSS 0.9x’s method for including new elements. For starters, anyone can create a module: there are no standards issues or any need for approval, aside from making sure that the namespace URI you use has not been used before. And, it means both RSS 1.0 and 2.0 are potentially far more powerful than RSS 0.9x ever was.
A module works in the actual RSS document by declaring a namespace within the root element of the feed and by prefixing the element’s names with that namespace prefix, like so:
<?xml version="1.0"?>
<rss version="2.0" xmlns:blogChannel="http://backend.userland.com/blogChannelModule">
...
<blogChannel:blink>http://www.benhammersley.com</blogChannel:blink>
...
You should note that it is the URI in the namespace declaration, and not the namespace prefix, that uniquely identifies the namespace. In other words, from the perspective of a program processing XML, this:
<?xml version="1.0"?>
<rss version="2.0" xmlns:blogChannel="http://backend.userland.com/blogChannelModule">
...
<blogChannel:blink>http://www.benhammersley.com</blogChannel:blink>
...
is absolutely identical to this:
<?xml version="1.0"?>
<rss version="2.0" xmlns:bingbangbong="http://backend.userland.com/blogChannelModule">
...
<bingbangbong:blink>http://www.benhammersley.com</bingbangbong:blink>
...
This will become clear as we study some common modules. It is customary, and also very good manners, to have documentation for the module to be found at the namespace’s URI, but this isn’t technically necessary. As discussed in Chapter 11, the different feed standards have different scopes for the form this documentation can take. The presence of anything at all at the namespace URI is entirely optional, both in terms of RSS and within the scope of the broader XML specification itself.
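The claim that the prefix is irrelevant is easy to check for yourself. The sketch below, which uses only Python's standard xml.etree.ElementTree, parses two feeds that differ solely in their prefix and compares the qualified names the parser reports. The channel wrapper is an assumption standing in for the elided parts of the fragments above.

```python
import xml.etree.ElementTree as ET

# The <channel> wrapper is assumed; the fragments in the text elide it.
FEED_A = """<?xml version="1.0"?>
<rss version="2.0"
     xmlns:blogChannel="http://backend.userland.com/blogChannelModule">
  <channel>
    <blogChannel:blink>http://www.benhammersley.com</blogChannel:blink>
  </channel>
</rss>"""

# Same namespace URI, different prefix.
FEED_B = """<?xml version="1.0"?>
<rss version="2.0"
     xmlns:bingbangbong="http://backend.userland.com/blogChannelModule">
  <channel>
    <bingbangbong:blink>http://www.benhammersley.com</bingbangbong:blink>
  </channel>
</rss>"""


def qualified_names(xml_text):
    """List (name, text) pairs for every namespaced element in the document."""
    root = ET.fromstring(xml_text)
    # Namespaced tags are reported as "{uri}localname"; the prefix is discarded.
    return [(el.tag, el.text) for el in root.iter() if el.tag.startswith("{")]


names_a = qualified_names(FEED_A)
names_b = qualified_names(FEED_B)
print(names_a == names_b)
```

Once parsed, neither document retains any trace of the prefix its author chose; only the URI and the local name survive.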
blogChannel Module
Designed by Dave Winer only a week after he formalized RSS 2.0, the blogChannel module allows the inclusion of data used by weblogging applications and, specifically, the newer generation of aggregating and filtering systems.
It consists of three optional elements, all of which are subelements of channel and have the following namespace declaration:
xmlns:blogChannel="http://backend.userland.com/blogChannelModule"
The elements are:
-
blogChannel:blogRoll
Contains a literal string that is the URL of an OPML file containing the blogroll for the site. A blogroll is the list of blogs the blog author habitually reads.
-
blogChannel:blink
Contains a literal string that is the URL of a site the blog author recommends that the reader visit.
-
blogChannel:mySubscriptions
Contains a literal string that is the URL of the OPML file containing the URLs of the RSS feeds to which the blog author is subscribed in her desktop reader.
Example 4-4 shows the beginning of an RSS 2.0 feed using the blogChannel module.
<?xml version="1.0"?>
<rss version="2.0" xmlns:blogChannel="http://backend.userland.com/blogChannelModule">
  <channel>
    <title>RSS2.0Example</title>
    <link>http://www.exampleurl.com/example/index.html</link>
    <description>This is an example RSS 2.0 feed</description>
    <blogChannel:blogRoll>http://www.exampleurl.com/blogroll.opml</blogChannel:blogRoll>
    <blogChannel:blink>http://www.benhammersley.com</blogChannel:blink>
    <blogChannel:mySubscriptions>http://www.exampleurl.com/mySubscriptions.opml</blogChannel:mySubscriptions>
    ...
Creative Commons Module
Also designed by Dave Winer, the Creative Commons module allows RSS 2.0 feeds to specify which Creative Commons license applies to them. The Creative Commons organization, http://creativecommons.org/, offers a variety of content licenses that allow feed publishers to release content under more flexible copyright restrictions than previously available. Feed consumers can consult the license to see how they can reuse the content for their own work.
The module consists of only one element, creativeCommons:license, which contains the URL of the Creative Commons license on the Creative Commons site. The element can apply to either the complete channel or an individual item. It has the following namespace declaration:
xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"
In action, it looks like Example 4-5.
<rss version="2.0" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">
  <channel>
    <title>Creative Commons Example</title>
    <link>http://www.example.com/</link>
    <creativeCommons:license>http://www.creativecommons.org/licenses/by-nd/1.0</creativeCommons:license>
    ...
    <item>
      <description>blah blah blah</description>
      <creativeCommons:license>http://www.creativecommons.org/licenses/by-nc/1.0</creativeCommons:license>
    </item>
    ...
Note that a creativeCommons:license element on an item overrides the same on the channel for that item.
More details can be found at http://backend.userland.com/creativeCommonsRssModule.
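A consuming application might apply the override rule along the following lines. This is only a sketch using Python's standard xml.etree.ElementTree; the second, unlicensed item and the effective_licenses helper are hypothetical additions made to show the fallback to the channel license.

```python
import xml.etree.ElementTree as ET

# Qualified name of creativeCommons:license as ElementTree reports it.
CC_LICENSE = "{http://backend.userland.com/creativeCommonsRssModule}license"

# Based on Example 4-5; the second item is an assumption added for the demo.
FEED = """<rss version="2.0"
     xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">
  <channel>
    <title>Creative Commons Example</title>
    <link>http://www.example.com/</link>
    <creativeCommons:license>http://www.creativecommons.org/licenses/by-nd/1.0</creativeCommons:license>
    <item>
      <description>first item, with its own license</description>
      <creativeCommons:license>http://www.creativecommons.org/licenses/by-nc/1.0</creativeCommons:license>
    </item>
    <item>
      <description>second item, no license of its own</description>
    </item>
  </channel>
</rss>"""


def effective_licenses(feed_text):
    """Return the license URL in force for each item, in document order."""
    channel = ET.fromstring(feed_text).find("channel")
    # findtext() with a bare tag matches direct children only, so this is
    # the channel-level license and never one nested inside an item.
    channel_license = channel.findtext(CC_LICENSE)
    result = []
    for item in channel.findall("item"):
        item_license = item.findtext(CC_LICENSE)
        # An item-level license overrides the channel-level one.
        chosen = item_license if item_license is not None else channel_license
        result.append(chosen.strip() if chosen else None)
    return result


licenses = effective_licenses(FEED)
print(licenses)
```

The first item keeps its own by-nc license, while the second inherits the channel's by-nd license.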
Simple Semantic Resolution Module
One of the never-ending arguments within the RSS world is that between the pro- and anti-RDF camps. The fork between RSS 0.91 and 1.0 was almost entirely caused by this disagreement. The pro-RDF camp stated, quite rightly, that RDF data has a great deal more meaningful utility than plain XML, whilst the anti-RDF camp stated, also quite rightly, that the RDF syntax was horrible, and that no one can understand it without reading the documentation and having a nice lie down.
That may be—we’ll find out your own feelings on this in the next chapters—but in the meantime, the Simple Semantic Resolution module was one idea put forward to bridge the divide between the two cultures.
Written by Danny Ayers, its presence in an RSS 2.0 feed simply means “this data should be considered RDF, and to use it with an RDF-compatible application you should apply this transformation to it first.” Whereupon, it points you to a nice XSLT stylesheet. The module consists of one single element, a subelement of channel, and has the following namespace declaration:
xmlns:ssr="http://purl.org/stuff/ssr"
The element is ssr:rdf. Its transform attribute gives the URL of the XSLT stylesheet to apply.
Example 4-6 shows the SSR module in use.
<?xml version="1.0"?>
<rss version="2.0" xmlns:ssr="http://purl.org/stuff/ssr">
<ssr:rdf transform="http://w3future.com/weblog/gems/rss2rdf.xsl" />
...
More details can be found at http://ideagraph.net/xmlns/ssr/.
Trackback Module
The trackback system for weblog content management systems (see http://www.movabletype.org/docs/mttrackback.html for the technical details) has grown up in the same neighborhood as RSS, so it’s only fair that the one should be represented in the other.
This module, also available in tasty RSS 1.0, comes from Justin Klubnik and allows RSS 2.0 feeds to display both the URL that people should trackback to and the URL that the item has trackbacked itself. The idea is that aggregators can send pings and also follow links to find related pages, because items might ping places they don’t explicitly link to.
This module is made up of two elements, subelements of item, and has the following namespace declaration:
xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/"
Here are the elements:
-
trackback:ping
This contains the item’s trackback URL:
<trackback:ping>http://foo.com/trackback/tb.cgi?tb_id=20020923</trackback:ping>
More details can be found at http://madskills.com/public/xml/rss/module/trackback/.
ICBM Module
This module, written by Matt Croydon and Kenneth Hunt, allows RSS feeds to state the geographical location of the origin of the feed or an individual item within it.
It’s alleged that ICBM does actually stand for intercontinental ballistic missile, and certainly a half-arsed attempt at Googling for it produces only the explanation that describing one’s position as an ICBM address is so that, should anyone wish, your data will allow the baddies to target you directly, presumably for being far too clever with your syndication feeds.
Either way, the namespace declaration is thus:
xmlns:icbm="http://postneo.com/icbm"
It contains two elements, usable in either the channel or the item context. The item context overrides the former, as you might expect.
That’s my house, actually.
Go to http://www.postneo.com/icbm/ for more verbose details on the thinking behind the specification.
Yahoo!’s Media RSS Module
In December 2004, Yahoo! launched a beta video search engine at http://video.search.yahoo.com/. The original system spidered the Web looking for video files and indexed them with the implied information found in the filename and link text. To make it easier for video content producers to have Yahoo! index their sites, and to give the search engine much better data to play with, Yahoo! is now offering to regularly spider RSS feeds containing details of media files. This additional data is encoded in its new Media RSS Module.
That module consists of one element, <media:content>, and four optional subelements, with a namespace declaration of:
xmlns:media="http://tools.search.yahoo.com/mrss/"
<media:content> is a subelement of item and takes ten optional attributes.
-
url
url specifies the direct URL to the media object. It is an optional attribute. If a URL isn’t included, a playerURL must be specified.
-
fileSize
The size, in bytes, of the media object. It is an optional attribute.
-
type
The standard MIME type of the object. It is an optional attribute.
-
playerURL
playerURL is the URL of the media player console. It is an optional attribute.
-
playerHeight
playerHeight is the height of the window the playerURL should be opened in. It is an optional attribute.
-
playerWidth
playerWidth is the width of the window the playerURL should be opened in. It is an optional attribute.
-
isDefault
isDefault determines if this is the default object that should be used for this element. It can be true or false. So, if an item contains more than one media:content element, setting this to true makes it the default. It’s an optional attribute but can be used only once within each item.
-
expression
expression determines if the object is a sample or the full version of the object. It can be either sample or full. It is an optional attribute.
-
bitrate
The bit rate of the file, in kilobits per second. It is an optional attribute.
-
duration
The number of seconds the media plays, for audio and video. It is an optional attribute.
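To see how these attributes might be consumed, here is a minimal Python sketch, using only the standard xml.etree.ElementTree, that reads every media:content element in an item and picks the default one. Several things here are assumptions rather than anything the module defines: the sample item itself, the read_media and default_media helpers, the treatment of the numeric attributes as whole numbers, and the fallback to the first element when nothing is marked as the default.

```python
import xml.etree.ElementTree as ET

MEDIA_NS = "http://tools.search.yahoo.com/mrss/"
CONTENT = "{" + MEDIA_NS + "}content"
# Assumption: these attributes always hold whole numbers.
INT_ATTRS = ("fileSize", "playerHeight", "playerWidth", "bitrate", "duration")

# A hypothetical item carrying a sample and a full version of one video.
ITEM = """<item xmlns:media="http://tools.search.yahoo.com/mrss/">
  <title>Hypothetical item</title>
  <media:content url="http://www.example.com/sample.mov" type="video/quicktime"
                 expression="sample" bitrate="64" duration="30"/>
  <media:content url="http://www.example.com/movie.mov" type="video/quicktime"
                 fileSize="12345678" isDefault="true" expression="full"
                 bitrate="128" duration="185"/>
</item>"""


def read_media(item):
    """Return one dict of attributes per media:content child of item."""
    media = []
    for el in item.findall(CONTENT):
        attrs = dict(el.attrib)
        # Without a url, a playerURL must be present.
        if "url" not in attrs and "playerURL" not in attrs:
            raise ValueError("media:content needs a url or a playerURL")
        for name in INT_ATTRS:
            if name in attrs:
                attrs[name] = int(attrs[name])
        attrs["isDefault"] = attrs.get("isDefault", "false") == "true"
        media.append(attrs)
    return media


def default_media(media):
    """Pick the entry flagged isDefault, else the first one (an assumption)."""
    for entry in media:
        if entry["isDefault"]:
            return entry
    return media[0] if media else None


media = read_media(ET.fromstring(ITEM))
chosen = default_media(media)
print(chosen["url"], chosen["duration"])
```

Because the attributes are unprefixed, they arrive with plain names; only the element name carries the media namespace.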
There are also four optional subelements of <media:content>, which can also be used as subelements of item:
-
<media:thumbnail>
Allows a particular image to be used as the representative image for the media object:
<media:thumbnail height="50" width="50">http://www.foo.com/keyframe.jpg</media:thumbnail>
It takes two optional attributes. height specifies the height of the thumbnail. width specifies the width of the thumbnail.
-
<media:category>
Allows a taxonomy to be set that gives an indication of the type of media content and its particular contents:
<media:category>music/artist name/album/song</media:category>
<media:category>television/series/episode/episode number</media:category>
-
<media:people>
Lists the notable individuals or businesses and their contribution to the creation of the media object:
<media:people role="editor">Simon St Laurent</media:people>
role specifies the role individuals played. Examples include: producer, artist, news anchor, cast member, etc. It is an optional attribute.
-
<media:text>
Allows the inclusion of a text transcript, closed captioning, or lyrics of the media content:
<media:text>Oh, say, can you see, by the dawn's early light,</media:text>
Once your site has a feed working with the Media RSS Module, like that shown in Example 4-7, you can submit it to Yahoo! at http://tools.search.yahoo.com/mrss/submit.html.
<media:content url="http://www.example.com/movie.mov" fileSize="12345678"
    type="video/quicktime" playerURL="http://www.example.com/player?id=1"
    playerHeight="200" playerWidth="400" isDefault="true" expression="full"
    bitrate="128" duration="185">
  <media:thumbnail height="50" width="50">http://www.example.com/thumbnail.jpg</media:thumbnail>
  <media:category>comedy/slapstick/custard</media:category>
  <media:people role="stuntman">Ben Hammersley</media:people>
  <media:text>Take that! And that! And that!</media:text>
</media:content>
The development of your own modules is covered in Chapter 11.