Use XML Namespaces in an XML Vocabulary

Though controversial, XML namespaces are a necessity if you want to manage XML documents in the wild. This hack gets into some of the nitty-gritty of namespaces so you can more easily untangle them.

In January 1999, the W3C published its Namespaces in XML recommendation (http://www.w3.org/TR/REC-xml-names/), about a year after the XML recommendation arrived. There were hints of namespaces in the original XML spec, evidenced by suggestions about the use of colons, but that was about it. On the surface, namespaces appear reasonable enough, but their implications have been the subject of confusion and criticism for over five years.

Namespaces were mentioned briefly in [Hack #7]. In this hack, we’ll talk about how namespaces work in more detail.

Look at the following document, namespace.xml, in Example 4-1.

Example 4-1. namespace.xml

<?xml version="1.0" encoding="UTF-8"?>
   
<!-- a time instant -->
<time timezone="PST" xmlns="http://www.wyeast.net/time">
 <hour>11</hour>
 <minute>59</minute>
 <second>59</second>
 <meridiem>p.m.</meridiem>
 <atomic signal="true"/>
</time>

This document isn’t very different from time.xml except for the special xmlns attribute on the time element. The xmlns attribute and its value, http://www.wyeast.net/time, form a default namespace declaration. A default namespace declaration associates a namespace name, which is always a Uniform Resource Identifier or URI (http://www.ietf.org/rfc/rfc2396.txt), with one or more elements. A local name together with its namespace name is called an expanded name, and is often written as {http://www.wyeast.net/time}time in descriptive text.
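To see expanded names in practice, here is a minimal sketch using Python’s standard xml.etree.ElementTree module, which happens to report element names in the same {namespace-name}local-name form. The document is Example 4-1 inlined as a byte string; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Example 4-1 (namespace.xml), inlined so the sketch is self-contained.
NAMESPACE_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!-- a time instant -->
<time timezone="PST" xmlns="http://www.wyeast.net/time">
 <hour>11</hour>
 <minute>59</minute>
 <second>59</second>
 <meridiem>p.m.</meridiem>
 <atomic signal="true"/>
</time>"""

root = ET.fromstring(NAMESPACE_XML)

# ElementTree stores each element name as "{namespace-name}local-name".
expanded_names = [element.tag for element in root.iter()]

for name in expanded_names:
    print(name)
```

Every element name printed carries the namespace name from the default declaration, even though only the time element has an xmlns attribute.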

The default declaration in namespace.xml associates the namespace name http://www.wyeast.net/time with the element time and its child elements hour, minute, second, meridiem, and atomic. A default namespace declaration applies only to the element where it is declared and to its child and descendant elements, unless one of them overrides it with a default declaration of its own. A default declaration on the document element therefore applies to elements in the entire document. It does not apply to attributes, however. You must use a prefix in order to apply a namespace to an attribute.
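The difference between elements and attributes under a default declaration can be checked with a short sketch, again assuming Python’s xml.etree.ElementTree. The helper split_expanded is a hypothetical convenience function written for this example, not part of any library, and the document is a trimmed copy of Example 4-1.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of Example 4-1.
DOC = b"""<time timezone="PST" xmlns="http://www.wyeast.net/time">
 <hour>11</hour>
 <atomic signal="true"/>
</time>"""


def split_expanded(name):
    """Split '{uri}local' into (uri, local); uri is None if there is no namespace."""
    if name.startswith("{"):
        uri, local = name[1:].split("}", 1)
        return uri, local
    return None, name


root = ET.fromstring(DOC)

# Elements pick up the default namespace...
element_names = [split_expanded(el.tag) for el in root.iter()]
# ...but unprefixed attributes stay in no namespace at all.
attribute_names = [split_expanded(key) for el in root.iter() for key in el.attrib]

print(element_names)
print(attribute_names)
```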

Instead of a default declaration, you can also get more specific by using a prefix with a namespace. This is shown in prefix.xml (Example 4-2).

Example 4-2. prefix.xml

<?xml version="1.0" encoding="UTF-8"?>
   
<!-- a time instant -->
<tz:time timezone="PST" xmlns:tz="http://www.wyeast.net/time">
 <tz:hour>11</tz:hour>
 <tz:minute>59</tz:minute>
 <tz:second>59</tz:second>
 <tz:meridiem>p.m.</tz:meridiem>
 <tz:atomic tz:signal="true"/>
</tz:time>

In this declaration (again on the time element), the prefix tz is associated with the namespace name or URI. So any element or attribute in the document that is prefixed with tz will be associated with the namespace http://www.wyeast.net/time.

The timezone attribute on time does not have a prefix, so it is not associated with the namespace. In fact, the only way you can associate an attribute with a namespace is with a prefix. Default namespace declarations never apply to attributes.
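A quick way to confirm which attributes in prefix.xml end up in the namespace is to parse a trimmed copy of Example 4-2 and look at the attribute names the parser reports. This is only a sketch, assuming Python’s xml.etree.ElementTree; the constant TIME_NS and the other variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of Example 4-2 (prefix.xml).
PREFIX_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<tz:time timezone="PST" xmlns:tz="http://www.wyeast.net/time">
 <tz:hour>11</tz:hour>
 <tz:atomic tz:signal="true"/>
</tz:time>"""

TIME_NS = "http://www.wyeast.net/time"

root = ET.fromstring(PREFIX_XML)
atomic = root.find(f"{{{TIME_NS}}}atomic")

# timezone has no prefix, so it is reported without a namespace name.
root_attributes = sorted(root.attrib)
# tz:signal has a prefix, so it is reported with the namespace name.
atomic_attributes = sorted(atomic.attrib)

print(root.tag)
print(root_attributes)
print(atomic_attributes)
```

Note that the tz prefix itself does not appear in the parsed names. Only the namespace name it was bound to survives, so the time element here has the same expanded name as the one in namespace.xml.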

The special namespace prefix xml is bound to the namespace URI http://www.w3.org/XML/1998/namespace, and is used with attributes such as xml:lang and xml:space. Because it is built in, it doesn’t have to be declared, but you can declare it if you want. However, you are not allowed to bind the xml prefix to any other namespace name, and you can’t bind any other prefix to http://www.w3.org/XML/1998/namespace.
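The built-in binding of the xml prefix can be seen by parsing a document that uses xml:lang without declaring anything. The following sketch assumes Python’s xml.etree.ElementTree; the small document is made up for illustration and is not one of the book’s example files.

```python
import xml.etree.ElementTree as ET

# The xml prefix is used here without any xmlns:xml declaration.
DOC = b"""<time xml:lang="en" xmlns="http://www.wyeast.net/time">
 <meridiem>p.m.</meridiem>
</time>"""

XML_NS = "http://www.w3.org/XML/1998/namespace"

root = ET.fromstring(DOC)

# The attribute is reported under the reserved namespace name.
attribute_names = list(root.attrib)
lang = root.get(f"{{{XML_NS}}}lang")

print(attribute_names)
print(lang)
```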

xmlns is a special attribute name and can also be used as a prefix, as in xmlns:tz (http://www.w3.org/TR/REC-xml-names/#ns-decl). As the result of an erratum (http://www.w3.org/XML/xml-names-19990114-errata), the prefix xmlns is bound to the namespace name http://www.w3.org/2000/xmlns/. Unlike the prefix xml, xmlns cannot be declared, and no other prefix may be bound to http://www.w3.org/2000/xmlns/.

The intent of namespaces was to allow different vocabularies to be mixed together in a single document and to avoid the collision of names in environments where the names might run into each other. The unfortunate and confusing part about namespaces is their use of any URI as a namespace name. The scheme or protocol name http:// suggests that the URI identifies a resource that can be retrieved using Hypertext Transfer Protocol (http://www.ietf.org/rfc/rfc2616.txt), just like any other web resource. But this is not the case. The URI is just considered a name, not a guarantee of the location or existence of a resource. Fortunately, a technique that uses RDDL [Hack #60] can help overcome this annoyance. The nice thing about URIs, on the other hand, is that they are allocated locally; that is, you don’t have to deal with a global registry to use them. The downside is that you can’t really police people who might use a domain name that you own as part of a URI.
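To make the name-collision point concrete, here is a small sketch, assuming Python’s xml.etree.ElementTree. The results document and the namespace name http://example.com/race are hypothetical, invented for this illustration; only http://www.wyeast.net/time comes from the examples above. Both vocabularies define an element called second, and the namespace names keep them apart. Nothing is fetched from either URI while parsing.

```python
import xml.etree.ElementTree as ET

# Two vocabularies that both use the local name "second".
# http://example.com/race is a made-up namespace name for this sketch.
MIXED_XML = b"""<results xmlns:tz="http://www.wyeast.net/time"
         xmlns:race="http://example.com/race">
 <tz:second>59</tz:second>
 <race:second>Runner B</race:second>
</results>"""

# Prefixes used for lookup are local to this code; they need not match
# the prefixes in the document, only the namespace names must match.
NS = {
    "clock": "http://www.wyeast.net/time",
    "finish": "http://example.com/race",
}

root = ET.fromstring(MIXED_XML)

clock_seconds = [el.text for el in root.findall("clock:second", NS)]
race_seconds = [el.text for el in root.findall("finish:second", NS)]

print(clock_seconds)
print(race_seconds)
```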

A new namespace spec was created for use only with XML 1.1 (http://www.w3.org/TR/xml-names11). Notably, this spec allows you to undeclare a previously declared prefix; that is, with xmlns:tz="" you can undeclare the namespace associated with the prefix tz, just as xmlns="" undeclares a default namespace declaration. In Version 1.0 of the namespaces spec, a default namespace declaration may be empty (as in xmlns=""), but you cannot undeclare a prefix as you can in Version 1.1.
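The difference between the two kinds of undeclaration can be sketched with Python’s xml.etree.ElementTree, whose underlying Expat parser, as far as I know, follows XML 1.0 and Version 1.0 of the namespaces spec. The note element is hypothetical and is not part of the time vocabulary. Under that assumption, an empty default declaration is accepted, while an empty prefix declaration is rejected as a parse error.

```python
import xml.etree.ElementTree as ET

# xmlns="" removes the default namespace for the note element.
UNDECLARE_DEFAULT = b"""<time xmlns="http://www.wyeast.net/time">
 <hour>11</hour>
 <note xmlns="">no namespace here</note>
</time>"""

root = ET.fromstring(UNDECLARE_DEFAULT)
child_names = [el.tag for el in root]
print(child_names)

# xmlns:tz="" tries to undeclare a prefix, which only the 1.1 spec allows.
UNDECLARE_PREFIX = b"""<tz:time xmlns:tz="http://www.wyeast.net/time">
 <note xmlns:tz="">prefix undeclared</note>
</tz:time>"""

try:
    ET.fromstring(UNDECLARE_PREFIX)
    prefix_undeclare_accepted = True
except ET.ParseError as error:
    # A parser limited to Namespaces 1.0 is expected to end up here.
    prefix_undeclare_accepted = False
    print("rejected:", error)
```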

