O'Reilly Hacks


 
XML Hacks
By Michael Fitzgerald
July 2004

How do these hacks stand up? Comment on a hack from the book by choosing the associated "Discuss" link below. You can also view the code from any of the hacks by clicking on the "Listing" or "Code" links. A number of hacks have been selected to be featured online in their entirety; you may view those hacks by clicking on the hack titles that are linked.

You can also download all the scripts and other files for this book here.

Jump to: Looking at XML Documents  | Creating XML Documents  | Transforming XML Documents  | XML Vocabularies  | Defining XML Vocabularies with Schema Languages  | RSS and Atom  | Advanced XML Hacks

Looking at XML Documents

HACK
#1

Read an XML Document
Before you can do much with an XML document, you need to understand its basic parts. This hack explores the most common structures found in XML.
[Discuss (0) | Link to this hack]

HACK
#2

Display an XML Document in a Web Browser
The most popular web browsers can display and process XML natively. Nowadays, it's just a matter of opening a file
[Discuss (0) | Link to this hack]

HACK
#3

Apply Style to an XML Document with CSS
Make an in-browser XML document more appealing by applying a CSS stylesheet to it
[Discuss (0) | Link to this hack]

HACK
#4

Use Character and Entity References
Not all characters are available on the keyboard! This hack shows you how to represent such characters in an XML document by using decimal and hexadecimal character references, and how to represent entities by using entity references
[Discuss (0) | Link to this hack]

HACK
#5

Examine XML Documents in Text Editors
Even plain-text editors offer features that make editing XML documents a pleasure. This hack introduces two options, Vim and Emacs with nXML
[Discuss (0) | Link to this hack]

HACK
#6

Explore XML Documents in Graphical Editors
Text editor not enough for you? This hack looks at XML documents with graphical editors
[Discuss (0) | Link to this hack]

HACK
#7

Choose Tools for Creating an XML Vocabulary
XML provides the syntax necessary to create your own vocabulary or dialect of XML. Here are a few things you need to know about namespaces and schemas
[Discuss (0) | Link to this hack]

HACK
#8

Test XML Documents Online
Are your XML documents syntactically correct? Find out how and where to check XML documents using online resources
[Discuss (0) | Link to this hack]

HACK
#9

Test XML Documents from the Command Line
A number of free, easy-to-use XML processors are available for use on the command line. This hack shows where to get four such tools and how to use them
[Discuss (0) | Link to this hack]

HACK
#10

Run Java Programs that Process XML
Open source, command-line Java programs that process XML are abundant. This hack shows you how to use them.
[Discuss (0) | Link to this hack]

Creating XML Documents

HACK
#11

Edit XML Documents with <oXygen/>
Quickly learn how to edit XML documents with <oXygen/>
[Discuss (0) | Link to this hack]

HACK
#12

Edit XML Documents with Emacs and nXML
nXML mode for GNU Emacs provides a powerful environment for creating valid XML documents
[Discuss (0) | Link to this hack]

HACK
#13

Edit XML with Vim
With some special configuration, Vim can become a powerful XML editor
[Discuss (0) | Link to this hack]

HACK
#14

Edit XML Documents with Microsoft Word 2003
Edit, validate, and save XML documents with Microsoft Word 2003
[Discuss (0) | Link to this hack]

HACK
#15

Work with XML in Microsoft Excel 2003
Using table-structured data or spreadsheets? Open, format, and save XML documents with Excel 2003
[Discuss (0) | Link to this hack]

HACK
#16

Work with XML in Microsoft Access 2003
If you are a Microsoft Access user, you'll be happy to know that you can export Access 2003 data as XML
[Discuss (0) | Link to this hack]

HACK
#17

Convert Microsoft Office Files, Old or New, to XML
Use OpenOffice as a tool to convert Microsoft Office files to XML
[Discuss (0) | Link to this hack]

HACK
#18

Create an XML Document from a Text File with xmlspy
How do you get your old stuff into XML? Legacy text files can be translated into XML with xmlspy
[Discuss (0) | Link to this hack]

HACK
#19

Convert Text to XML with Uphill
This hack is a little different. It shows you how to convert plain text to XML using Dave Pawson's Java program, Uphill. Along the way, Dave also explains how and why he developed the software, which may be helpful for those developing their own text-to-XML packages in Java
[Discuss (0) | Link to this hack]

HACK
#20

Create Well-Formed XML with Minimal Manual Tagging Using an SGML Parser
Convert minimal markup into XML with James Clark's SP
[Discuss (0) | Link to this hack]

HACK
#21

Create an XML Document from a CSV File
Want to go from CSV to XML? Use Dave Pawson's CSVToXML tool to convert CSV files to XML with Java
[Discuss (0) | Link to this hack]

HACK
#22

Convert an HTML Document to XHTML with HTML Tidy
HTML Tidy was initially developed as a tool to clean up HTML, but it is an XML tool, too. This hack shows you how to use HTML Tidy to make your HTML into XHTML
[Discuss (0) | Link to this hack]

HACK
#23

Transform Documents with XQuery
XQuery is a new language under development by the W3C that's designed to query collections of XML data. XQuery provides a mechanism to efficiently and easily extract data from XML documents or from any data source that can be viewed as XML, such as relational databases
[Discuss (0) | Link to this hack]

HACK
#24

Execute an XQuery with Saxon
So you know how to write an XQuery? Great! But can you execute an XQuery? This hack shows you how
[Discuss (0) | Link to this hack]

HACK
#25

Include Text and Documents with Entities
You can insert external text and even documents into XML documents by using external entities
[Discuss (0) | Link to this hack]

HACK
#26

Include External Documents with XInclude
Beyond entity inclusion, there is another mechanism for including external text and documents. It's called XInclude
[Discuss (0) | Link to this hack]

HACK
#27

Encode XML Documents
Character encoding is quite important, especially as XML documents cross international boundaries. This hack will help you understand and use character encoding in XML
[Discuss (0) | Link to this hack]

HACK
#28

Explore XLink and XML Base
XLink and XML Base are implemented or partially implemented by Mozilla. This hack explores these technologies, using Mozilla as a platform
[Discuss (0) | Link to this hack]

HACK
#29

What's the Diff? Diff XML Documents
If you are handling many XML documents, sometimes you need to check the differences between two or more documents. You can perform diffs of XML documents with online and command-line tools
[Discuss (2) | Link to this hack]

HACK
#30

Look at XML Documents Through the Lens of the XML Information Set
If you get a grip on the XML Information Set, you'll know you don't have to worry about it too much.
[Discuss (0) | Link to this hack]

Transforming XML Documents

HACK
#31

Understand the Anatomy of an XSLT Stylesheet
Get acquainted with the basic elements of an XSLT stylesheet.
[Discuss (0) | Link to this hack]

HACK
#32

Transform an XML Document with a Command-Line Processor
Perform XSLT transformations at the command line.
[Discuss (0) | Link to this hack]

HACK
#33

Transform an XML Document Within a Graphical Editor
Transform XML documents with XSLT in a graphical environment
[Discuss (0) | Link to this hack]

HACK
#34

Analyze Nodes with TreeViewer
View nodes in an XML document according to the XPath 1.0 data model
[Discuss (0) | Link to this hack]

HACK
#35

Explore a Document Tree with the xmllint Shell
Explore the tree structure of an XML document with xmllint's shell mode
[Discuss (0) | Link to this hack]

HACK
#36

View Documents as Tables Using Generic CSS or XSLT
While XML documents come in all shapes and sizes, a common pattern makes it very easy to present information stored in XML as a table
[Discuss (0) | Link to this hack]

HACK
#37

Generate an XSLT Identity Stylesheet with Relaxer
Quickly generate XSLT stylesheets with Asami Tomoharu's Relaxer
[Discuss (0) | Link to this hack]

HACK
#38

Pretty-Print XML Using a Generic Identity Stylesheet and Xalan
Sometimes your XML output from various programs is less than attractive. Spruce it up in a hurry with Xalan C++ and an identity transform
[Discuss (1) | Link to this hack]

HACK
#39

Create a Text File from an XML Document
Use this stylesheet to extract only the text from any XML document
[Discuss (0) | Link to this hack]

HACK
#40

Convert Attributes to Elements and Elements to Attributes
Transform elements into attributes and back the other way with XSLT
[Discuss (0) | Link to this hack]

HACK
#41

Convert XML to CSV
Turn your XML into CSV for use by applications that don't support XML
[Discuss (1) | Link to this hack]

HACK
#42

Create and Process SpreadsheetML
Since Excel XP, Excel has included an XML export option. SpreadsheetML provides an XML representation of your spreadsheets, complete with formatting and formula information
[Discuss (0) | Link to this hack]

HACK
#43

Choose Your Output Format in XSLT
Take control of the output of an XSLT stylesheet
[Discuss (0) | Link to this hack]

HACK
#44

Transform Your iTunes Library File
Grab data out of your iTunes library file and transform it into HTML
[Discuss (0) | Link to this hack]

HACK
#45

Generate Multiple Output Documents with XSLT 2.0
Unlike XSLT 1.0, XSLT 2.0 allows you to produce more than one result tree from a single transformation
[Discuss (0) | Link to this hack]

HACK
#46

Generate XML from MySQL
Using MySQL and want to use the data stored there elsewhere? Dump XML out of a MySQL database and then transform it with XSLT
[Discuss (0) | Link to this hack]

HACK
#47

Generate PDF Documents from XML and CSS
Produce PDF documents for XML documents styled with CSS using YesLogic Prince
[Discuss (0) | Link to this hack]

HACK
#48

Process XML Documents with XSL-FO and FOP
Use Apache's FOP engine together with XSL-FO to generate PDF output
[Discuss (0) | Link to this hack]

HACK
#49

Process HTML with XSLT Using TagSoup
Use TSaxon, a variant of Saxon, and TagSoup to help transform HTML
[Discuss (0) | Link to this hack]

HACK
#50

Build Results with Literal Result and Instruction Elements
Use literal result elements, literal text, and instruction elements in an XSLT stylesheet
[Discuss (0) | Link to this hack]

HACK
#51

Write Push and Pull Stylesheets
Understand the difference between push and pull XSLT stylesheets, and when to use which
[Discuss (0) | Link to this hack]

HACK
#52

Perform Math with XSLT
XPath 1.0 offers a number of math operations that can be performed within expressions
[Discuss (0) | Link to this hack]

HACK
#53

Transform XML Documents with grep and sed
Use grep and sed to transform XML instead of XSLT
[Discuss (1) | Link to this hack]

HACK
#54

Generate SVG with XSLT
With XSLT, you can create and alter SVG documents on the fly
[Discuss (0) | Link to this hack]

HACK
#55

Dither Scatterplots with XSLT and SVG
Use XSLT and SVG to offset points in X-Y scatterplots so they do not plot on top of each other
[Discuss (0) | Link to this hack]

HACK
#56

Use Lookup Tables with XSLT to Translate FIPS Codes
With XSLT, translate data in a source file by looking up the translation in a lookup table, using FIPS codes as an example
[Discuss (0) | Link to this hack]

HACK
#57

Grouping in XSLT 1.0 and 2.0
If your nodes are out of sorts in your source, use grouping to bring them into line
[Discuss (0) | Link to this hack]

HACK
#58

Use EXSLT Extensions
Use EXSLT extension functions to perform a variety of tasks not available in XSLT 1.0.
[Discuss (0) | Link to this hack]

XML Vocabularies

HACK
#59

Use XML Namespaces in an XML Vocabulary
Though controversial, XML namespaces are a necessity if you want to manage XML documents in the wild. This hack gets into some of the nitty-gritty of namespaces so you can more easily untangle them.
[Discuss (0) | Link to this hack]

HACK
#60

Create an RDDL Document
RDDL is an XHTML language extension that can help dispel the confusion that surrounds XML namespaces, and let people find out more about your vocabularies
[Discuss (0) | Link to this hack]

HACK
#61

Create and Validate an XHTML 1.0 Document
W3C has morphed HTML into XHTML, but they still splash around in the same gene pool
[Discuss (0) | Link to this hack]

HACK
#62

Create Books, Technical Manuals, and Papers in XML with DocBook
If you are writing a book, a manual, or a specification, DocBook provides an unsurpassed vocabulary for collecting your thoughts and words in XML form
[Discuss (0) | Link to this hack]

HACK
#63

Create a SOAP 1.2 Document
W3C's SOAP provides a way to package messages or requests in XML envelopes
[Discuss (0) | Link to this hack]

HACK
#64

Identify Yourself with FOAF
FOAF provides a framework for creating and publishing personal information in a machine-readable fashion. As you learn FOAF, you will also get acquainted in a practical way with RDF
[Discuss (0) | Link to this hack]

HACK
#65

Unravel the OpenOffice File Format
OpenOffice provides a suite of applications whose native file format consists of a set of XML files, compressed into a ZIP archive. This hack explores the basics of the OpenOffice file format
[Discuss (3) | Link to this hack]

HACK
#66

Render Graphics with SVG
With SVG, you can represent graphics as XML documents and render them in Internet Explorer with Adobe's SVG Viewer, in Netscape with Corel's SVG Viewer, in a branch of Mozilla that supports SVG, and in Batik's Squiggle
[Discuss (0) | Link to this hack]

HACK
#67

Use XForms in Your XML Documents
You may be accustomed to creating forms in HTML. XForms, an XML vocabulary, allows you to take a step up from HTML or XHTML forms.
[Discuss (0) | Link to this hack]

Defining XML Vocabularies with Schema Languages

HACK
#68

Validate an XML Document with a DTD
The Document Type Definition (DTD) is native to XML 1.0. You'll learn how to use DTDs in this hack.
[Discuss (0) | Link to this hack]

HACK
#69

Validate an XML Document with XML Schema
XML Schema is the W3C evolution of the DTD. It is complex but powerful, in wide use but not always popular. This hack will help you start writing schema in this format
[Discuss (0) | Link to this hack]

HACK
#70

Validate Multiple Documents Against an XML Schema at Once
A Xerces module allows you to validate more than one XML instance at a time against an XML Schema. This hack shows you how to use the Java class
[Discuss (0) | Link to this hack]

HACK
#71

Check the Integrity of a W3C Schema
Use the class from Xerces to do some extra checking on your schemas
[Discuss (0) | Link to this hack]

HACK
#72

Validate an XML Document with RELAX NG
Compared to the alternatives, RELAX NG schemas are easy to use and learn, and the more you use them the more you become convinced
[Discuss (0) | Link to this hack]

HACK
#73

Create a DTD from an Instance
If you need a DTD in a hurry, create it from an XML instance using Trang, Relaxer, DTDGenerator, or xmlspy
[Discuss (0) | Link to this hack]

HACK
#74

Create an XML Schema Document from an Instance or DTD
There are several tools that can help you generate an XML Schema document from either an instance or a DTD. This hack shows you how to get the job done with little fuss
[Discuss (0) | Link to this hack]

HACK
#75

Create a RELAX NG Schema from an Instance
Trang and Relaxer can create RELAX NG schemas on the fly, in either XML or compact syntax
[Discuss (0) | Link to this hack]

HACK
#76

Convert a RELAX NG Schema to XML Schema
If you like working with RELAX NG but you need XML Schema too, Trang is the answer. Trang converts RELAX NG schemas (in both XML and compact syntax) to XML Schema
[Discuss (0) | Link to this hack]

HACK
#77

Use RELAX NG and Schematron Together to Validate Business Rules
There are few issues regarding XML validation that cause as many headaches as validation of business rules (constraints on relations between element and attribute content in an XML document). This hack helps relieve that headache
[Discuss (0) | Link to this hack]

HACK
#78

Use RELAX NG to Generate DTD Customizations
RELAX NG enables you to create a customized subset or extension of a DTD much more easily than doing it the old-fashioned way
[Discuss (1) | Link to this hack]

HACK
#79

Generate Instances Based on Schemas
Use xmlspy or Sun's Instance Generator to create instances of DTDs or other schemas.
[Discuss (0) | Link to this hack]

RSS and Atom

HACK
#80

Subscribe to RSS Feeds
You've heard the XML buzz and you figure you ought to do something about it. This hack introduces you to RSS and shows you how to start subscribing to RSS feeds today.
[Discuss (0) | Link to this hack]

HACK
#81

Create an RSS 0.91 Document
Create an RSS 0.91 document using a template, and gain a little essential background in RSS history
[Discuss (0) | Link to this hack]

HACK
#82

Create an RSS 1.0 Document
Create an RSS 1.0 document using a template or with Java
[Discuss (0) | Link to this hack]

HACK
#83

Create an RSS 2.0 Document
Create an RSS 2.0 document from a template
[Discuss (0) | Link to this hack]

HACK
#84

Create an Atom Document
Atom is gaining ground as a feed format, and we should be paying attention to it. This hack guides you through the creation of an Atom document from a template
[Discuss (0) | Link to this hack]

HACK
#85

Validate RSS and Atom Documents
Use an online validator to check your RSS and Atom documents
[Discuss (0) | Link to this hack]

HACK
#86

Create RSS with XML::RSS
By using the popular syndication format known as RSS, you can use your newly scraped data in dozens of different aggregators, toolkits, and more
[Discuss (0) | Link to this hack]

HACK
#87

Syndicate Content with Movable Type
Movable Type (MT) is a very flexible personal publishing system that is extremely popular in the weblogging world. One of its most powerful features is its ability to output content into any markup form you may need for easy syndication of content
[Discuss (0) | Link to this hack]

HACK
#88

Post RSS Headlines on Your Site
Place other sites' syndicated headlines on your own pages, periodically
The Code
[Discuss (0) | Link to this hack]

HACK
#89

Create RSS 0.91 Feeds from Google
A .NET program can help you create an RSS feed based on a query to Google. You have to be a registered developer at Google to use this program, but the possibilities are rich
[Discuss (0) | Link to this hack]

HACK
#90

Syndicate a List of Books from Amazon with RSS and ASP
Someday all data will be available as RSS. Get a head start by syndicating Amazon search results.
The Code
[Discuss (0) | Link to this hack]

Advanced XML Hacks

HACK
#91

Pipeline XML with Ant
Ant is an extensible, open source build tool written in Java and sponsored by Apache. It can also be used as a framework for performing a large variety of operations - including XML-related tasks - in a single step.
[Discuss (0) | Link to this hack]

HACK
#92

Use Elements Instead of Entities to Avoid the "amp Explosion Problem"
Use replaceable elements as a solution to the "amp explosion problem."
[Discuss (0) | Link to this hack]

HACK
#93

Use Cocoon to Create a Well-Formed View of a Web Page, Then Scrape It for Data
Cocoon is a popular web development framework from Apache
[Discuss (0) | Link to this hack]

HACK
#94

From Wiki to XML, Through SGML
Wikis are nice for typing. XML is nice for processing. SGML is a standard language for specifying conversions from one to the other
[Discuss (0) | Link to this hack]

HACK
#95

Create Well-Formed XML with JavaScript
Use JavaScript to ensure that you write correct, well-formed XML in web pages
[Discuss (2) | Link to this hack]

HACK
#96

Inspect and Edit XML Documents with the Document Object Model
The W3C Document Object Model was an early effort to gain fine-grained control over a document in memory. This hack introduces you to how DOM works
[Discuss (0) | Link to this hack]

HACK
#97

Processing XML with SAX
SAX is the de facto standard XML parser interface for Java. You learn how to use it here with a simple SAX application written in Java
[Discuss (0) | Link to this hack]

HACK
#98

Process XML with C#
Even if you aren't a C# programmer, you can get up to speed on processing XML with C# in short order with this hack
[Discuss (0) | Link to this hack]

HACK
#99

Generate Code from XML
Tools are readily available that can convert XML into Java code, which in turn lets you manipulate markup programmatically rather than by hand in an editor or with something like XSLT. This hack walks you through an XML-to-code conversion scenario, and shows you how you can use the code after producing it.
[Discuss (0) | Link to this hack]

HACK
#100

Create Well-Formed XML with Genx
If you prefer the C language, Genx provides a fast, efficient C library for generating well-formed and canonical XML. On top of that, it's well documented and a real pleasure to use
[Discuss (0) | Link to this hack]



© 2007 O'Reilly Media, Inc.
