
Edit XML Documents with Emacs and nXML
nXML mode for GNU Emacs provides a powerful
environment for creating valid XML documents
If you've been editing
XML from within GNU Emacs using PSGML, here's a tip:
get rid of it. That's right, tear it out, dump it,
make it disappear—because there's a much
better tool available: nXML. (Grab the latest
nxml-mode-200nnnnn.tar.gz file from http://www.thaiopensource.com/download/.)
nXML was developed by James Clark, who brought us groff, expat,
sgmls, SP, and Jade, and who was a driving force behind the
development of XPath, XSLT (and before that, DSSSL), and, along
with Murata Makoto, RELAX NG (http://www.relaxng.org/).
Which brings us back to what nXML is all about: nXML is a very clever
mechanism for doing RELAX NG-driven, context-sensitive, validated
editing. What's particularly clever about it is
that, unlike PSGML and unlike virtually every other XML editing
application available—with the exception of the Topologi
Collaborative Markup Editor (http://www.topologi.com/products/tme/)—it
provides real-time, automatic visual identification of validity
errors.
This hack assumes that you are familiar with Emacs. The
README file that comes with nXML states that you
must use Emacs version 21.x (preferably 21.3 or
later) in order to use nXML. To get nXML to run in Emacs, you must
first load the rng-auto.el file. In Emacs, type:
M-x load-file
When prompted, enter the path to rng-auto.el in the directory
where you extracted the latest version of nXML. This file defines
the autoloads for nXML. Now open an XML document (C-x C-f) and
enter:
M-x nxml-mode
You are good to go! For help, type:
C-h m
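If you would rather not load rng-auto.el by hand in every session, the same step can go in your Emacs init file. The following is a minimal sketch, not a definitive setup: the directory ~/elisp/nxml-mode/ is a placeholder for wherever you actually extracted the distribution, and the list of file extensions is only an illustrative choice.

```elisp
;; Placeholder path (an assumption): point this at the directory
;; where you extracted the nXML distribution.
(load "~/elisp/nxml-mode/rng-auto.el")

;; Illustrative choice of extensions: visit these file types in
;; nxml-mode automatically instead of typing M-x nxml-mode each time.
(add-to-list 'auto-mode-alist
             '("\\.\\(xml\\|xsl\\|rng\\|xhtml\\)\\'" . nxml-mode))
```

With this in place, opening a matching file with C-x C-f should put the buffer straight into nXML mode.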
Spotting Validity Errors in Real Time
"Automatic visual identification of validity errors" means that
if you create and edit documents using nXML, you never need to
run a separate validation step to find out whether a document is
valid: if a document contains a validity error, you will know as
you edit the document, because the error is visually flagged.
Here's how it works. As you're
editing a document:
-
nXML incrementally reparses and revalidates the document in the
background during idle periods between the times when you are
actually typing in content. You can wait for nXML to finish
validating the entire document (which usually takes only a matter of
seconds), or if you're working with a large
document, you don't need to wait: the moment you
start typing in content, nXML will stop its background parsing and
validating until you're idle once again.
-
nXML describes the current validity state in the mode line at the
bottom of the Emacs interface; at any point while
you're editing a document, the mode line will say
either Valid, Invalid, or
Validated nn%, where nn is a
number indicating what percentage of the document has been validated
so far.
-
nXML visually highlights all instances of invalidity it finds in the
part of the document it has validated so far (by default, the Emacs
face it uses draws a red underline, but you can change the
highlighting by customizing that face).
If you mouse over or move your cursor over one of the points that
nXML has highlighted as invalid, text appears describing the validity
error, either as popup text or in the minibuffer echo area at the
bottom of the Emacs interface.
Figure 1. nXML validation error message
Getting Help with nXML
To get oriented with the basics of editing within nXML:
-
Type C-h m (or M-x describe-mode) for quick help with nXML commands
and key bindings.
-
For more extensive documentation, access the nXML manual (in texinfo
format) by typing M-x info.
-
Make sure to read the NEWS file in the nXML distribution; it may
contain late-breaking information that hasn't yet made
its way into the nXML manual.
Using Context-Sensitive Completion
The nXML mechanism for doing context-sensitive insertion/completion
of markup is similar to the mechanism that PSGML provides. With nXML,
you:
-
Place your cursor at some point in a document.
-
Type a keyboard combination (in the nXML case, C-Return) to do
context-sensitive checking to see what markup (elements, attributes,
or enumerated attribute values) is valid at that point in the
document; Emacs then opens up a completion buffer containing a list
of the valid markup choices.
Figure 2. nXML context-sensitive completion
Making nXML Work Your Way
To fine-tune the behavior of nXML:
-
Explore nXML's extensive, well-documented set of
customization options by typing M-x customize-group and entering
nxml as the group name.
-
Even if you change no other nXML option, try setting the value of the
Nxml Sexp Element Flag option (nxml-sexp-element-flag variable) to on
(non-nil). The default value (nil) means that Emacs sexp
commands—for example, C-M-k (kill-sexp)—operate on tags.
What you probably want instead is for them to operate on elements,
which is what turning on the Nxml Sexp Element Flag option will do
for you.
-
Spend some time experimenting with the syntax-highlighting options;
nXML provides what must be by far the best and most configurable
syntax-highlighting capabilities of any XML editing application
currently available. Over 30 customizable Emacs faces enable you to
independently control color and character formatting of everything
from the level of element and attribute names down to the level of
different types of markup delimiters (e.g., angle-bracket tag
delimiters, the quote marks around attribute values, etc.).
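The Nxml Sexp Element Flag setting mentioned in the list above can also be made from your Emacs init file rather than through the Customize interface. A minimal sketch, assuming nXML is loaded elsewhere in your init file:

```elisp
;; Make sexp commands such as C-M-k (kill-sexp) operate on whole
;; elements rather than on individual tags.
(setq nxml-sexp-element-flag t)
```

Setting the variable back to nil restores the default tag-based behavior.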
Entering and Displaying Special Characters
Another area where nXML is very clever is the way in which it enables
you to enter and display special characters. To enter a special
character, such as a copyright sign:
-
Type C-c C-u. nXML then prompts you for the name of the character to
enter.
-
Type the first few letters of the character name and then hit tab.
nXML then does completion, presenting you with a list of all
character names that start with the letters you type in. For example,
if you enter cop, nXML will present you with a
list of several character names that start with
COPTIC, along with the name of the character
that's probably the one you're
looking for: COPYRIGHT SIGN.
-
Either use your mouse to select one of the choices from the
completion buffer, or type more letters and press Tab again to narrow
down the choices to the character you need. Or, if you just type
copy to begin with, you'll get
straight to the copyright sign (because it's the
only character name that begins with COPY).
Note that, by default, nXML inserts a hexadecimal numeric character
reference, not the actual character; e.g., for the copyright sign,
nXML inserts the character reference &#xA9;.
This ensures that you will be able to interpret what the character is
if it is displayed by software that does not understand Unicode.
But this is where things get interesting: even though nXML writes
only the numeric character reference to the file, it displays the
glyph for the character (along with the character reference itself).
And if you mouse over the character reference, nXML displays the full
name of the character, either as pop-up text or in the minibuffer
echo area at the bottom of the Emacs interface (Figure 3).
Figure 3. nXML display of special characters
As far as special characters go, nXML lets you have your cake and eat
it too. You get:
-
An easy way to enter special characters as character references,
without needing to memorize or look up their numeric values or ISO
entity names.
-
The ability to see glyphs and full names for all the character
references in your documents, while still being able to distribute
them to others as ASCII-encoded files (so you're not
depending on others having editors that support Unicode or some other
encoding).
To enter special characters in other ways:
-
Instead of typing C-c C-u to get prompted for a character name, type
C-u C-c C-u. You'll go through the same completion
process to enter the name, but when you're done,
nXML will insert the character directly, instead of inserting the
character reference. GNU Emacs 21.x or later supports display of
Unicode and many other encodings (as long as you have the fonts), so
you don't have to avoid inserting characters
directly unless you need to share your source documents with others
who might not have Unicode-enabled editors.
-
Try Norm Walsh's XML Unicode Lisp package
(http://nwalsh.com/emacs/xmlchars/). Among
other things, it automatically inserts
"smart" quotes in just the same way
that most word-processing applications do, along with a smart
em-dash/en-dash feature. It also provides a menu-driven mechanism for
entering special characters, so you don't need to
type and do completion; instead, you just select a character name
from a menu. Compatibility with nXML's native
character-insertion mechanism isn't a
problem—the two coexist with one another quite happily.
© 2007 O'Reilly Media, Inc.