ASCII
Most character sets in common use today are supersets of ASCII. That is, code points 0 through 127 are assigned to the same characters to which ASCII assigns them. The only notable exceptions are the EBCDIC-derived character sets. In particular, Unicode is a superset of ASCII, and code points 0 through 127 identify the same characters in Unicode as they do in ASCII. Table 27-2 lists the ASCII character set.
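The superset relationship can be checked directly. The following Python sketch decodes the 128 byte values that ASCII defines under several encodings and compares the results. The helper name is_ascii_superset and the choice of encodings are illustrative assumptions; cp037 stands in here for the EBCDIC family. The check applies to encodings whose code units are single bytes in this range.

```python
# Compare how several encodings decode the 128 byte values that ASCII defines.
ascii_bytes = bytes(range(128))
ascii_text = ascii_bytes.decode("ascii")


def is_ascii_superset(encoding: str) -> bool:
    """Return True if bytes 0-127 decode to the same characters as in ASCII.

    Hypothetical helper for illustration only.
    """
    try:
        return ascii_bytes.decode(encoding) == ascii_text
    except UnicodeDecodeError:
        return False


for name in ("utf-8", "latin-1", "cp1252", "cp037"):
    print(name, is_ascii_superset(name))

# Unicode code points 0-127 coincide with the ASCII values themselves.
unicode_matches_ascii = all(ord(ch) == i for i, ch in enumerate(ascii_text))
print("Unicode code points match ASCII:", unicode_matches_ascii)
```

Of the four encodings tried, only cp037, the EBCDIC-derived one, should report a mismatch.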
Characters 0 through 31 and character 127 are nonprinting
control characters, sometimes called the C0 controls to distinguish them from the C1 controls
used in the ISO-8859 character sets. Of these 33 characters, XML 1.0
permits only the carriage return, line feed, and horizontal tab, plus
the delete character (127), which is legal but discouraged. The other
29 may not appear anywhere in an XML 1.0 document, including in tags,
comments, or parsed character data, and they may not be written as
character references either. In XML 1.1 (but not XML 1.0), 28 of these
29 characters (all of them except NUL) can be inserted with character
references, and the delete character must likewise be written as a
character reference.
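The XML 1.0 restrictions on code points 0 through 31 can be observed with a real parser. The sketch below uses Python's xml.etree.ElementTree, which relies on Expat, an XML 1.0 parser, so it illustrates only the XML 1.0 behavior. The helper name parses and the doc wrapper element are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def parses(content: str) -> bool:
    """Return True if <doc>content</doc> is accepted by the XML 1.0 parser.

    Hypothetical helper for illustration only.
    """
    doc = f"<doc>{content}</doc>"
    try:
        ET.fromstring(doc)
        return True
    except (ET.ParseError, ValueError):
        return False


# Try each of code points 0-31 as a literal character in element content.
literal_ok = [cp for cp in range(32) if parses(chr(cp))]

# Try the same code points written as character references, such as &#x7;.
reference_ok = [cp for cp in range(32) if parses(f"&#x{cp:X};")]

print("accepted as literal characters:", literal_ok)
print("accepted as character references:", reference_ok)
```

Under an XML 1.0 parser, both lists should contain only 9 (horizontal tab), 10 (line feed), and 13 (carriage return). A parser that implements XML 1.1 would be needed to see character references to the remaining controls accepted.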