Any successful presentation, even a thoughtful tome, should have its text organized into an attractive, effective document. Organizing text into attractive documents is HTML and XHTML’s forte. The languages give you a number of tools that help you mold your text and get your message across. They also help structure your document so that your target audience has easy access to your words.
Always keep in mind while designing your documents (here we go again!) that the markup tags, particularly in regard to text, only advise — they do not dictate — how a browser will ultimately render the document. Rendering varies from browser to browser. Don’t get too entangled with trying to get just the right look and layout. Your attempts may and probably will be thwarted by the browser.
Like most text processors, a browser wraps the words it finds to fit the horizontal width of its viewing window. Widen the browser’s window, and words automatically flow up to fill the wider lines. Squeeze the window, and words wrap downward.
Unlike most text processors, however, HTML and XHTML use explicit division (<div>), paragraph (<p>), and line-break (<br>) tags to control the alignment and flow of text. Return characters, although quite useful for readability of the source document, typically are ignored by the browser; authors must use the <br> tag to explicitly force a line break. The <p> tag, while also causing a line break, carries with it meaning and effects beyond a simple return.
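As a minimal sketch of that distinction (the sample text is ours and purely illustrative), the following fragment contrasts return characters in the source, a forced line break, and a new paragraph:

```html
<body>
<!-- The return between these two source lines is treated as an ordinary space. -->
Roses are red,
violets are blue.
<br>
<!-- The <br> above forces the following text onto a new line. -->
Sugar is sweet.
<p>
<!-- The <p> starts a new paragraph, usually set off by extra vertical space. -->
And so are you.
</p>
</body>
```

Exact spacing will vary from browser to browser, as noted above.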
The <div> tag is a little different. Originally codified in the HTML 3.2 standard, <div> was included in the language as a simple organizational tool for dividing the document into discrete sections, but its somewhat obscure purpose meant that few authors used it. Later additions (alignment, styles, and the id attribute for document referencing and automation) now let you label and thereby define individual sections of your documents more distinctly, as well as control the alignment and appearance of those sections. These features breathe real life and meaning into the <div> tag.
By associating an id and a class name with the various sections of your document, each delimited by a <div id=name class=name> tag (you can do the same with other tags, such as <p>), you not only label those divisions for later reference by a hyperlink and for automated processing and management (collecting all the bibliography divisions, for instance), but you may also define different, distinct display styles for those portions of your document. For instance, you might define one divisional class for your document’s abstract (<div class=abstract>, for example), another for the body, a third for the conclusion, and a fourth for the bibliography (<div class=biblio>, for example).
Each class, then, might be given a different display definition in a document-level or externally related style sheet: for example, the abstract indented and in an italic typeface (such as div.abstract {margin-left: 0.5in; font-style: italic}), the body in a left-justified roman typeface, the conclusion similar to the abstract, and the bibliography automatically numbered and formatted appropriately.
We provide a detailed description of style sheets, classes, and their applications in Chapter 8.
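The class-per-section idea might be sketched as follows. This is only an illustration: the class names body and conclusion, the id values, and the placeholder content are our assumptions, and the automatic numbering of the bibliography is left out.

```html
<html>
<head>
<title>Sample report</title>
<style type="text/css">
  /* Abstract: indented and italic */
  div.abstract   { margin-left: 0.5in; font-style: italic }
  /* Body: left-justified roman (upright) typeface */
  div.body       { text-align: left; font-style: normal }
  /* Conclusion: styled like the abstract */
  div.conclusion { margin-left: 0.5in; font-style: italic }
</style>
</head>
<body>
<div id="abstract" class="abstract">Abstract text goes here.</div>
<div id="main" class="body">Body text goes here.</div>
<div id="conclusion" class="conclusion">Concluding text goes here.</div>
<div id="references" class="biblio">Bibliography entries go here.</div>
</body>
</html>
```

Because each division carries a class, changing the look of every abstract means editing one style rule rather than every document section.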
As defined in the HTML 4.01 and XHTML 1.0 standards, the <div> tag divides your document into separate, distinct sections. It may be used strictly as an organizational tool, without any sort of formatting associated with it, but it becomes more effective if you add the id and class attributes to label the divisions. The <div> tag may also be combined with the align attribute to control the alignment of whole sections of your document’s content in the display, and with the many programmatic “on” attributes for user interaction.
The align attribute for <div> positions the enclosed content at the left (default), center, or right of the display. In addition, you can specify justify to align both the left and right margins of the text. The <div> tag may be nested, and the alignment of the nested <div> tag takes precedence over that of the containing <div> tag. Further, other nested alignment tags, such as <center>, aligned paragraphs (see <p> in Section 4.1.2), or specially aligned table rows and cells override the effect of <div>. Like the align attribute for other tags, it is deprecated in the HTML and XHTML standards in deference to style sheet-based layout controls.
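A short sketch of these precedence rules, using sample text of our own:

```html
<div align="center">
  This line is centered by the outer division.
  <div align="right">
    The nested division's alignment takes precedence, so this line is right-aligned.
  </div>
  <p align="left">
    This paragraph's own alignment overrides the enclosing division.
  </p>
  This line is centered again.
</div>
```

Since align is deprecated, a style sheet rule such as div.centered {text-align: center} (the class name is hypothetical) is the preferred way to get the same effect.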
Supported only by Internet Explorer, the nowrap attribute suppresses automatic word wrapping of the text within the division. Line breaks will occur only where you have placed carriage returns in your source document.
While the nowrap attribute probably doesn’t make much sense for large sections of text that would otherwise be flowed together on the page, it can make things a bit easier when creating blocks of text with many explicit line breaks: poetry, for example, or addresses. You don’t have to insert all those explicit <br> tags in a text flow within a <div nowrap> tag. On the other hand, all other browsers ignore the nowrap attribute and merrily flow your text together anyway. If you are targeting only Internet Explorer with your documents, consider using nowrap where needed, but otherwise, we can’t recommend this attribute for general use.
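To make the trade-off concrete, here is a sketch with a made-up address. The first division relies on the Internet Explorer behavior described above; the second is the portable equivalent with explicit breaks.

```html
<!-- Internet Explorer only: depends on the nowrap behavior described above. -->
<div nowrap>
Jane Doe
123 Main Street
Anytown, USA
</div>

<!-- Portable alternative: explicit line breaks are honored by every browser. -->
<div>
Jane Doe<br>
123 Main Street<br>
Anytown, USA
</div>
```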
The dir attribute lets you advise the browser which direction the text should be displayed in, and the lang attribute lets you specify the language used within the division. [Section 3.6.1.1] [Section 3.6.1.2]
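For instance, assuming one division in French and one in Hebrew (the sample phrases are ours), the two attributes might be combined like this:

```html
<!-- French text, displayed left to right -->
<div lang="fr" dir="ltr">
  Bonjour tout le monde
</div>

<!-- Hebrew text, displayed right to left -->
<div lang="he" dir="rtl">
  שלום עולם
</div>
```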
Use the id attribute to label the document division specially for later reference by a hyperlink, style sheet, applet, or other automated process. An acceptable id value is any quote-enclosed string that uniquely identifies the division and that later can be used to reference that document section unambiguously. Although we’re introducing it within the context of the <div> tag, this attribute can be used with almost any tag.
When used as an element label, the value of the id attribute can be added to a URL to address the labelled element uniquely within the document. You can label both large portions of content (via a tag like <div>) and small snippets of text (using a tag like <i> or <span>). For example, you might label the abstract of a technical report using <div id="abstract">. A URL could jump right to that abstract by referencing report.html#abstract. When used in this manner, the value of the id attribute must be unique with respect to all other id attributes within the document and all the names defined by any <a> tags with the name attribute. [Section 6.3.3]
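A sketch of both sides of that arrangement follows; the link text and the second id value are illustrative assumptions.

```html
<!-- In another document: jump straight to the labelled division. -->
<a href="report.html#abstract">Read the abstract</a>

<!-- Inside report.html itself, the fragment alone is enough. -->
<a href="#abstract">Back to the abstract</a>

<!-- The labelled elements: one large division and one small snippet. -->
<div id="abstract">
  This report describes <span id="key-term">document divisions</span>.
</div>
```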
When used as a style-sheet selector, the value of the id attribute is the name of a style rule that can be associated with the current tag. This provides a second set of definable style rules, similar to the various style classes you can create. A single tag can carry both the class and id attributes, so that two different rules apply to it. In this usage, the name associated with the id attribute must be unique with respect to all other style IDs within the current document. A more complete description of style classes and IDs can be found in Chapter 8.
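As a rough illustration, the fragment below applies one class rule and one id rule to the same division. The id value summary and both sets of style properties are assumptions made for the example.

```html
<style type="text/css">
  /* Class rule: may be shared by any number of divisions */
  div.abstract { font-style: italic }
  /* ID rule: applies only to the one element labelled "summary" */
  #summary     { margin-left: 0.5in }
</style>

<!-- Both rules apply here: the text is italic and indented. -->
<div class="abstract" id="summary">
  A brief summary of the report.
</div>
```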
Use the optional title attribute and quote-enclosed string value to associate a descriptive phrase with the division. Like the id attribute, the title attribute can be used with almost any tag and behaves similarly for all tags.
There is no defined usage for the value of the title attribute, and many browsers simply ignore it. Internet Explorer, however, will display the title associated with any element when the mouse pauses over that element. Used this way, the title attribute can provide spot help for the various elements within your document.
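For example, spot help of this kind might look like the following, where the title strings are our own invention:

```html
<div title="A one-paragraph summary of the report's findings">
  The abstract goes here.
</div>

<!-- title works on almost any tag, including small inline ones -->
<p>
  Prices are quoted in <span title="United States dollars">USD</span>.
</p>
```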
Use the style attribute with the <div> tag to create an inline style for the content enclosed by the tag. The class attribute lets you apply the style of a predefined class of the <div> tag to the contents of this division. The value of the class attribute is the name of a style defined in some document-level or externally defined style sheet. In addition, class-identified divisions lend themselves well to computer processing of your documents; for example, extracting all divisions with the class name “biblio,” for the automated assembly of a master bibliography. [Section 8.1.1] [Section 8.3]
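The two approaches can be compared side by side in a small sketch; the particular style properties are arbitrary choices for illustration.

```html
<style type="text/css">
  /* Predefined class, referenced by name from any division */
  div.biblio { font-size: smaller; margin-left: 1in }
</style>

<!-- Inline style: affects this one division only -->
<div style="color: navy; margin-left: 0.5in">
  A one-off note set in navy and indented.
</div>

<!-- Class style: shared by every division of class "biblio" -->
<div class="biblio">
  Bibliography entries go here.
</div>
```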
The many user-related events that may happen in and around a division, such as when a user clicks or double-clicks the mouse within its display space, are recognized by the browser if it conforms to the current HTML or XHTML standard. With the respective “on” attribute and value, you may react to those events by displaying a user dialog box or activating some multimedia event. [Section 12.3.3]
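A minimal sketch of two such event attributes follows. It assumes a browser with JavaScript enabled; the built-in alert function displays a simple dialog box, and the messages are our own.

```html
<div onclick="alert('You clicked the abstract.')"
     ondblclick="alert('You double-clicked the abstract.')">
  Click or double-click anywhere in this division.
</div>
```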
The <p> tag signals the start of a paragraph. That’s not well known even by some veteran webmasters, because it runs counter to what we’ve come to expect from experience. Most word processors we’re familiar with use just one special character, typically the return character, to signal the end of a paragraph. In HTML and XHTML, each paragraph should start with <p> and end with the corresponding </p> tag. And while each return in a sequence of them creates an empty paragraph in a word processor, browsers typically ignore all but the first in a sequence of consecutive paragraph tags.
In practice, with HTML you can omit the starting <p> tag at the beginning of the first paragraph and the </p> tag at the end of each paragraph: they can be implied from other tags that occur in the document and hence safely left out. For example:
<body>
This is the first paragraph, at the very beginning of the
body of this document.
<p>
The tag above signals the start of this second paragraph.
When rendered by a browser, it will begin slightly below the
end of the first paragraph, with a bit of extra whitespace
between the two paragraphs.
<p>
This is the last paragraph in the example.
</body>
Notice that we haven’t included the paragraph start tag (<p>) for the first paragraph or any end paragraph tags; they can be unambiguously inferred by the browser and are therefore unnecessary.
In general, you’ll find that human document authors tend to omit implied tags whenever possible, while automatic document generators tend to insert them. That may be because the software designers didn’t want to run the risk of having their products chided by competitors as not adhering to the HTML standard, even though we’re splitting letter-of-the-law hairs here. Go ahead and be defiant: omit that first paragraph’s <p> tag and don’t give a second thought to paragraph-ending </p> tags, provided, of course, that your document’s structure and clarity are not compromised, and that you are aware that XHTML, which requires every element to be explicitly closed, frowns severely on such laxity.
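For comparison, here is how the body of the earlier example might be written with every paragraph tag spelled out, as XHTML expects. The text is unchanged; only the tags are added.

```html
<body>
<p>
This is the first paragraph, at the very beginning of the
body of this document.
</p>
<p>
The tag above signals the start of this second paragraph.
When rendered by a browser, it will begin slightly below the
end of the first paragraph, with a bit of extra whitespace
between the two paragraphs.
</p>
<p>
This is the last paragraph in the example.
</p>
</body>
```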
When encountering a new paragraph (<p>) tag, the browser typically inserts one blank line plus some extra vertical space into the display before starting the new paragraph. The browser then collects all the words and, if present, inline images into the new paragraph, ignoring leading and trailing spaces (not spaces between words, of course) and return characters in the source text. The browser software then flows the resulting sequence of words and images into a paragraph that fits within the margins of its display window, automatically generating line breaks as needed to wrap the text within the window. For example, compare how a browser arranges the text into lines and paragraphs (Figure 4-1) to how the preceding example is printed on the page. The browser may also automatically hyphenate long words, and the paragraph may be full-justified to stretch the line of words out toward both margins.
The net result is that you do not have to worry about line length, word wrap, and line breaks when composing your documents. The browser will take any arbitrary sequence of words and images and display a nicely formatted paragraph.
If you want to control line length and breaks explicitly, consider using a preformatted text block with the <pre> tag. If you need to force a line break, use the <br> tag. [<pre>] [Section 4.6.1]
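A brief sketch of both techniques, with sample content of our own:

```html
<!-- Preformatted block: spaces and returns are kept exactly as typed. -->
<pre>
Item      Qty
Apples      3
Pears      12
</pre>

<!-- Ordinary paragraph with one forced line break. -->
<p>
This text ends here,<br>
and this text starts on a new line.
</p>
```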
Most browsers automatically left-justify a new paragraph. To change this behavior, HTML 4 and XHTML give you the align attribute for the <p> tag and provide four kinds of content justification: left, right, center, and justify.
Figure 4-2 shows you the effect of various alignments as rendered from the following source:
<p align=right>
Right over here!
<br>
This is too.
<p align=left>
Slide back left.
<p align=center>
Smack in the middle.
</p>
Left is the default.
Notice in the HTML example that the paragraph alignment remains in effect until the browser encounters another <p> tag or an ending </p> tag. We deliberately left out a final <p> tag in the example to illustrate the effects of the </p> end tag on paragraph justification. Other body elements (including forms, headers, tables, and most other body content-related tags) may also disrupt the current paragraph alignment and cause subsequent paragraphs to revert to the default left alignment.
Note that the align attribute is deprecated in HTML 4 and XHTML, in deference to style sheet-based alignments.
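A style-sheet version of the earlier alignment example might look like the sketch below. It uses the CSS text-align property through inline style attributes; the last paragraph is our addition, included to show the justify value.

```html
<p style="text-align: right">
Right over here!
<br>
This is too.
</p>
<p style="text-align: left">
Slide back left.
</p>
<p style="text-align: center">
Smack in the middle.
</p>
<p style="text-align: justify">
Longer text is stretched so that both margins line up.
</p>
```

In a real document, the same rules would more likely live in a document-level or external style sheet than in individual style attributes.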
The dir attribute lets you advise the browser which direction the text within the paragraph should be displayed in, and the lang attribute lets you specify the language used within that paragraph. The dir and lang attributes are supported by the popular browsers, even though there are no behaviors defined for any specific language. [Section 3.6.1.1] [Section 3.6.1.2]
Use the id attribute to create a label for the paragraph that can later be used to unambiguously reference that paragraph in a hyperlink target, for automated searches, as a style-sheet selector, and with a host of other applications. [Section 4.1.1.4]
Use the optional title attribute and quote-enclosed string value to provide a descriptive phrase for the paragraph. [Section 4.1.1.4]
Use the style attribute with the <p> tag to create an inline style for the paragraph’s contents. The class attribute lets you label the paragraph with a name that refers to a predefined class of the <p> tag previously declared in some document-level or externally defined style sheet. Class-identified paragraphs lend themselves well to computer processing of your documents; for example, extracting all paragraphs whose class name is “citation,” for automated assembly of a master list of citations. [Section 8.1.1] [Section 8.3]
As with divisions, there are many user-initiated events, such as when a user clicks or double-clicks within a tag’s display space, that are recognized by the browser if it conforms to the current HTML or XHTML standard. With the respective “on” attribute and value, you may react to those events by displaying a user dialog box or activating some multimedia event. [Section 12.3.3]
A paragraph may contain any element allowed in a text flow, including conventional words and punctuation, links (<a>), images (<img>), line breaks (<br>), font changes (<b>, <i>, <tt>, <u>, <strike>, <big>, <small>, <sup>, <sub>, and <font>), and content-based style changes (<acronym>, <cite>, <code>, <dfn>, <em>, <kbd>, <samp>, <strong>, and <var>). If any other element occurs within the paragraph, it implies that the paragraph has ended, and the browser assumes that the closing </p> tag was not specified.
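As an illustration of that implied ending (the content and link target are ours), the list below is not allowed inside a paragraph, so its start tag closes the paragraph for you:

```html
<p>
This paragraph contains <em>emphasized text</em>, a
<a href="report.html">link</a>, and a forced<br>line break.
<!-- No closing paragraph tag here: the list start tag below implies it. -->
<ul>
  <li>This list item is outside the paragraph above.</li>
</ul>
```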
You may specify a paragraph only within a block, along with other paragraphs, lists, forms, and preformatted text. In general, this means that paragraphs can appear where a flow of text is appropriate, such as in the body of a document, in an element in a list, and so on. Technically, paragraphs cannot appear within a header, anchor, or other element whose content is strictly text-only. In practice, most browsers ignore this restriction and format the paragraph as a part of the containing element.