BUY THIS BOOK
Add to Cart

Print Book $49.99


Add to Cart

Print+PDF $64.99

Add to Cart

PDF $39.99

Safari Books Online

What is this?

Add to UK Cart

Print Book £35.50

What is this?

Looking to Reprint or License this content?


DNS and BIND
DNS and BIND, Fifth Edition By Paul Albitz, Cricket Liu
May 2006
Pages: 640

Cover | Table of Contents


Table of Contents

Chapter 1: Background
The White Rabbit put on his spectacles. "Where shall I begin, please your Majesty?" he asked.
"Begin at the beginning," the King said, very gravely, "and go on till you come to the end: then stop."
It's important to know a little ARPAnet history to understand the Domain Name System (DNS). DNS was developed to address particular problems on the ARPAnet, and the Internet—a descendant of the ARPAnet—is still its main user.
If you've been using the Internet for years, you can probably skip this chapter. If you haven't, we hope it'll give you enough background to understand what motivated the development of DNS.
In the late 1960s, the U.S. Department of Defense's Advanced Research Projects Agency, ARPA (later DARPA), began funding the ARPAnet, an experimental wide area computer network that connected important research organizations in the United States. The original goal of the ARPAnet was to allow government contractors to share expensive or scarce computing resources. From the beginning, however, users of the ARPAnet also used the network for collaboration. This collaboration ranged from sharing files and software and exchanging electronic mail—now commonplace—to joint development and research using shared remote computers.
The Transmission Control Protocol/Internet Protocol (TCP/IP) protocol suite was developed in the early 1980s and quickly became the standard host-networking protocol on the ARPAnet. The inclusion of the protocol suite in the University of California at Berkeley's popular BSD Unix operating system was instrumental in democratizing internetworking. BSD Unix was virtually free to universities. This meant that internetworking—and ARPAnet connectivity—were suddenly available cheaply to many more organizations than were previously attached to the ARPAnet. Many of the computers being connected to the ARPAnet were being connected to local area networks (LANs), too, and very shortly the other computers on the LANs were communicating via the ARPAnet as well.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
A (Very) Brief History of the Internet
In the late 1960s, the U.S. Department of Defense's Advanced Research Projects Agency, ARPA (later DARPA), began funding the ARPAnet, an experimental wide area computer network that connected important research organizations in the United States. The original goal of the ARPAnet was to allow government contractors to share expensive or scarce computing resources. From the beginning, however, users of the ARPAnet also used the network for collaboration. This collaboration ranged from sharing files and software and exchanging electronic mail—now commonplace—to joint development and research using shared remote computers.
The Transmission Control Protocol/Internet Protocol (TCP/IP) protocol suite was developed in the early 1980s and quickly became the standard host-networking protocol on the ARPAnet. The inclusion of the protocol suite in the University of California at Berkeley's popular BSD Unix operating system was instrumental in democratizing internetworking. BSD Unix was virtually free to universities. This meant that internetworking—and ARPAnet connectivity—were suddenly available cheaply to many more organizations than were previously attached to the ARPAnet. Many of the computers being connected to the ARPAnet were being connected to local area networks (LANs), too, and very shortly the other computers on the LANs were communicating via the ARPAnet as well.
The network grew from a handful of hosts to tens of thousands of hosts. The original ARPAnet became the backbone of a confederation of local and regional networks based on TCP/IP, called the Internet.
In 1988, however, DARPA decided the experiment was over. The Department of Defense began dismantling the ARPAnet. Another network, the NSFNET, funded by the National Science Foundation, replaced the ARPAnet as the backbone of the Internet.
In the spring of 1995, the Internet made a transition from using the publicly funded NSFNET as a backbone to using multiple commercial backbones, run by telecommunications companies such as SBC and Sprint, and long-time commercial internetworking players such as MFS and UUNET.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
On the Internet and Internets
A word on "the Internet," and on "internets" in general, is in order. In print, the difference between the two seems slight: one is always capitalized, one isn't. The distinction between their meanings, however, is significant. The Internet, with a capital "I," refers to the network that began its life as the ARPAnet and continues today as, roughly, the confederation of all TCP/IP networks directly or indirectly connected to commercial U.S. backbones. Seen up close, it's actually quite a few different networks—commercial TCP/IP backbones, corporate and U.S. government TCP/IP networks, and TCP/IP networks in other countries—interconnected by high-speed digital circuits. A lowercase internet, on the other hand, is simply any network made up of multiple smaller networks using the same internetworking protocols. An internet (little "i") isn't necessarily connected to the Internet (big "I"), nor does it necessarily use TCP/IP as its internetworking protocol. There are isolated corporate internets, for example.
An intranet, with a little i, is really just a TCP/IP-based internet, used to emphasize the use of technologies developed and introduced on the Internet on a company's internal corporate network. An extranet, on the other hand, is a TCP/IP-based internet that connects partner companies, or a company to its distributors, suppliers, and customers.
Through the 1970s, the ARPAnet was a small, friendly community of a few hundred hosts. A single file, HOSTS.TXT, contained a name-to-address mapping for every host connected to the ARPAnet. The familiar Unix host table, /etc/hosts, was compiled from HOSTS.TXT (mostly by deleting fields Unix didn't use).
HOSTS.TXT was maintained by SRI's Network Information Center (dubbed "the NIC") and distributed from a single host, SRI-NIC. ARPAnet administrators typically emailed their changes to the NIC, and periodically FTP'ed to SRI-NIC and grabbed the current HOSTS.TXT file. Their changes were compiled into a new
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The Domain Name System, in a Nutshell
The Domain Name System is a distributed database. This structure allows local control of the segments of the overall database, yet data in each segment is available across the entire network through a client/server scheme. Robustness and adequate performance are achieved through replication and caching.
Programs called nameservers constitute the server half of DNS's client/server mechanism. Nameservers contain information about some segments of the database and make that information available to clients, called resolvers . Resolvers are often just library routines that create queries and send them across a network to a nameserver.
The structure of the DNS database, shown in Figure 1-1, is similar to the structure of the Unix filesystem. The whole database (or filesystem) is pictured as an inverted tree, with the root node at the top. Each node in the tree has a text label, which identifies the node relative to its parent. This is roughly analogous to a "relative pathname" in a filesystem, like bin. One label—the null label, or " "—is reserved for the root node. In text, the root node is written as a single dot (.). In the Unix filesystem, the root is written as a slash (/).
Figure 1-1: The DNS database versus a Unix filesystem
Each node is also the root of a new subtree of the overall tree. Each of these subtrees represents a partition of the overall database—a directory in the Unix filesystem, or a domain in the Domain Name System. Each domain or directory can be further divided into additional partitions, called subdomains in DNS, like a filesystem's subdirectories. Subdomains, like subdirectories, are drawn as children of their parent domains.
Every domain has a unique name, like every directory. A domain's
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The History of BIND
The first implementation of the Domain Name System was called JEEVES, written by Paul Mockapetris himself. A later implementation was BIND, an acronym for Berkeley Internet Name Domain, written by Kevin Dunlap for Berkeley's 4.3 BSD Unix. BIND is now maintained by the Internet Systems Consortium.
BIND is the implementation we'll concentrate on in this book and is by far the most popular implementation of DNS today. It has been ported to most flavors of Unix and is shipped as a standard part of most vendors' Unix offerings. BIND has even been ported to Microsoft's Windows NT, Windows 2000, and Windows Server 2003.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Must I Use DNS?
Despite the usefulness of the Domain Name System, there are some situations in which it doesn't pay to use it. There are other name-resolution mechanisms besides DNS, some of which may be a standard part of your operating system. Sometimes the overhead involved in managing zones and their nameservers outweighs the benefits. On the other hand, there are circumstances in which you have no other choice but to set up and manage nameservers. Here are some guidelines to help you make that decision:
If you're connected to the Internet . . .
. . . DNS is a must. Think of DNS as the lingua franca of the Internet: nearly all of the Internet's network services use DNS. That includes the Web, electronic mail, remote terminal access, and file transfer.
On the other hand, this doesn't necessarily mean that you have to set up and run zones by yourself for yourself. If you've only got a handful of hosts, you may be able to join an existing zone (see Chapter 3) or find someone else to host your zones for you. If you pay an Internet service provider for your Internet connectivity, ask if it'll host your zone for you, too. Even if you aren't already a customer, there are companies that will help out, for a price.
If you have a little more than a handful of hosts, or a lot more, you'll probably want your own zone. And if you want direct control over your zone and your nameservers, you'll want to manage it yourself. Read on!
If you have your own TCP/IP-based internet . . .
. . . you probably want DNS. By an internet, we don't mean just a single Ethernet of workstations using TCP/IP (see the next section if you thought that was what we meant); we mean a fairly complex "network of networks." Maybe you have several dozen Ethernet segments connected via routers, for example.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 2: How Does DNS Work?
" . . . and what is the use of a book," thought Alice, "without pictures or conversations?"
The Domain Name System is basically a database of host information. Admittedly, you get a lot with that: funny dotted names, networked nameservers, a shadowy "namespace." But keep in mind that, in the end, the service DNS provides is information about internet hosts.
We've already covered some important aspects of DNS, including its client/server architecture and the structure of the DNS database. However, we haven't gone into much detail, and we haven't explained the nuts and bolts of DNS's operation.
In this chapter, we'll explain and illustrate the mechanisms that make DNS work. We'll also introduce the terms you'll need to know to read the rest of the book (and to converse intelligently with your fellow zone administrators).
First, though, let's take a more detailed look at the concepts introduced in the previous chapter. We'll try to add enough detail to spice it up a little.
DNS's distributed database is indexed by domain names. Each domain name is essentially just a path in a large inverted tree, called the domain namespace. The tree's hierarchical structure, shown in Figure 2-1, is similar to the structure of the Unix filesystem. The tree has a single root at the top. In the Unix filesystem, this is called the root directory and is represented by a slash (/). DNS simply calls it "the root." Like a filesystem, DNS's tree can branch any number of ways at each intersection point, or node. The depth of the tree is limited to 127 levels (a limit you're not likely to reach).
Figure 2-1: The structure of the DNS namespace
Each node in the tree has a text label (without dots) that can be up to 63 characters long. A null (zero-length) label is reserved for the root. The full domain name of any node in the tree is the sequence of labels on the path from that node to the root. Domain names are always read from the node toward the root ("up" the tree), with dots separating the names in the path.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The Domain Namespace
DNS's distributed database is indexed by domain names. Each domain name is essentially just a path in a large inverted tree, called the domain namespace. The tree's hierarchical structure, shown in Figure 2-1, is similar to the structure of the Unix filesystem. The tree has a single root at the top. In the Unix filesystem, this is called the root directory and is represented by a slash (/). DNS simply calls it "the root." Like a filesystem, DNS's tree can branch any number of ways at each intersection point, or node. The depth of the tree is limited to 127 levels (a limit you're not likely to reach).
Figure 2-1: The structure of the DNS namespace
Each node in the tree has a text label (without dots) that can be up to 63 characters long. A null (zero-length) label is reserved for the root. The full domain name of any node in the tree is the sequence of labels on the path from that node to the root. Domain names are always read from the node toward the root ("up" the tree), with dots separating the names in the path.
If the root node's label actually appears in a node's domain name, the name looks as though it ends in a dot, as in "www.oreilly.com." (It actually ends with a dot—the separator—and the root's null label.) When the root node's label appears by itself, it is written as a single dot, ".", for convenience. Consequently, some software interprets a trailing dot in a domain name to indicate that the domain name is absolute. An absolute domain name is written relative to the root and unambiguously specifies a node's location in the hierarchy. An absolute domain name is also referred to as a fully qualified domain name, often abbreviated FQDN. Names without trailing dots are sometimes interpreted as relative to some domain name other than the root, just as directory names without a leading slash are often interpreted as relative to the current directory.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
The Internet Domain Namespace
So far, we've talked about the theoretical structure of the domain namespace and what sort of data is stored in it, and we've even hinted at the types of names you might find in it with our (sometimes fictional) examples. But this won't help you decode the domain names you see on a daily basis on the Internet.
The Domain Name System doesn't impose many rules on the labels in domain names, and it doesn't attach any particular meaning to the labels at a given level of the namespace. When you manage a part of the domain namespace, you can decide on your own semantics for your domain names. Heck, you could name your subdomains A through Z, and no one would stop you (though they might strongly recommend against it).
The existing Internet domain namespace, however, has some self-imposed structure to it. Especially in the upper-level domains, the domain names follow certain traditions (not rules, really, because they can be and have been broken). These traditions help to keep domain names from appearing totally chaotic. Understanding these traditions is an enormous asset if you're trying to decipher a domain name.
The original top-level domains divided the Internet domain namespace organizationally into seven domains:
com
Commercial organizations, such as Hewlett-Packard (hp.com), Sun Microsystems (sun.com), and IBM (ibm.com).
edu
Educational organizations, such as U.C. Berkeley (berkeley.edu) and Purdue University (purdue.edu).
gov
Government organizations, such as NASA (nasa.gov) and the National Science Foundation (
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Delegation
Remember that one of the main goals of the design of the Domain Name System was to decentralize administration? This is achieved through delegation. Delegating domains works a lot like delegating tasks at work. A manager may break up a large project into smaller tasks and delegate responsibility for each of these tasks to different employees.
Likewise, an organization administering a domain can divide it into subdomains. Each subdomain can be delegated to other organizations, which means that an organization becomes responsible for maintaining all the data in that subdomain. It can freely change the data and even divide its subdomain into more subdomains and delegate those. The parent domain retains only pointers to sources of the subdomain's data, so that it can refer queriers there. The domain stanford.edu, for example, is delegated to the folks at Stanford who run the university's networks (Figure 2-7).
Figure 2-7: stanford.edu is delegated to Stanford University
Not all organizations delegate away their whole domain, just as not all managers delegate all their work. A domain may have several delegated subdomains and contain hosts that don't belong in the subdomains. For example, the Acme Corporation (it supplies a certain coyote with most of his gadgets), which has a division in Rockaway and its headquarters in Kalamazoo, might have a rockaway.acme.com subdomain and a kalamazoo.acme.com subdomain. However, the few hosts in the Acme sales offices scattered throughout the United States would fit better under acme.com than under either subdomain.
We'll explain how to create and delegate subdomains later. For now, it's important only that you understand that the term delegation refers to assigning responsibility for a subdomain to another organization.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Nameservers and Zones
The programs that store information about the domain namespace are called nameservers. Nameservers generally have complete information about some part of the domain namespace, called a zone, which they load from a file or from another nameserver. The nameserver is then said to have authority for that zone. Nameservers can be authoritative for multiple zones, too.
The difference between a zone and a domain is important, but subtle. All top-level domains and many domains at the second level and lower, such as berkeley.edu and hp.com, are broken into smaller, more manageable units by delegation. These units are called zones. The edu domain, shown in Figure 2-8, is divided into many zones, including the berkeley.edu zone, the purdue.edu zone, and the nwu.edu zone. At the top of the domain, there's also an edu zone. It's natural that the folks who run edu would break up the edu domain: otherwise, they'd have to manage the berkeley.edu subdomain themselves. It makes much more sense to delegate berkeley.edu to Berkeley. What's left for the folks who run edu? The edu zone, which contains mostly delegation information for the subdomains of edu.
Figure 2-8: The edu domain broken into zones
The berkeley.edu subdomain is, in turn, broken up into multiple zones by delegation, as shown in Figure 2-9. There are delegated subdomains called cc, cs, ce, me, and more. Each subdomain is delegated to a set of nameservers, some of which are also authoritative for berkeley.edu. However, the zones are still separate and may have totally different groups of authoritative nameservers.
Figure 2-9: The berkeley.edu domain broken into zones
A zone contains all the domain names the domain with the same domain name contains, except for domain names in delegated subdomains. For example, the top-level domain
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Resolvers
Resolvers are the clients that access nameservers. Programs running on a host that need information from the domain namespace use the resolver. The resolver handles:
  • Querying a nameserver
  • Interpreting responses (which may be resource records or an error)
  • Returning the information to the programs that requested it
In BIND, the resolver is a set of library routines that is linked to programs such as ssh and ftp. It's not even a separate process. The resolver relies almost entirely on the nameservers it queries: it has the smarts to put together a query, to send it and wait for an answer, and to resend the query if it isn't answered, but that's about all. Most of the burden of finding an answer to the query is placed on the nameserver. The DNS specs call this kind of resolver a stub resolver.
Other implementations of DNS have had smarter resolvers that could do more sophisticated things that had more advanced capabilities, such as following referrals to locate the nameservers authoritative for a particular zone.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Resolution
Nameservers are adept at retrieving data from the domain namespace. They have to be, given the limited intelligence of most resolvers. Not only can they give you data about zones for which they're authoritative, they can also search through the domain namespace to find data for which they're not authoritative. This process is called name resolution, or simply resolution.
Because the namespace is structured as an inverted tree, a nameserver needs only one piece of information to find its way to any point in the tree: the domain names and addresses of the root nameservers (is that more than one piece?). A nameserver can issue a query to a root nameserver for any domain name in the domain namespace, and the root nameserver will start the nameserver on its way.
The root nameservers know where the authoritative nameservers for each of the top-level zones are. (In fact, some of the root nameservers are authoritative for some of the generic top-level zones.) Given a query about any domain name, the root nameservers can at least provide the names and addresses of the nameservers that are authoritative for the top-level zone the domain name ends in. In turn, the top-level nameservers can provide the list of authoritative nameservers for the second-level zone that the domain name ends in. Each nameserver queried either gives the querier information about how to get "closer" to the answer it's seeking or provides the answer itself. The root nameservers are clearly important to resolution. Because they're so important, DNS provides mechanisms—such as caching, which we'll discuss a little later—to help offload the root nameservers. But in the absence of other information, resolution has to start at the root nameservers. This makes the root nameservers crucial to the operation of DNS; if all the Internet root nameservers were unreachable for an extended period, all resolution on the Internet would fail. To protect against this, the Internet has 13 root nameservers (as of this writing) spread across different parts of the network. One is on PSINet, a commercial Internet backbone; one is on the NASA Science Internet; two are in Europe; and one is in Japan.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Caching
The whole resolution process may seem awfully convoluted and cumbersome to someone accustomed to simple searches through the host table. Actually, though, it's usually quite fast. One of the features that speeds it up considerably is caching.
A nameserver processing a recursive query may have to send out quite a few queries to find an answer. However, it discovers a lot of information about the domain namespace as it does so. Each time it's referred to another list of nameservers, it learns that those nameservers are authoritative for some zone, and it learns the addresses of those servers. At the end of the resolution process, when it finally finds the data the original querier sought, it can store that data for future reference, too. The BIND nameserver even implements negative caching: if a nameserver responds to a query with an answer that says the domain name or data type in the query doesn't exist, the local nameserver will also temporarily cache that information.
Nameservers cache all this data to help speed up successive queries. The next time a resolver queries the nameserver for data about a domain name the nameserver knows something about, the process is shortened quite a bit. The nameserver may have cached the answer, positive or negative, in which case it simply returns the answer to the resolver. Even if it doesn't have the answer cached, it may have learned the identities of the nameservers that are authoritative for the zone the domain name is in and be able to query them directly.
For example, say our nameserver has already looked up the address of eecs.berkeley.edu. In the process, it cached the names and addresses of the eecs.berkeley.edu and berkeley.edu nameservers (plus eecs.berkeley.edu's IP address). Now if a resolver were to query our nameserver for the address of baobab.cs.berkeley.edu, our nameserver could skip querying the root nameservers. Recognizing that berkeley.edu is the closest ancestor of baobab.cs.berkeley.edu that it knows about, our nameserver would start by querying a
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 3: Where Do I Start?
"What do you call yourself?" the Fawn said at last. Such a soft sweet voice it had!
"I wish I knew!" thought poor Alice. She answered, rather sadly, "Nothing, just now."
"Think again," it said: "that won't do."
Alice thought, but nothing came of it. "Please, would you tell me what you call yourself?" she said timidly. "I think that might help a little."
"I'll tell you, if you come a little further on," the Fawn said. "I can't remember here."
Now that you understand the theory behind the Domain Name System, we can attend to more practical matters. Before you set up your zones, you may need to get the BIND software. Usually, it's standard in most Unix-based operating systems. Often, though, you'll want to seek out a more recent version with all the latest functionality and security enhancements.
Once you have BIND, you need to decide on a domain name for your main zone; this may not be quite as easy as it sounds because it entails finding an appropriate place in the Internet namespace. With that decided, you need to contact the administrators of the parent of the zone whose domain name you've chosen.
One thing at a time, though. Let's talk about where to get BIND.
If you plan to set up your own zones and run nameservers for them, you first need the BIND software. Even if you plan to have someone else run the nameservers for your zones, it's helpful to have the software around. For example, you can use a local nameserver to test your zone datafiles before giving them to your remote nameserver administrator.
Most commercial Unix vendors ship BIND with the rest of their standard TCP/IP networking software. And the networking software is usually included with the operating system, so you get BIND free. Even if the networking software is priced separately, you've probably already bought it, since you clearly do enough networking to need DNS, right?
If you don't have a version of BIND for your flavor of Unix, though, or if you want the latest, greatest version, you can always get the source code. As luck would have it, it's freely distributed. The source code for the most up-to-date versions of BIND as of this writing (the BIND 8.4.7 and 9.3.2 releases) is available via anonymous FTP from the Internet Software Consortium's web site,
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Getting BIND
If you plan to set up your own zones and run nameservers for them, you first need the BIND software. Even if you plan to have someone else run the nameservers for your zones, it's helpful to have the software around. For example, you can use a local nameserver to test your zone datafiles before giving them to your remote nameserver administrator.
Most commercial Unix vendors ship BIND with the rest of their standard TCP/IP networking software. And the networking software is usually included with the operating system, so you get BIND free. Even if the networking software is priced separately, you've probably already bought it, since you clearly do enough networking to need DNS, right?
If you don't have a version of BIND for your flavor of Unix, though, or if you want the latest, greatest version, you can always get the source code. As luck would have it, it's freely distributed. The source code for the most up-to-date versions of BIND as of this writing (the BIND 8.4.7 and 9.3.2 releases) is available via anonymous FTP from the Internet Software Consortium's web site, ftp.isc.org, in /isc/bind/src/8.4.7/bind-src.tar.gz and /isc/bind9/9.3.2/bind-9.3.2.tar.gz, respectively. Compiling these releases on most common Unix platforms is relatively straightforward. The ISC includes a list of Unix-ish operating systems that BIND is known to compile on in the file src/INSTALL (for BIND 8) and README (for BIND 9), including several versions of Linux, Unix, and even Windows. There's also a list of other Unix-ish and not-so-Unix-ish (MPE, anyone?) operating systems BIND has supported in the past, most recent versions of BIND will probably compile on these systems without much effort. Regardless of which category your operating system falls into, we strongly recommend reading all of the sections of the file that are relevant to your OS. In Appendix C, we also include instructions for compiling BIND 8.4.7 and 9.3.2 on Linux; it's a remarkably short appendix.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Choosing a Domain Name
Choosing a domain name is more involved than it may sound because it entails both choosing a name and finding out who runs the parent zone. In other words, you need to find out where you fit in the Internet domain namespace, then find out who runs that particular corner of the namespace.
The first step in picking a domain name is finding where in the existing domain namespace you belong. It's easiest to start at the top and work your way down: decide which top-level domain you belong in, then which of that top-level domain's subdomains you fit into.
Note that in order to find out what the Internet domain namespace looks like (beyond what we've already told you), you'll need access to the Internet. You don't need access to a host that already has name service configured, but it would help a little. If you don't have access to a host with DNS configured, you'll have to "borrow" name service from other nameservers (as in our previous ftp.isc.org example) to get you going.
Before we go any further, we need to define a few terms: registry, registrar, and registration. These terms aren't defined anywhere in the DNS specs. Instead, they apply to the way the Internet namespace is managed today.
A registry is an organization responsible for maintaining a top-level domain's (well, zone's, really) datafiles, which contain the delegation to each subdomain of that top-level domain. Under the current structure of the Internet, a given top-level domain can have no more than one registry. A registrar acts as an interface between customers and registries, providing registration and value-added services. Once a customer has chosen a subdomain of a top-level zone, the customer's registrar submits to the appropriate registry the zone data necessary to delegate that subdomain to the nameservers the customer specified. The registries act, more or less, as wholesalers of delegation in their top-level zone. The registrars then act as retailers, usually reselling delegation in more than one registry.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Chapter 4: Setting Up BIND
"It seems very pretty," she said when she had finished it, "but it's rather hard to understand!" (You see she didn't like to confess, even to herself, that she couldn't make it out at all.) "Somehow it seems to fill my head with ideas—only I don't exactly know what they are!"
If you have been diligently reading each chapter of this book, you're probably anxious to get a nameserver running. This chapter is for you. Let's set up a couple of nameservers. Others of you may have read the table of contents and skipped directly to this chapter. (Shame on you!) If you are one of those people, be aware that we may use concepts from earlier chapters and expect you to understand them already.
There are several factors that influence how you should set up your nameservers. The biggest is what sort of access you have to the Internet: complete access (e.g., you can FTP to ftp.rs.internic.net), limited access (restricted by a security firewall), or no access at all. This chapter assumes you have complete access. We'll discuss the other cases in Chapter 11.
In this chapter, we set up two nameservers for a few fictitious zones as an example for you to follow in setting up your own zones. We cover the topics in this chapter in enough detail to get your first two nameservers running. Subsequent chapters fill in the holes and go into greater depth. If you already have your nameservers running, skim through this chapter to familiarize yourself with the terms we use or just to verify that you didn't miss something when you set up your servers.
Our fictitious zone serves a college. Movie University studies all aspects of the film industry and researches novel ways to (legally) distribute films. One of our most promising projects involves research into using IP as a film distribution medium. After visiting our registrar's web site, we have decided on the domain name movie.edu. A recent grant has enabled us to connect to the Internet.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Our Zone
Our fictitious zone serves a college. Movie University studies all aspects of the film industry and researches novel ways to (legally) distribute films. One of our most promising projects involves research into using IP as a film distribution medium. After visiting our registrar's web site, we have decided on the domain name movie.edu. A recent grant has enabled us to connect to the Internet.
Movie U. currently has two Ethernets, and we have plans to add another network or two. The Ethernets have network numbers 192.249.249/24 and 192.253.253/24. A portion of our host table contains the following entries:
127.0.0.1      localhost

# These are our main machines

192.249.249.2  shrek.movie.edu shrek
192.249.249.3  toystory.movie.edu toystory toys
192.249.249.4  monsters-inc.movie.edu monsters-inc mi

# These machines are in horror(ible) shape and will be replaced
# soon.

192.253.253.2  misery.movie.edu misery
192.253.253.3  shining.movie.edu shining
192.253.253.4  carrie.movie.edu carrie

# A wormhole is a fictitious phenomenon that instantly transports
# space travelers over long distances and is not known to be
# stable.  The only difference between wormholes and routers is
# that routers don't transport packets as instantly—especially
# ours.

192.249.249.1  wormhole.movie.edu wormhole wh wh249
192.253.253.1  wormhole.movie.edu wormhole wh wh253
And the network is pictured in Figure 4-1.
Figure 4-1: The Movie University network
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Setting Up Zone Data
Our first step in setting up the Movie U. nameservers is to translate the host table into equivalent DNS zone data. The DNS version of the data has multiple files. One file maps all the hostnames to addresses. Other files map the addresses back to hostnames. The name-to-address lookup is sometimes called forward mapping, and the address-to-name lookup, reverse mapping. Each network has its own file for reverse-mapping data.
As a convention in this book, a file that maps hostnames to addresses is called db.DOMAIN. For movie.edu, this file is called db.movie.edu. The files mapping addresses to hostnames are called db.ADDR, where ADDR is the network number without trailing zeros or the specification of a netmask. In our example, the files are called db.192.249.249 and db.192.253.253; there's one for each network. (The db is short for database.) We'll refer to the collection of db.DOMAIN and db.ADDR files as zone datafiles. There are a few other zone datafiles: db.cache and db.127.0.0. These files are overhead. Each nameserver must have them, and they are more or less the same for each server.
To tie all the zone datafiles together, a nameserver needs a configuration file; for BIND versions 8 and 9, it is usually called named.conf. The format of the zone datafiles is common to all DNS implementations: it's called the master file format. The format of the configuration files, on the other hand, is specific to the nameserver implementation—in this case, BIND.
Most entries in zone datafiles are called DNS resource records. DNS lookups are case-insensitive, so you can enter names in your zone datafiles in uppercase, lowercase, or mixed case. We tend to use all lowercase. However, even though lookups are case-insensitive, case is preserved. That way, if you add records for Titanic.movie.edu to your zone data, people looking up titanic.movie.edu will find the records, but with a capital "T" in the domain name.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Setting Up a BIND Configuration File
Now that we've created the zone datafiles, a nameserver must be instructed to read each file. For BIND, the mechanism for pointing the server to its zone datafiles is the configuration file. Up to this point, we've been discussing files whose data and format are described in the DNS specifications. The syntax of the configuration file, though, is specific to BIND and is not defined in the DNS RFCs.
The BIND configuration file syntax changed significantly between version 4 and version 8. Mercifully, it didn't change at all between BIND 8 and BIND 9. BIND 4 came out so long ago that we are not going to cover its configuration here. You'll have to find an earlier edition of our book (you should be able to find a cheap used copy) if you are still running one of those ancient beasts. In the configuration file, you can use any of three styles of comments: C-style, C++-style, or shell-style:
/* This is a C-style comment */
// This is a C++-style comment
# This is a shell-style comment
Usually, configuration files contain a line indicating the directory in which the zone datafiles are located. The nameserver changes its directory to this location before reading the zone datafiles. This allows the filenames to be specified relative to the current directory instead of as full pathnames. Here's how a directory line looks in an options statement:
options {
        directory "/var/named";
        // Place additional options here.
};
Only one options statement is allowed in the configuration file, so any additional options mentioned later in this book must be added along with the directory option.
On a primary server, the configuration file contains one zone statement for each zone datafile to be read. Each line starts with the keyword zone followed by the zone's domain name and the class (in stands for Internet). The type master indicates this server is a primary nameserver. The last line contains the filename:
zone "movie.edu" in {
      type master;
      file "db.movie.edu";
};
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Abbreviations
At this point, we have created all the files necessary for a primary nameserver. Let's go back and revisit the zone datafiles; there are shortcuts we didn't use. Unless you see and understand the long form first, though, the short form can look very cryptic. Now that you know the long form and have seen the BIND configuration file, we'll show you the shortcuts.
The second field of a zone statement specifies a domain name. This domain name is the key to the most useful shortcut. This domain name is the origin of all the data in the zone datafile. The origin is appended to all names in the zone datafile that don't end in a dot and will be different for each zone datafile because each file describes a different zone.
Since the origin is appended to names, instead of entering shrek.movie.edu's address in db.movie.edu like this:
shrek.movie.edu.    IN A     192.249.249.2
we could have entered it like this:
shrek    IN A     192.249.249.2
In the db.192.24.249 file, we entered this:
2.249.249.192.in-addr.arpa.  IN PTR shrek.movie.edu.
Because 249.249.192.in-addr.arpa is the origin, we could have entered:
2  IN PTR shrek.movie.edu.
Remember our earlier warning not to omit the trailing dot when using the fully qualified domain names? Suppose you forget the trailing dot. An entry like:
shrek.movie.edu    IN A     192.249.249.2
turns into an entry for shrek.movie.edu.movie.edu, not what you intended at all.
If a domain name is the same as the origin, the name can be specified as "@". This is most often seen in the SOA record in the zone datafiles. The SOA records could have been entered this way:
@ IN SOA toystory.movie.edu. al.movie.edu. (
                          1        ; Serial
                          3h       ; Refresh after 3 hours
                          1h       ; Retry after 1 hour
                          1w       ; Expire after 1 week
                          1h )     ; Negative caching TTL of 1 hour
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Hostname Checking
If your nameserver is BIND 4.9.4 or newer (and most are), you have to pay extra attention to how your hosts are named. Starting with Version 4.9.4, BIND checks hostnames for conformance to RFC 952. If a hostname does not conform, BIND considers it a syntax error.
Before you panic, you need to know that this checking applies only to names that are considered hostnames. Remember, resource records have a name field and a data field—for example:
<name>      <class>  <type>  <data>
toystory    IN       A       192.249.249.3
Hostnames are in the name fields of A (address) and MX (covered in Chapter 5) records. Hostnames are also in the data fields of SOA and NS records. CNAMEs do not have to conform to the host-naming rules because they can point to names that are not hostnames.
Let's look at the host-naming rules. Hostnames are allowed to contain alphabetic characters and numeric characters in each label. The following are valid hostnames:
ID4            IN A 192.249.249.10
postmanring2x  IN A 192.249.249.11
A hyphen is allowed if it is in the middle of a label:
fx-gateway     IN A 192.249.249.12
Underscores are not allowed in hostnames.
Names that are not hostnames can consist of any printable ASCII character.
If a resource record data field calls for a mail address (as in SOA records), the first label, since it is not a hostname, can contain any printable character, but the rest of the labels must follow the hostname syntax just described. For example, a mail address has the following syntax:
<ASCII-characters>.<hostname-characters>
For example, if your mail address is key_grip@movie.edu, you can use it in an SOA record even with the underscore. Remember, in a mail address you replace the "@" with a ".", like this:
movie.edu. IN SOA toystory.movie.edu. key_grip.movie.edu. (
                          1        ; Serial
                          3h       ; Refresh after 3 hours
                          1h       ; Retry after 1 hour
                          1w       ; Expire after 1 week
                          1h )     ; Negative caching TTL of 1 hour
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Tools
Wouldn't it be handy to have a tool to translate your host table into master file format? There is such a beast, written in Perl: h2n, a host-table-to-master-file converter. You can use h2n to create your zone datafiles the first time and then maintain your data manually. Or you can use h2n over and over again. As you've seen, the host table's format is much simpler to understand and modify correctly than the master file format. So, you could maintain /etc/hosts and rerun h2n to update your zone datafiles after each modification.
If you plan to use h2n, you might as well start with it, because it uses /etc/hosts—not your hand-crafted zone data—to generate the new zone datafiles. We could have saved ourselves a lot of work by generating the sample zone datafiles in this chapter with the following:
% h2n -d movie.edu -s toystory -s shrek \
-n 192.249.249 -n 192.253.253 \
-u al.movie.edu
(To generate a BIND 4 configuration file, add -v 4 to the option list.)
The -d and -n options specify the domain name of your forward-mapping zone and your network numbers. You'll notice that the names of the zone datafiles are derived from these options. The -s options list the authoritative nameservers for the zones to use in the NS records. The -u (user) is the email address in the SOA record. We cover h2n in more detail in Chapter 7, after we've covered how DNS affects email.
If you are running BIND 9, you have handy new tools to help maintain your nameserver files: named-checkconf and named-checkzone. These tools reside in /usr/local/sbin. As you might guess, named-checkconf checks the configuration file for syntax errors, and named-checkzone checks a zone file for syntax errors.
First, run named-checkconf, which checks /etc/named.conf by default:
% named-checkconf
If you have an error, named-checkconf displays an error message, such as this one:
/etc/named.conf:14: zone '.': missing 'file' entry
If there are no errors, you won't see any output.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Running a Primary Nameserver
Now that you've created your zone datafiles, you are ready to start a couple of nameservers. You'll need to set up two nameservers: a primary nameserver and a slave nameserver. Before you start a nameserver, though, make sure that the syslog daemon is running. If the nameserver reads the configuration file and zone datafiles and sees an error, it logs a message to the syslog daemon. If the error is bad enough, the nameserver exits. If you've run the BIND 9 named-checkconf and named-checkzone, you should be all set, but check for syslog errors anyway, just to be safe.
At this point, we assume the machine you are running on has the BIND nameserver and the support tool nslookup installed. Check the named manual page to find the directory the nameserver executable is in and verify that the executable is on your system. On BSD systems, the nameserver started its life in /etc, but may have migrated to /usr/sbin. Other places to look for named are /usr/etc/in.named and /usr/sbin/in.named. The following descriptions assume that the nameserver is in /usr/sbin.
To start up the nameserver, you must become root. The nameserver listens for queries on a reserved port, so it requires root privileges. The first time you run it, start the nameserver from the command line to test that it is operating correctly. Later, we'll show you how to start up the nameserver automatically when your system boots.
The following command starts the nameserver. We ran it on the host toystory.movie.edu.
# /usr/sbin/named
This command assumes that your configuration file is called /etc/named.conf. You can put your configuration file elsewhere, but the