A module can’t accomplish its task without using system resources such as memory, I/O ports, I/O memory, and interrupt lines, as well as DMA channels if you use old-fashioned DMA controllers like the Industry Standard Architecture (ISA) one.
As a programmer, you are already accustomed to managing memory allocation; writing kernel code is no different in this regard. Your program obtains a memory area using kmalloc and releases it using kfree. These functions behave like malloc and free, except that kmalloc takes an additional argument, the priority. Usually, a priority of GFP_KERNEL or GFP_USER will do. The GFP acronym stands for “get free page.” (Memory allocation is covered in detail in Chapter 7.)
Beginning driver programmers may initially be surprised at the need to allocate I/O ports, I/O memory,[11] and interrupt lines explicitly. After all, it is possible for a kernel module to simply access these resources without telling the operating system about it. Although system memory is anonymous and may be allocated from anywhere, I/O memory, ports, and interrupts have very specific roles. For instance, a driver needs to be able to allocate the exact ports it needs, not just some ports. But drivers cannot just go about making use of these system resources without first ensuring that they are not already in use elsewhere.
The job of a typical driver is, for the most part, writing and reading I/O ports and I/O memory. Access to I/O ports and I/O memory (collectively called I/O regions) happens both at initialization time and during normal operations.
Unfortunately, not all bus architectures offer a clean way to identify I/O regions belonging to each device, and sometimes the driver must guess where its I/O regions live, or even probe for the devices by reading and writing to “possible” address ranges. This problem is especially true of the ISA bus, which is still in use for simple devices that plug into a personal computer and is very popular in the industrial world in its PC/104 implementation (see Section 15.3 in Chapter 15).
Regardless of the features (or lack of features) of the bus being used by a hardware device, the device driver should be guaranteed exclusive access to its I/O regions in order to prevent interference from other drivers. For example, if a module probing for its hardware should happen to write to ports owned by another device, weird things would undoubtedly happen.
The developers of Linux chose to implement a request/free mechanism for I/O regions, mainly as a way to prevent collisions between different devices. The mechanism has long been in use for I/O ports and was recently generalized to manage resource allocation at large. Note that this mechanism is just a software abstraction that helps system housekeeping, and may or may not be enforced by hardware features. For example, unauthorized access to I/O ports doesn’t produce any error condition equivalent to “segmentation fault”—the hardware can’t enforce port registration.
Information about registered resources is available in text form in the files /proc/ioports and /proc/iomem, although the latter was only introduced during 2.3 development. We’ll discuss version 2.4 now, introducing portability issues at the end of the chapter.
A typical /proc/ioports file on a recent PC that is running version 2.4 of the kernel will look like the following:
0000-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial(set)
0300-031f : NE2000
0376-0376 : ide1
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial(set)
1000-103f : Intel Corporation 82371AB PIIX4 ACPI
  1000-1003 : acpi
  1004-1005 : acpi
  1008-100b : acpi
  100c-100f : acpi
1100-110f : Intel Corporation 82371AB PIIX4 IDE
1300-131f : pcnet_cs
1400-141f : Intel Corporation 82371AB PIIX4 ACPI
1800-18ff : PCI CardBus #02
1c00-1cff : PCI CardBus #04
5800-581f : Intel Corporation 82371AB PIIX4 USB
d000-dfff : PCI Bus #01
  d000-d0ff : ATI Technologies Inc 3D Rage LT Pro AGP-133
Each entry in the file specifies (in hexadecimal) a range of ports locked by a driver or owned by a hardware device. In earlier versions of the kernel the file had the same format, but without the “layered” structure that is shown through indentation.
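To make the entry format concrete, here is a minimal user-space sketch that parses one line of such a listing. The struct port_entry type and the parse_port_line function are hypothetical names introduced for illustration, and the sketch assumes that each nesting level is indented by two spaces; it is not code from the kernel or from the sample drivers.

```c
#include <stdio.h>

/* One parsed entry of a /proc/ioports-style listing (hypothetical type). */
struct port_entry {
    unsigned long start;   /* first port in the range */
    unsigned long end;     /* last port in the range (inclusive) */
    int depth;             /* nesting level, assuming two spaces per level */
    char name[64];         /* owner string that follows the colon */
};

/*
 * Parse a line such as "  1000-1003 : acpi".
 * Returns 0 on success, -1 if the line does not match the format.
 */
static int parse_port_line(const char *line, struct port_entry *e)
{
    int spaces = 0;

    /* Leading spaces encode the position in the layered structure. */
    while (line[spaces] == ' ')
        spaces++;
    e->depth = spaces / 2;

    /* Both bounds are hexadecimal; the name runs to the end of the line. */
    if (sscanf(line + spaces, "%lx-%lx : %63[^\n]",
               &e->start, &e->end, e->name) != 3)
        return -1;
    if (e->start > e->end)
        return -1;
    return 0;
}
```

A tool built on this could, for instance, collect the ranges and report which ports are still free before a jumper setting is chosen.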
The file can be used to avoid port collisions when a new device is added to the system and an I/O range must be selected by moving jumpers: the user can check what ports are already in use and set up the new device to use an available I/O range. Although you might object that most modern hardware doesn’t use jumpers any more, the issue is still relevant for custom devices and industrial components.
But what is more important than the ioports file itself is the data structure behind it. When the software driver for a device initializes itself, it can find out which port ranges are already in use; if the driver needs to probe I/O ports to detect the new device, it will be able to avoid probing those ports that are already in use by other drivers.
ISA probing is in fact a risky task, and several drivers distributed with the official Linux kernel refuse to perform probing when loaded as modules, to avoid the risk of destroying a running system by poking around in ports where some yet-unknown hardware may live. Fortunately, modern (as well as old-but-well-thought-out) bus architectures are immune to all these problems.
The programming interface used to access the I/O registry is made up of three functions:
int check_region(unsigned long start, unsigned long len);
struct resource *request_region(unsigned long start, unsigned long len, char *name);
void release_region(unsigned long start, unsigned long len);
check_region may be called to see if a range of ports is available for allocation; it returns a negative error code (such as -EBUSY or -EINVAL) if the answer is no. request_region will actually allocate the port range, returning a non-NULL pointer value if the allocation succeeds. Drivers don’t need to use or save the actual pointer returned; checking against NULL is all you need to do.[12]
Code that needs to work only with 2.4 kernels need not call check_region at all; in fact, it’s better not to, since things can change between the calls to check_region and request_region. If you want to be portable to older kernels, however, you must use check_region because request_region used to return void before 2.4.
Your driver should call release_region, of course, to release the ports when it is done with them.
The three functions are actually macros, and they are declared in <linux/ioport.h>.
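The semantics of the three calls can be modeled in a few lines of user-space C. The sketch below is only an illustration under simplifying assumptions: the model_ prefix, the struct region type, and the fixed-size table are all invented here, and the real kernel implementation is different (it keeps a tree of resources rather than a flat table).

```c
#include <stddef.h>
#include <errno.h>

#define MAX_REGIONS 16

/* A claimed port range in this toy registry (not the kernel's struct resource). */
struct region {
    unsigned long start, end;   /* inclusive bounds */
    const char *name;           /* owner, as it would appear in /proc/ioports */
    int used;
};

static struct region registry[MAX_REGIONS];

/* Return -EINVAL for an empty or wrapping range, -EBUSY on overlap, else 0. */
static int model_check_region(unsigned long start, unsigned long len)
{
    unsigned long end = start + len - 1;
    int i;

    if (len == 0 || end < start)
        return -EINVAL;
    for (i = 0; i < MAX_REGIONS; i++)
        if (registry[i].used &&
            start <= registry[i].end && end >= registry[i].start)
            return -EBUSY;
    return 0;
}

/* Claim the range; returns a non-NULL pointer on success, NULL otherwise. */
static struct region *model_request_region(unsigned long start,
                                           unsigned long len,
                                           const char *name)
{
    int i;

    if (model_check_region(start, len) < 0)
        return NULL;
    for (i = 0; i < MAX_REGIONS; i++) {
        if (!registry[i].used) {
            registry[i].start = start;
            registry[i].end = start + len - 1;
            registry[i].name = name;
            registry[i].used = 1;
            return &registry[i];
        }
    }
    return NULL;    /* table full */
}

/* Drop a range previously claimed with exactly the same start and length. */
static void model_release_region(unsigned long start, unsigned long len)
{
    int i;

    for (i = 0; i < MAX_REGIONS; i++)
        if (registry[i].used && registry[i].start == start &&
            registry[i].end == start + len - 1)
            registry[i].used = 0;
}
```

Note that nothing in the model touches hardware: as with the real registry, a caller that skips the request step can still access the ports, which is why every driver has to cooperate.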
The typical sequence for registering ports is the following, as it appears in the skull sample driver. (The function skull_probe_hw is not shown here because it contains device-specific code.)
#include <linux/ioport.h>
#include <linux/errno.h>

static int skull_detect(unsigned int port, unsigned int range)
{
    int err;

    if ((err = check_region(port,range)) < 0) return err; /* busy */
    if (skull_probe_hw(port,range) != 0) return -ENODEV;  /* not found */
    request_region(port,range,"skull");                   /* "Can't fail" */
    return 0;
}
This code first looks to see if the required range of ports is available; if the ports cannot be allocated, there is no point in looking for the hardware. The actual allocation of the ports is deferred until after the device is known to exist. The request_region call should never fail; the kernel only loads a single module at a time, so there should not be a problem with other modules slipping in and stealing the ports during the detection phase. Paranoid code can check, but bear in mind that kernels prior to 2.4 define request_region as returning void.
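For code that targets 2.4 only, the earlier remark about skipping check_region suggests a different ordering: request the ports first, probe, and release them if the device is not found. The following user-space sketch shows that control flow. It is not the skull code; the fake_ functions and variables are stand-ins invented here so that the logic can be exercised outside the kernel, and in a real driver their places would be taken by request_region, release_region, and skull_probe_hw.

```c
#include <stddef.h>
#include <errno.h>

/* Fake state used only to exercise the sketch. */
static int fake_busy;      /* nonzero: another driver owns the range */
static int fake_present;   /* nonzero: the device answers the probe */
static int fake_held;      /* nonzero: we currently hold the range */

/* Stand-in for request_region: non-NULL on success, NULL if unavailable. */
static void *fake_request(unsigned long start, unsigned long len,
                          const char *name)
{
    (void)start; (void)len; (void)name;
    if (fake_busy || fake_held)
        return NULL;
    fake_held = 1;
    return &fake_held;
}

/* Stand-in for release_region. */
static void fake_release(unsigned long start, unsigned long len)
{
    (void)start; (void)len;
    fake_held = 0;
}

/* Stand-in for the device-specific probe: 0 means the device was found. */
static int fake_probe(unsigned long start, unsigned long len)
{
    (void)start; (void)len;
    return fake_present ? 0 : -1;
}

/* Request first, then probe; give the ports back if nothing is there. */
static int skull_detect_request_first(unsigned long port, unsigned long range)
{
    if (fake_request(port, range, "skull") == NULL)
        return -EBUSY;               /* someone else owns the ports */
    if (fake_probe(port, range) != 0) {
        fake_release(port, range);   /* do not keep ports we cannot use */
        return -ENODEV;              /* not found */
    }
    return 0;                        /* ports stay allocated to us */
}
```

With this ordering there is no window between checking and requesting, at the cost of briefly owning ports that may turn out to hold no device.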
Any I/O ports allocated by the driver must eventually be released; skull does it from within cleanup_module:
static void skull_release(unsigned int port, unsigned int range)
{
    release_region(port,range);
}
The request/free approach to resources is similar to the register/unregister sequence described earlier for facilities and fits well in the goto-based implementation scheme already outlined.
Similar to what happens for I/O ports, I/O memory information is available in the /proc/iomem file. This is a fraction of the file as it appears on a personal computer:
00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000f0000-000fffff : System ROM
00100000-03feffff : System RAM
  00100000-0022c557 : Kernel code
  0022c558-0024455f : Kernel data
20000000-2fffffff : Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge
68000000-68000fff : Texas Instruments PCI1225
68001000-68001fff : Texas Instruments PCI1225 (#2)
e0000000-e3ffffff : PCI Bus #01
e4000000-e7ffffff : PCI Bus #01
  e4000000-e4ffffff : ATI Technologies Inc 3D Rage LT Pro AGP-133
  e6000000-e6000fff : ATI Technologies Inc 3D Rage LT Pro AGP-133
fffc0000-ffffffff : reserved
Once again, the values shown are hexadecimal ranges, and the string after the colon is the name of the “owner” of the I/O region.
As far as driver writing is concerned, the registry for I/O memory is accessed in the same way as for I/O ports, since they are actually based on the same internal mechanism.
To obtain and relinquish access to a certain I/O memory region, the driver should use the following calls:
int check_mem_region(unsigned long start, unsigned long len);
struct resource *request_mem_region(unsigned long start, unsigned long len, char *name);
void release_mem_region(unsigned long start, unsigned long len);
A typical driver will already know its own I/O memory range, and the sequence shown previously for I/O ports will reduce to the following:
if (check_mem_region(mem_addr, mem_size)) {
    printk("drivername: memory already in use\n");
    return -EBUSY;
}
request_mem_region(mem_addr, mem_size, "drivername");
The current resource allocation mechanism was introduced in Linux 2.3.11 and provides a flexible way of controlling system resources. This section briefly describes the mechanism. However, the basic resource allocation functions (request_region and the rest) are still implemented (via macros) and are still universally used because they are backward compatible with earlier kernel versions. Most module programmers will not need to know about what is really happening under the hood, but those working on more complex drivers may be interested.
Linux resource management is able to control arbitrary resources, and it can do so in a hierarchical manner. Globally known resources (the range of I/O ports, say) can be subdivided into smaller subsets—for example, the resources associated with a particular bus slot. Individual drivers can then further subdivide their range if need be.
Resource ranges are described via a resource structure, declared in <linux/ioport.h>:
struct resource {
    const char *name;
    unsigned long start, end;
    unsigned long flags;
    struct resource *parent, *sibling, *child;
};
Top-level (root) resources are created at boot time. For example, the resource structure describing the I/O port range is created as follows:
struct resource ioport_resource =
    { "PCI IO", 0x0000, IO_SPACE_LIMIT, IORESOURCE_IO };
Thus, the name of the resource is PCI IO, and it covers a range from zero through IO_SPACE_LIMIT, which, according to the hardware platform being run, can be 0xffff (16 bits of address space, as happens on the x86, IA-64, Alpha, M68k, and MIPS), 0xffffffff (32 bits: SPARC, PPC, SH) or 0xffffffffffffffff (64 bits: SPARC64).
Subranges of a given resource may be created with allocate_resource. For example, during PCI initialization a new resource is created for a region that is actually assigned to a physical device. When the PCI code reads those port or memory assignments, it creates a new resource for just those regions, and allocates them under ioport_resource or iomem_resource.
A driver can then request a subset of a particular resource (actually a subrange of a global resource) and mark it as busy by calling __request_region, which returns a pointer to a new struct resource data structure that describes the resource being requested (or returns NULL in case of error). The structure is already part of the global resource tree, and the driver is not allowed to use it at will.
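The way a new range is linked under its parent can be sketched with a reduced version of the structure. The code below is a user-space model, not the kernel's implementation: struct res and toy_request_resource are names made up for this illustration, flags and locking are omitted, and the function reports a conflict by returning the offending resource, which is a choice made here for clarity.

```c
#include <stddef.h>

/* Simplified stand-in for struct resource (no flags, no locking). */
struct res {
    const char *name;
    unsigned long start, end;              /* inclusive bounds */
    struct res *parent, *sibling, *child;
};

/*
 * Link new_res under root if it lies inside root and overlaps no existing
 * child. Children are kept sorted by start address.
 * Returns NULL on success, or the conflicting resource (root itself when
 * new_res does not fit inside it).
 */
static struct res *toy_request_resource(struct res *root, struct res *new_res)
{
    struct res **p = &root->child;

    if (new_res->end < new_res->start ||
        new_res->start < root->start || new_res->end > root->end)
        return root;

    /* Skip children that end before the new range begins. */
    while (*p && (*p)->end < new_res->start)
        p = &(*p)->sibling;

    /* The next child, if any, must start after the new range ends. */
    if (*p && (*p)->start <= new_res->end)
        return *p;

    /* Splice the new node into the sibling list. */
    new_res->sibling = *p;
    new_res->parent = root;
    *p = new_res;
    return NULL;
}
```

Because each node carries parent, sibling, and child pointers, the same routine works at every level: a bus driver can insert a device's range under the root, and the device's driver can then insert its own subrange under that node.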
An interested reader may enjoy looking at the details by browsing the source in kernel/resource.c and looking at the use of the resource management scheme in the rest of the kernel. Most driver writers, however, will be more than adequately served by request_region and the other functions introduced in the previous section.
This layered mechanism brings a couple of benefits. One is that it makes the I/O structure of the system apparent within the data structures of the kernel. The result shows up in /proc/ioports, for example:
e800-e8ff : Adaptec AHA-2940U2/W / 7890
  e800-e8be : aic7xxx
The range e800-e8ff is allocated to an Adaptec card, which has identified itself to the PCI bus driver. The aic7xxx driver has then requested most of that range, in this case the part corresponding to real ports on the card.
The other advantage to controlling resources in this way is that it partitions the port space into distinct subranges that reflect the hardware of the underlying system. Since the resource allocator will not allow an allocation to cross subranges, it can block a buggy driver (or one looking for hardware that does not exist on the system) from allocating ports that belong to more than one range, even if some of those ports are unallocated at the time.
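The rule that an allocation may not cross subranges can be expressed as a simple containment test. The sketch below is a user-space illustration only; struct window and find_containing_window are hypothetical names, and the kernel does not look up the subrange this way (a request is made against a specific parent resource and must fit inside it).

```c
#include <stddef.h>

/* One subrange of the port space, for example the window of a bus. */
struct window {
    unsigned long start, end;   /* inclusive bounds */
};

/*
 * Return the index of the single window that fully contains
 * [start, start + len - 1], or -1 if the request is empty, wraps around,
 * falls outside every window, or straddles a window boundary.
 */
static int find_containing_window(const struct window *w, size_t n,
                                  unsigned long start, unsigned long len)
{
    unsigned long end = start + len - 1;
    size_t i;

    if (len == 0 || end < start)
        return -1;
    for (i = 0; i < n; i++)
        if (start >= w[i].start && end <= w[i].end)
            return (int)i;
    return -1;
}
```

A request that begins near the end of one window and runs into the next is rejected even when every port it names is free, which is the protection described above.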