If you want to actually “see” interrupts being generated, writing to the hardware device isn’t enough; a software handler must be configured in the system. If the Linux kernel hasn’t been told to expect your interrupt, it will simply acknowledge and ignore it.
Interrupt lines are a precious and often limited resource,
particularly when there are only 15 or 16 of them. The kernel keeps a
registry of interrupt lines, similar to the registry of I/O ports. A
module is expected to request an interrupt channel (or IRQ, for
interrupt request) before using it, and to release it when it’s done.
In many situations, modules are also expected to be able to share
interrupt lines with other drivers, as we will see. The following
functions, declared in <linux/sched.h>, implement the interface:
    int request_irq(unsigned int irq,
                    void (*handler)(int, void *, struct pt_regs *),
                    unsigned long flags,
                    const char *dev_name,
                    void *dev_id);
    void free_irq(unsigned int irq, void *dev_id);
The value returned from request_irq to the requesting function is either 0 to indicate success or a negative error code, as usual. It’s not uncommon for the function to return -EBUSY to signal that another driver is already using the requested interrupt line. The arguments to the functions are as follows:
-
unsigned int irq
This is the interrupt number being requested.
-
void (*handler)(int, void *, struct pt_regs *)
The pointer to the handling function being installed. We’ll discuss the arguments to this function later in this chapter.
-
unsigned long flags
As you might expect, a bit mask of options (described later) related to interrupt management.
-
const char *dev_name
The string passed to request_irq is used in /proc/interrupts to show the owner of the interrupt (see the next section).
-
void *dev_id
This pointer is used for shared interrupt lines. It is a unique identifier that is used when the interrupt line is freed and that may also be used by the driver to point to its own private data area (to identify which device is interrupting). When no sharing is in force, dev_id can be set to NULL, but it is a good idea anyway to use this item to point to the device structure. We’ll see a practical use for dev_id in Section 9.4, later in this chapter.
The bits that can be set in flags are as follows:
-
SA_INTERRUPT
When set, this indicates a “fast” interrupt handler. Fast handlers are executed with interrupts disabled (the topic is covered in deeper detail later in this chapter, in Section 9.3.3).
-
SA_SHIRQ
This bit signals that the interrupt can be shared between devices. The concept of sharing is outlined in Section 9.6, later in this chapter.
-
SA_SAMPLE_RANDOM
This bit indicates that the generated interrupts can contribute to the entropy pool used by /dev/random and /dev/urandom. These devices return truly random numbers when read and are designed to help application software choose secure keys for encryption. Such random numbers are extracted from an entropy pool that is fed by various random events. If your device generates interrupts at truly random times, you should set this flag. If, on the other hand, your interrupts will be predictable (for example, vertical blanking of a frame grabber), the flag is not worth setting; it wouldn’t contribute to system entropy anyway. Devices that could be influenced by attackers should not set this flag; for example, network drivers can be subjected to predictable packet timing from outside and should not contribute to the entropy pool. See the comments in drivers/char/random.c for more information.
The interrupt handler can be installed either at driver initialization or when the device is first opened. Although installing the interrupt handler from within the module’s initialization function might sound like a good idea, it actually isn’t. Because the number of interrupt lines is limited, you don’t want to waste them. You can easily end up with more devices in your computer than there are interrupts. If a module requests an IRQ at initialization, it prevents any other driver from using the interrupt, even if the device holding it is never used. Requesting the interrupt at device open, on the other hand, allows some sharing of resources.
It is possible, for example, to run a frame grabber on the same interrupt as a modem, as long as you don’t use the two devices at the same time. It is quite common for users to load the module for a special device at system boot, even if the device is rarely used. A data acquisition gadget might use the same interrupt as the second serial port. While it’s not too hard to avoid connecting to your Internet service provider (ISP) during data acquisition, being forced to unload a module in order to use the modem is really unpleasant.
The correct place to call request_irq is when the device is first opened, before the hardware is instructed to generate interrupts. The place to call free_irq is the last time the device is closed, after the hardware is told not to interrupt the processor any more. The disadvantage of this technique is that you need to keep a per-device open count. Using the module count isn’t enough if you control two or more devices from the same module.
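The open-count bookkeeping just described can be sketched in user space. This is a minimal model rather than driver code: fake_request_irq, fake_free_irq, struct demo_dev, demo_open, and demo_release are hypothetical names invented for the illustration, and the two stubs only count calls instead of touching any hardware.

```c
#include <assert.h>

/* Hypothetical stand-ins for request_irq/free_irq: they only count calls. */
int irq_requests;   /* how many times the line was requested */
int irq_releases;   /* how many times the line was freed */

int fake_request_irq(unsigned int irq)
{
    (void)irq;
    irq_requests++;
    return 0;       /* a real call could fail, e.g. with -EBUSY */
}

void fake_free_irq(unsigned int irq)
{
    (void)irq;
    irq_releases++;
}

/* Hypothetical per-device state: one open count per device, not per module. */
struct demo_dev {
    unsigned int irq;
    int open_count;
};

/* Request the line only on the first open. */
int demo_open(struct demo_dev *dev)
{
    if (dev->open_count == 0) {
        int result = fake_request_irq(dev->irq);
        if (result)
            return result;  /* leave the count untouched on failure */
        /* a real driver would now tell the hardware to start interrupting */
    }
    dev->open_count++;
    return 0;
}

/* Free the line only on the last close. */
void demo_release(struct demo_dev *dev)
{
    assert(dev->open_count > 0);
    if (--dev->open_count == 0) {
        /* a real driver would first tell the hardware to stop interrupting */
        fake_free_irq(dev->irq);
    }
}
```

Opening the same device twice and closing it twice should request the line once and free it once. A real driver would also have to protect the count against concurrent opens, which this sketch ignores.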
This discussion notwithstanding, short requests its interrupt line at load time. This was done so that you can run the test programs without having to run an extra process to keep the device open. short, therefore, requests the interrupt from within its initialization function (short_init) instead of doing it in short_open, as a real device driver would.
The interrupt requested by the following code is short_irq. The actual assignment of the variable (i.e., determining which IRQ to use) is shown later, since it is not relevant to the current discussion. short_base is the base I/O address of the parallel interface being used; register 2 of the interface is written to enable interrupt reporting.
    if (short_irq >= 0) {
        result = request_irq(short_irq, short_interrupt,
                             SA_INTERRUPT, "short", NULL);
        if (result) {
            printk(KERN_INFO "short: can't get assigned irq %i\n",
                   short_irq);
            short_irq = -1;
        }
        else { /* actually enable it; assume this *is* a parallel port */
            outb(0x10,short_base+2);
        }
    }
The code shows that the handler being installed is a fast handler (SA_INTERRUPT), does not support interrupt sharing (SA_SHIRQ is missing), and doesn’t contribute to system entropy (SA_SAMPLE_RANDOM is missing too). The outb call then enables interrupt reporting for the parallel port.
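Reading a flags mask the way the previous paragraph does can be illustrated with a small user-space sketch. The DEMO_SA_* values below are placeholders chosen for the example, and demo_describe_flags is a hypothetical helper; a real driver takes the SA_* constants from the kernel headers and never defines them itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Placeholder bit values, for illustration only. */
#define DEMO_SA_SHIRQ          0x04000000UL
#define DEMO_SA_SAMPLE_RANDOM  0x10000000UL
#define DEMO_SA_INTERRUPT      0x20000000UL

/* Write a one-line description of a flags mask into buf. */
void demo_describe_flags(unsigned long flags, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%s, %s, %s",
             (flags & DEMO_SA_INTERRUPT)     ? "fast"    : "slow",
             (flags & DEMO_SA_SHIRQ)         ? "shared"  : "exclusive",
             (flags & DEMO_SA_SAMPLE_RANDOM) ? "entropy" : "no entropy");
}
```

Passing only DEMO_SA_INTERRUPT, as short does with the real flag, yields the description "fast, exclusive, no entropy".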
Whenever a hardware interrupt reaches the processor, an internal counter is incremented, providing a way to check whether the device is working as expected. Reported interrupts are shown in /proc/interrupts. The following snapshot was taken after several days of uptime on a two-processor Pentium system:
            CPU0       CPU1
  0:   34584323   34936135    IO-APIC-edge  timer
  1:     224407     226473    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  5:    5636751    5636666   IO-APIC-level  eth0
  9:          0          0   IO-APIC-level  acpi
 10:     565910     565269   IO-APIC-level  aic7xxx
 12:     889091     884276    IO-APIC-edge  PS/2 Mouse
 13:          1          0          XT-PIC  fpu
 15:    1759669    1734520    IO-APIC-edge  ide1
NMI:   69520392   69520392
LOC:   69513717   69513716
ERR:          0
The first column is the IRQ number. You can see from the IRQs that are missing that the file shows only interrupts corresponding to installed handlers. For example, the first serial port (which uses interrupt number 4) is not shown, indicating that the modem isn’t being used. In fact, even if the modem had been used earlier but wasn’t in use at the time of the snapshot, it would not show up in the file; the serial ports are well behaved and release their interrupt handlers when the device is closed.
The /proc/interrupts display shows how many interrupts have been delivered to each CPU on the system. As you can see from the output, the Linux kernel tries to divide interrupt traffic evenly across the processors, with some success. The final columns give information on the programmable interrupt controller that handles the interrupt (and which a driver writer need not worry about), and the name(s) of the device(s) that have registered handlers for the interrupt (as specified in the dev_name argument to request_irq).
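The column layout just described is regular enough to read from a program. The following user-space sketch sums the per-CPU counters on one line of the file; demo_irq_total is a hypothetical helper, and it assumes that the counters are the leading run of purely numeric fields after the "<number>:" prefix, as in the snapshot above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Parse one numbered line of /proc/interrupts. Store the IRQ number in
 * *irq and return the sum of the per-CPU counters, or -1 if the line does
 * not start with "<number>:" (the header and the NMI/LOC/ERR lines).
 */
long long demo_irq_total(const char *line, int *irq)
{
    char *end;
    long n;
    long long total = 0;

    assert(line != NULL && irq != NULL);
    n = strtol(line, &end, 10);
    if (end == line || *end != ':')
        return -1;
    *irq = (int)n;
    line = end + 1;

    for (;;) {
        unsigned long long v;

        while (*line == ' ' || *line == '\t')
            line++;
        if (!isdigit((unsigned char)*line))
            break;                      /* reached the controller name */
        v = strtoull(line, &end, 10);
        if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
            break;                      /* a name that merely starts with a digit */
        total += (long long)v;
        line = end;
    }
    return total;
}
```

Applied to the eth0 line of the snapshot, the function reports IRQ 5 and the sum of the two CPU columns.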
The /proc tree contains another interrupt-related file, /proc/stat; sometimes you’ll find one file more useful and sometimes you’ll prefer the other. /proc/stat records several low-level statistics about system activity, including (but not limited to) the number of interrupts received since system boot. Each line of stat begins with a text string that is the key to the line; the intr mark is what we are looking for. The following (truncated and line-broken) snapshot was taken shortly after the previous one:
intr 884865 695557 4527 0 3109 4907 112759 3 0 0 0 11314 0 17747 1 0 34941 0 0 0 0 0 0 0
The first number is the total of all interrupts, while each of the others represents a single IRQ line, starting with interrupt 0. This snapshot shows that interrupt number 4 has been used 4907 times, even though no handler is currently installed. If the driver you’re testing acquires and releases the interrupt at each open and close cycle, you may find /proc/stat more useful than /proc/interrupts.
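The field layout of the intr line (total first, then one counter per IRQ starting at 0) can be checked with a short user-space sketch. demo_stat_intr is a hypothetical helper written for this illustration; it works on a line already read from the file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Given the text of an "intr" line from /proc/stat, return the counter
 * for one IRQ, or -1 if the line is not an intr line or has too few
 * fields. Passing irq == -1 returns the leading total instead.
 */
long long demo_stat_intr(const char *line, int irq)
{
    const char *p;
    char *end;
    int field;

    assert(line != NULL);
    if (strncmp(line, "intr ", 5) != 0)
        return -1;
    p = line + 5;

    /* Field 0 is the grand total; field n+1 belongs to IRQ n. */
    for (field = 0; ; field++) {
        unsigned long long v = strtoull(p, &end, 10);

        if (end == p)
            return -1;          /* ran out of numbers */
        if (field == irq + 1)
            return (long long)v;
        p = end;
    }
}
```

With the snapshot shown earlier, asking for IRQ 4 returns 4907, which matches the figure quoted in the text.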
Another difference between the two files is that interrupts is not architecture dependent, whereas stat is: the number of fields depends on the hardware underlying the kernel. The number of available interrupts varies from as few as 15 on the SPARC to as many as 256 on the IA-64 and a few other systems. It’s interesting to note that the number of interrupts defined on the x86 is currently 224, not 16 as you may expect; this, as explained in include/asm-i386/irq.h, depends on Linux using the architectural limit instead of an implementation-specific limit (like the 16 interrupt sources of the old-fashioned PC interrupt controller).
The following is a snapshot of /proc/interrupts taken on an IA-64 system. As you can see, besides different hardware routing of common interrupt sources, there’s no platform dependency here.
           CPU0       CPU1
 27:       1705      34141  IO-SAPIC-level  qla1280
 40:          0          0           SAPIC  perfmon
 43:        913       6960  IO-SAPIC-level  eth0
 47:      26722        146  IO-SAPIC-level  usb-uhci
 64:          3          6   IO-SAPIC-edge  ide0
 80:          4          2   IO-SAPIC-edge  keyboard
 89:          0          0   IO-SAPIC-edge  PS/2 Mouse
239:    5606341    5606052           SAPIC  timer
254:      67575      52815           SAPIC  IPI
NMI:          0          0
ERR:          0
One of the most compelling problems for a driver at initialization time can be how to determine which IRQ line is going to be used by the device. The driver needs the information in order to correctly install the handler. Even though a programmer could require the user to specify the interrupt number at load time, this is a bad practice because most of the time the user doesn’t know the number, either because he didn’t configure the jumpers or because the device is jumperless. Autodetection of the interrupt number is a basic requirement for driver usability.
Sometimes autodetection depends on the knowledge that some devices feature a default behavior that rarely, if ever, changes. In this case, the driver might assume that the default values apply. This is exactly how short behaves by default with the parallel port. The implementation is straightforward, as shown by short itself:
    if (short_irq < 0) /* not yet specified: force the default on */
        switch(short_base) {
          case 0x378: short_irq = 7; break;
          case 0x278: short_irq = 2; break;
          case 0x3bc: short_irq = 5; break;
        }
The code assigns the interrupt number according to the chosen base I/O address, while allowing the user to override the default at load time with something like:
    insmod ./short.o short_irq=x
short_base defaults to 0x378, so short_irq defaults to 7.
Some devices are more advanced in design and simply “announce” which interrupt they’re going to use. In this case, the driver retrieves the interrupt number by reading a status byte from one of the device’s I/O ports or PCI configuration space. When the target device is one that has the ability to tell the driver which interrupt it is going to use, autodetecting the IRQ number just means probing the device, with no additional work required to probe the interrupt.
It’s interesting to note here that modern devices supply their interrupt configuration. The PCI standard solves the problem by requiring peripheral devices to declare what interrupt line(s) they are going to use. The PCI standard is discussed in Chapter 15.
Unfortunately, not every device is programmer friendly, and autodetection might require some probing. The technique is quite simple: the driver tells the device to generate interrupts and watches what happens. If everything goes well, only one interrupt line is activated.
Though probing is simple in theory, the actual implementation might be unclear. We’ll look at two ways to perform the task: calling kernel-defined helper functions and implementing our own version.
The Linux kernel offers a low-level facility for probing the interrupt number. It only works for nonshared interrupts, but then most hardware that is capable of working in a shared interrupt mode provides better ways of finding the configured interrupt number. The facility consists of two functions, declared in <linux/interrupt.h> (which also describes the probing machinery):
-
unsigned long probe_irq_on(void);
This function returns a bit mask of unassigned interrupts. The driver must preserve the returned bit mask and pass it to probe_irq_off later. After this call, the driver should arrange for its device to generate at least one interrupt.
-
int probe_irq_off(unsigned long);
After the device has requested an interrupt, the driver calls this function, passing as argument the bit mask previously returned by probe_irq_on. probe_irq_off returns the number of the interrupt that was issued after “probe_on.” If no interrupts occurred, 0 is returned (thus, IRQ 0 can’t be probed for, but no custom device can use it on any of the supported architectures anyway). If more than one interrupt occurred (ambiguous detection), probe_irq_off returns a negative value.
The programmer should be careful to enable interrupts on the device after the call to probe_irq_on and to disable them before calling probe_irq_off. Additionally, you must remember to service the pending interrupt in your device after probe_irq_off.
The short module demonstrates how to use such probing. If you load the module with probe=1, the following code is executed to detect your interrupt line, provided pins 9 and 10 of the parallel connector are bound together:
    int count = 0;
    do {
        unsigned long mask;

        mask = probe_irq_on();
        outb_p(0x10,short_base+2); /* enable reporting */
        outb_p(0x00,short_base);   /* clear the bit */
        outb_p(0xFF,short_base);   /* set the bit: interrupt! */
        outb_p(0x00,short_base+2); /* disable reporting */
        udelay(5);  /* give it some time */
        short_irq = probe_irq_off(mask);

        if (short_irq == 0) { /* none of them? */
            printk(KERN_INFO "short: no irq reported by probe\n");
            short_irq = -1;
        }
        /*
         * If more than one line has been activated, the result is
         * negative. We should service the interrupt (no need for lpt port)
         * and loop over again. Loop at most five times, then give up
         */
    } while (short_irq < 0 && count++ < 5);
    if (short_irq < 0)
        printk("short: probe failed %i times, giving up\n", count);
Note the use of udelay before calling probe_irq_off. Depending on the speed of your processor, you may have to wait for a brief period to give the interrupt time to actually be delivered.
If you dig through the kernel sources, you may stumble across references to a different pair of functions, whose names begin with autoirq_.
These functions are used primarily in the network driver code, for historical reasons. They are currently implemented with probe_irq_on and probe_irq_off; there is not usually any reason to use the autoirq_ functions over the probe_irq_ functions.
Probing might be a lengthy task. While this is not true for short, probing a frame grabber, for example, requires a delay of at least 20 ms (which is ages for the processor), and other devices might take even longer. Therefore, it’s best to probe for the interrupt line only once, at module initialization, independently of whether you install the handler at device open (as you should) or within the initialization function (which is not recommended).
It’s interesting to note that on some platforms (PowerPC, M68k, most MIPS implementations, and both SPARC versions), probing is unnecessary and therefore the previous functions are just empty placeholders, sometimes called “useless ISA nonsense.” On other platforms, probing is only implemented for ISA devices. Anyway, most architectures define the functions (even if empty) to ease porting existing device drivers.
Generally speaking, probing is a hack, and mature architectures are like the PCI bus, which provides all the needed information.
Probing can be implemented in the driver itself without too much trouble. The short module performs do-it-yourself detection of the IRQ line if it is loaded with probe=2.
The mechanism is the same as the one described earlier: enable all unused interrupts, then wait and see what happens. We can, however, exploit our knowledge of the device. Often a device can be configured to use one IRQ number from a set of three or four; probing just those IRQs enables us to detect the right one, without having to test for all possible IRQs.
The short implementation assumes that 3, 5, 7, and 9 are the only possible IRQ values. These numbers are actually the values that some parallel devices allow you to select.
The following code probes by testing all “possible” interrupts and looking at what happens. The trials array lists the IRQs to try and has 0 as the end marker; the tried array is used to keep track of which handlers have actually been registered by this driver.
    int trials[] = {3, 5, 7, 9, 0};
    int tried[]  = {0, 0, 0, 0, 0};
    int i, count = 0;

    /*
     * Install the probing handler for all possible lines. Remember
     * the result (0 for success, or -EBUSY) in order to only free
     * what has been acquired
     */
    for (i=0; trials[i]; i++)
        tried[i] = request_irq(trials[i], short_probing,
                               SA_INTERRUPT, "short probe", NULL);

    do {
        short_irq = 0; /* none obtained yet */
        outb_p(0x10,short_base+2); /* enable */
        outb_p(0x00,short_base);
        outb_p(0xFF,short_base); /* toggle the bit */
        outb_p(0x00,short_base+2); /* disable */
        udelay(5);  /* give it some time */

        /* the value has been set by the handler */
        if (short_irq == 0) { /* none of them? */
            printk(KERN_INFO "short: no irq reported by probe\n");
        }
        /*
         * If more than one line has been activated, the result is
         * negative. We should service the interrupt (but the lpt port
         * doesn't need it) and loop over again. Do it at most 5 times
         */
    } while (short_irq <=0 && count++ < 5);

    /* end of loop, uninstall the handler */
    for (i=0; trials[i]; i++)
        if (tried[i] == 0)
            free_irq(trials[i], NULL);

    if (short_irq < 0)
        printk("short: probe failed %i times, giving up\n", count);
You might not know in advance what the “possible” IRQ values are. In that case, you’ll need to probe all the free interrupts, instead of limiting yourself to a few trials[]. To probe for all interrupts, you have to probe from IRQ 0 to IRQ NR_IRQS-1, where NR_IRQS is defined in <asm/irq.h> and is platform dependent.
Now we are missing only the probing handler itself. The handler’s role is to update short_irq according to which interrupts are actually received. A 0 value in short_irq means “nothing yet,” while a negative value means “ambiguous.” These values were chosen to be consistent with probe_irq_off and to allow the same code to call either kind of probing within short.c.
    void short_probing(int irq, void *dev_id, struct pt_regs *regs)
    {
        if (short_irq == 0) short_irq = irq;    /* found */
        if (short_irq != irq) short_irq = -irq; /* ambiguous */
    }
The arguments to the handler are described later. Knowing that irq is the interrupt being handled should be sufficient to understand the function just shown.
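To see how the three outcomes (“nothing yet,” found, and “ambiguous”) arise, the handler’s decision logic can be exercised in user space. This is only a model: demo_irq stands in for short_irq, demo_probing repeats the two tests of the handler without the kernel arguments, and demo_run is a hypothetical driver loop that replays a sequence of IRQ numbers.

```c
#include <assert.h>

volatile int demo_irq;   /* 0: nothing yet, >0: found, <0: ambiguous */

/* Same two tests as the probing handler, minus the kernel arguments. */
void demo_probing(int irq)
{
    assert(irq > 0);            /* IRQ 0 cannot be represented here */
    if (demo_irq == 0)
        demo_irq = irq;         /* first line seen: found */
    if (demo_irq != irq)
        demo_irq = -irq;        /* a different line also fired: ambiguous */
}

/* Replay a sequence of IRQ numbers and return the final verdict. */
int demo_run(const int *irqs, int n)
{
    int i;

    demo_irq = 0;
    for (i = 0; i < n; i++)
        demo_probing(irqs[i]);
    return demo_irq;
}
```

Replaying a single IRQ 7, even repeatedly, leaves the result at 7. Once a second line fires, the result turns negative and stays negative, which is what makes the retry loop in the probing code run again.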
Older versions of the Linux kernel took great pains to distinguish between “fast” and “slow” interrupts. Fast interrupts were those that could be handled very quickly, whereas handling slow interrupts took significantly longer. Slow interrupts could be sufficiently demanding of the processor that it was worthwhile to reenable interrupts while they were being handled. Otherwise, tasks requiring quick attention could be delayed for too long.
In modern kernels most of the differences between fast and slow interrupts have disappeared. There remains only one: fast interrupts (those that were requested with the SA_INTERRUPT flag) are executed with all other interrupts disabled on the current processor. Note that other processors can still handle interrupts, though you will never see two processors handling the same IRQ at the same time.
To summarize the slow and fast executing environments:
A fast handler runs with interrupt reporting disabled in the microprocessor, and the interrupt being serviced is disabled in the interrupt controller. The handler can nonetheless enable reporting in the processor by calling sti.
A slow handler runs with interrupt reporting enabled in the processor, and the interrupt being serviced is disabled in the interrupt controller.
So, which type of interrupt should your driver use? On modern systems, SA_INTERRUPT is only intended for use in a few, specific situations (such as timer interrupts). Unless you have a strong reason to run your interrupt handler with other interrupts disabled, you should not use SA_INTERRUPT.
This description should satisfy most readers, though someone with a taste for hardware and some experience with her computer might be interested in going deeper. If you don’t care about the internal details, you can skip to the next section.
This description has been extrapolated from arch/i386/kernel/irq.c, arch/i386/kernel/i8259.c, and include/asm-i386/hw_irq.h as they appear in the 2.4 kernels; although the general concepts remain the same, the hardware details differ on other platforms.
The lowest level of interrupt handling resides in assembly code declared as macros in hw_irq.h and expanded in i8259.c. Each interrupt is connected to the function do_IRQ, defined in irq.c.
The first thing do_IRQ does is to acknowledge the interrupt so that the interrupt controller can go on to other things. It then obtains a spinlock for the given IRQ number, thus preventing any other CPU from handling this IRQ. It clears a couple of status bits (including one called IRQ_WAITING that we’ll look at shortly), and then looks up the handler(s) for this particular IRQ. If there is no handler, there’s nothing to do; the spinlock is released, any pending tasklets and bottom halves are run, and do_IRQ returns.
Usually, however, if a device is interrupting there is a handler registered as well. The function handle_IRQ_event is called to actually invoke the handlers. It starts by testing a global interrupt lock bit; if that bit is set, the processor will spin until it is cleared. Calling cli sets this bit, thus blocking handling of interrupts; the normal interrupt handling mechanism does not set this bit, and thus allows further processing of interrupts. If the handler is of the slow variety, interrupts are reenabled in the hardware and the handler is invoked. Then it’s just a matter of cleaning up, running tasklets and bottom halves, and getting back to regular work. The “regular work” may well have changed as a result of an interrupt (the handler could wake_up a process, for example), so the last thing that happens on return from an interrupt is a possible rescheduling of the processor.
Probing for IRQs is done by setting the IRQ_WAITING status bit for each IRQ that currently lacks a handler. When the interrupt happens, do_IRQ clears that bit and then returns, since no handler is registered. probe_irq_off, when called by a driver, need only search for the IRQ that no longer has IRQ_WAITING set.
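The IRQ_WAITING mechanism can be modeled in a few lines of user-space C. The sketch below is a simplified picture of the idea, not the kernel’s implementation: DEMO_NR_IRQS, the two arrays, and the demo_* functions are all hypothetical, and the model starts at line 1 because IRQ 0 cannot be probed for.

```c
#include <assert.h>

#define DEMO_NR_IRQS 16          /* hypothetical line count for the model */

int demo_has_handler[DEMO_NR_IRQS];  /* 1 if a handler is installed */
int demo_waiting[DEMO_NR_IRQS];      /* models the IRQ_WAITING bit */

/* Mark every handler-less line as waiting; return them as a bit mask. */
unsigned long demo_probe_on(void)
{
    unsigned long mask = 0;
    int i;

    for (i = 1; i < DEMO_NR_IRQS; i++) {
        demo_waiting[i] = !demo_has_handler[i];
        if (demo_waiting[i])
            mask |= 1UL << i;
    }
    return mask;
}

/* Models what happens to the status bit when a line fires. */
void demo_raise(int irq)
{
    assert(irq >= 0 && irq < DEMO_NR_IRQS);
    demo_waiting[irq] = 0;
}

/* Look for probed lines whose waiting bit has been cleared. */
int demo_probe_off(unsigned long mask)
{
    int i, found = 0, count = 0;

    for (i = 1; i < DEMO_NR_IRQS; i++) {
        if (!(mask & (1UL << i)))
            continue;            /* this line was not being probed */
        if (!demo_waiting[i] && count++ == 0)
            found = i;           /* remember the first line that fired */
        demo_waiting[i] = 0;     /* probing is over for this line */
    }
    return count > 1 ? -found : found;
}
```

In this model, one raised line between the two calls is reported by number, no raised line gives 0, and two raised lines give a negative result, mirroring the return values described for probe_irq_off.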