Most exceptions issued by the CPU are
interpreted by Linux as error conditions. When one of them occurs,
the kernel sends a signal to the process that caused the exception to
notify it of an anomalous condition. If, for instance, a process
performs a division by zero, the CPU raises a
“Divide error” exception and the
corresponding exception handler sends a SIGFPE
signal to the current process, which then takes the necessary steps
to recover or (if no signal handler is set for that signal) abort.
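Seen from the process side, the signal is what makes recovery possible. The following user-space sketch installs a handler for SIGFPE and resumes execution at a known point when the signal arrives. The names catch_sigfpe and on_sigfpe are ours, and raise() stands in for the faulting instruction: integer division by zero is undefined behavior in C, and whether it actually traps depends on the architecture.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Point at which execution resumes after the signal handler runs. */
static sigjmp_buf recover_point;

/* Signal handler: jump back to the recovery point instead of returning. */
static void on_sigfpe(int signo)
{
    (void)signo;
    siglongjmp(recover_point, 1);
}

/*
 * Returns 1 if a SIGFPE was delivered and handled, 0 otherwise.
 * raise() is an assumption of this sketch: it replaces the division
 * by zero that would make the kernel send the signal.
 */
int catch_sigfpe(void)
{
    struct sigaction sa;
    int rc;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigfpe;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGFPE, &sa, NULL);
    assert(rc == 0);
    (void)rc;

    /* The second argument asks sigsetjmp to save the signal mask, so
     * SIGFPE is unblocked again after the jump out of the handler. */
    if (sigsetjmp(recover_point, 1) != 0)
        return 1;   /* reached through siglongjmp() in the handler */

    raise(SIGFPE);
    return 0;       /* reached only if the handler did not run */
}
```

Without the sigaction() call, the default action for SIGFPE applies and the process is aborted, which is the second outcome described above.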
There are a couple of cases, however, where Linux exploits CPU
exceptions to manage hardware resources more efficiently. A first
case is already described in Section 3.3.4.
The “Device not available” exception is used together with the TS flag of the
cr0 register to force the kernel to load the
floating point registers of the CPU with new values. A second case
refers to the Page Fault exception, which is used to defer allocating
new page frames to the process until the last possible moment. The
corresponding handler is complex because the exception may, or may
not, denote an error condition (see Section 8.4).
Exception handlers have a standard structure consisting of three parts:
Save the contents of most registers in the Kernel Mode stack (this part is coded in assembly language).
Handle the exception by means of a high-level C function.
Exit from the handler by means of the ret_from_exception( ) function.
To take advantage of exceptions, the IDT must be properly initialized
with an exception handler function for each recognized exception. It
is the job of the trap_init( ) function to insert
the final values (the functions that handle the
exceptions) into all IDT entries that refer to nonmaskable
interrupts and exceptions. This is accomplished through the
set_trap_gate, set_intr_gate, and set_system_gate macros:
set_trap_gate(0,&divide_error);
set_trap_gate(1,&debug);
set_intr_gate(2,&nmi);
set_system_gate(3,&int3);
set_system_gate(4,&overflow);
set_system_gate(5,&bounds);
set_trap_gate(6,&invalid_op);
set_trap_gate(7,&device_not_available);
set_trap_gate(8,&double_fault);
set_trap_gate(9,&coprocessor_segment_overrun);
set_trap_gate(10,&invalid_TSS);
set_trap_gate(11,&segment_not_present);
set_trap_gate(12,&stack_segment);
set_trap_gate(13,&general_protection);
set_intr_gate(14,&page_fault);
set_trap_gate(16,&coprocessor_error);
set_trap_gate(17,&alignment_check);
set_trap_gate(18,&machine_check);
set_trap_gate(19,&simd_coprocessor_error);
set_system_gate(128,&system_call);
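The three macros differ only in the gate type and in the Descriptor Privilege Level (DPL) they store in the IDT entry. The following user-space sketch models that difference. Every name in it (gate_model, idt_model, the model_set_* functions, user_may_invoke) is hypothetical, and the real macros pack these fields into 8-byte gate descriptors rather than into a C structure.

```c
#include <assert.h>
#include <stddef.h>

#define IDT_ENTRIES 256

/* 80x86 type codes for 32-bit gates. */
enum gate_type { GATE_INTERRUPT = 14, GATE_TRAP = 15 };

/* Simplified view of one IDT entry (not the hardware layout). */
struct gate_model {
    void (*handler)(void);
    enum gate_type type;
    int dpl;
};

struct gate_model idt_model[IDT_ENTRIES];

static void set_gate_model(unsigned n, enum gate_type type, int dpl,
                           void (*handler)(void))
{
    assert(n < IDT_ENTRIES);
    idt_model[n].handler = handler;
    idt_model[n].type = type;
    idt_model[n].dpl = dpl;
}

/* Interrupt gate, DPL 0: the CPU clears IF when entering the handler. */
void model_set_intr_gate(unsigned n, void (*h)(void))
{
    set_gate_model(n, GATE_INTERRUPT, 0, h);
}

/* Trap gate, DPL 0: IF is left unchanged on entry. */
void model_set_trap_gate(unsigned n, void (*h)(void))
{
    set_gate_model(n, GATE_TRAP, 0, h);
}

/* Trap gate, DPL 3: reachable from User Mode with int, int3, into, bound. */
void model_set_system_gate(unsigned n, void (*h)(void))
{
    set_gate_model(n, GATE_TRAP, 3, h);
}

/* A software interrupt issued at CPL 3 passes the check only if DPL is 3. */
int user_may_invoke(unsigned n)
{
    assert(n < IDT_ENTRIES);
    return idt_model[n].handler != NULL && idt_model[n].dpl == 3;
}

/* Placeholder standing in for the real assembly entry points. */
void demo_stub(void) { }

/* Fills a few entries the same way the listing above does. */
void init_idt_model(void)
{
    model_set_trap_gate(0, demo_stub);      /* divide_error */
    model_set_intr_gate(2, demo_stub);      /* nmi */
    model_set_system_gate(3, demo_stub);    /* int3 */
    model_set_trap_gate(13, demo_stub);     /* general_protection */
    model_set_intr_gate(14, demo_stub);     /* page_fault */
    model_set_system_gate(128, demo_stub);  /* system_call */
}
```

In this model, a User Mode program may execute int3 or int $0x80, but an explicit int $0 fails the DPL check, so it cannot fake a "Divide error" exception.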
Now we will look at what a typical exception handler does once it is invoked.
Let’s use handler_name to denote the name of a generic
exception handler. (The actual names of all the exception handlers
appear on the list of macros in the previous section.) Each exception
handler starts with the following assembly language instructions:
handler_name:
pushl $0 /* only for some exceptions */
pushl $do_handler_name
jmp error_code
If the control unit is not supposed to automatically insert a
hardware error code on the stack when the exception occurs, the
corresponding assembly language fragment includes a pushl $0
instruction to pad the stack with a null value. Then the
address of the high-level C function is pushed on the stack; its name
consists of the exception handler name prefixed by do_.
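Whether a stub needs the pushl $0 instruction depends only on the exception vector. As a rough guide, the sketch below encodes which of the vectors 0 through 19 make the 80x86 control unit push a hardware error code. The function names are ours, and the sketch does not cover vectors beyond that range.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns true if, for the given vector in the range 0-19, the control
 * unit pushes a hardware error code on the stack by itself.
 */
bool cpu_pushes_error_code(unsigned vector)
{
    assert(vector < 256);
    switch (vector) {
    case 8:   /* double_fault */
    case 10:  /* invalid_TSS */
    case 11:  /* segment_not_present */
    case 12:  /* stack_segment */
    case 13:  /* general_protection */
    case 14:  /* page_fault */
    case 17:  /* alignment_check */
        return true;
    default:
        return false;
    }
}

/* The stub pads the stack with a null value exactly when the CPU does not. */
bool stub_needs_zero_padding(unsigned vector)
{
    return !cpu_pushes_error_code(vector);
}
```

The padding keeps the stack layout identical for every exception, so the common error_code fragment can find the error code and the C function address at fixed offsets.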
The assembly language fragment labeled as error_code
is the same for all exception handlers
except the one for the “Device not
available” exception (see Section 3.3.4). The code performs the following steps:
Saves the registers that might be used by the high-level C function on the stack.
Issues a cld instruction to clear the direction flag DF of eflags, thus making sure that autoincrements on the edi and esi registers will be used with string instructions.[27]
Copies the hardware error code saved in the stack at location esp+36 into eax. Stores the value -1 in the same stack location. As we shall see in Section 10.3.4, this value is used to separate 0x80 exceptions from other exceptions.
Loads edi with the address of the high-level do_handler_name( ) C function saved in the stack at location esp+32; writes the contents of es in that stack location.
Loads the kernel data Segment Selector into the ds and es registers, then sets the ebx register to the address of the current process descriptor (see Section 3.2.2).
Stores the parameters to be passed to the high-level C function on the stack, namely, the exception hardware error code and the address of the stack location where the contents of the User Mode registers are saved.
Invokes the high-level C function whose address is now stored in edi.
After the last step is executed, the invoked function finds the following on the top locations of the stack:
The return address of the instruction to be executed after the C function terminates (see the next section)
The stack address of the saved User Mode registers
The hardware error code
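The slot swapping performed by error_code is easier to follow in C. The sketch below is a user-space model of the steps above, not the kernel's code: the structure frame_model, the handler table, and the function names are hypothetical, and a handler "address" is modeled as an index into a table so that it fits in a 32-bit stack slot on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saved registers as they lie on the Kernel Mode stack, lowest address first. */
struct frame_model {
    uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
    uint32_t xds;
    uint32_t xes;       /* on entry: "address" of the C function */
    uint32_t orig_eax;  /* on entry: hardware error code (or the $0 padding) */
};

/* The offsets quoted in the text: esp+32 and esp+36. */
_Static_assert(offsetof(struct frame_model, xes) == 32, "es slot");
_Static_assert(offsetof(struct frame_model, orig_eax) == 36, "error code slot");

typedef void (*c_handler_t)(struct frame_model *regs, uint32_t error_code);

/* Records what the high-level function received, for inspection. */
uint32_t last_error_code;
struct frame_model *last_regs;

static void do_demo_handler(struct frame_model *regs, uint32_t error_code)
{
    last_regs = regs;
    last_error_code = error_code;
}

/* Stand-in for code addresses: slot values index into this table. */
static const c_handler_t handler_table[] = { do_demo_handler };

/* Models the error_code steps that touch the two special stack slots. */
void error_code_model(struct frame_model *frame, uint32_t current_es)
{
    uint32_t eax, edi;

    eax = frame->orig_eax;            /* copy the hardware error code */
    frame->orig_eax = (uint32_t)-1;   /* mark "not a 0x80 exception" */
    edi = frame->xes;                 /* fetch the C function "address" */
    frame->xes = current_es;          /* the slot now really holds es */

    assert(edi < sizeof(handler_table) / sizeof(handler_table[0]));
    /* The parameters are the saved-register address and the error code. */
    handler_table[edi](frame, eax);
}
```

After the call, the frame holds ordinary saved register values, with -1 in the slot that first carried the error code, which is the uniform layout the common return path relies on.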
As already explained, the names of the
C functions that implement exception handlers always consist of the
prefix do_ followed by the handler name. Most of
these functions store the hardware error code and the exception
vector in the process descriptor of current, and
then send a suitable signal to that process. This is done as follows:
current->tss.error_code = error_code;
current->tss.trap_no = vector;
force_sig(sig_number, current);
The current process takes care of the signal right after the exception handler terminates. The signal is handled either in User Mode by the process’s own signal handler (if it exists) or in Kernel Mode. In the latter case, the kernel usually kills the process (see Chapter 10). The signals sent by the exception handlers are listed in Table 4-1.
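To see the three statements in context, here is a user-space model of a typical do_ function. The structures task_model and tss_model, the pending_signal field, and force_sig_model() are assumptions made for illustration. In particular, the kernel's force_sig() does more than record a number; the sketch only shows that delivery is deferred until the handler has terminated.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-ins for the fields used by the fragment above. */
struct tss_model {
    long error_code;
    long trap_no;
};

struct task_model {
    struct tss_model tss;
    int pending_signal;   /* 0 means no signal is pending */
};

/* Model only: marks the signal as pending for the given process. */
void force_sig_model(int sig, struct task_model *t)
{
    t->pending_signal = sig;
}

/* Mirrors the three statements shown in the text. */
void do_handler_model(struct task_model *current, long vector,
                      long error_code, int sig_number)
{
    assert(current != NULL);
    current->tss.error_code = error_code;
    current->tss.trap_no = vector;
    force_sig_model(sig_number, current);
}
```

For the "Divide error" exception, for instance, the call would be do_handler_model(task, 0, 0, SIGFPE): vector 0, a null error code, and the SIGFPE signal mentioned at the beginning of this section.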
The exception handler always checks whether the exception occurred in
User Mode or in Kernel Mode and, in the latter case, whether it was
due to an invalid argument of a system call. We’ll
describe in Section 9.2.6 how the kernel
defends itself against invalid arguments of system calls. Any other
exception raised in Kernel Mode is due to a kernel bug. In this case,
the exception handler knows the kernel is misbehaving and, in order
to avoid data corruption on the hard disks, the handler invokes the
die( ) function, which prints the contents of all
CPU registers on the console (this dump is called kernel oops)
and terminates the current process by calling
do_exit( ) (see Chapter 20).
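The decision just described can be summarized as a small classification function. This is a sketch under two assumptions. First, the privilege level at the time of the exception is read from the two low-order bits of the saved cs register (Virtual-8086 mode is ignored). Second, the flag bad_syscall_arg is a hypothetical stand-in for the check described in Section 9.2.6. The type and function names are ours as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum exception_outcome {
    OUTCOME_SEND_SIGNAL,   /* exception raised in User Mode */
    OUTCOME_SYSCALL_FIX,   /* Kernel Mode, invalid system call argument */
    OUTCOME_KERNEL_OOPS    /* Kernel Mode, kernel bug: die( ) is invoked */
};

struct exception_ctx {
    uint32_t xcs;          /* saved cs register of the interrupted code */
    int bad_syscall_arg;   /* hypothetical: nonzero if the check succeeds */
};

enum exception_outcome classify_exception(const struct exception_ctx *c)
{
    assert(c != NULL);
    if ((c->xcs & 3) != 0)          /* nonzero privilege level: User Mode */
        return OUTCOME_SEND_SIGNAL;
    if (c->bad_syscall_arg)
        return OUTCOME_SYSCALL_FIX;
    return OUTCOME_KERNEL_OOPS;
}
```

Only the last outcome leads to die( ). In the other two cases, the kernel itself is behaving correctly and the fault is attributed to the process.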
When the C function that implements the exception handling terminates, control is transferred to the following assembly language fragment:
addl $8, %esp
jmp ret_from_exception
The code pops the stack address of the saved User Mode registers and
the hardware error code from the stack, and then performs a
jmp instruction to the ret_from_exception( ) function. This function is
described later in Section 4.8.
[27] A single assembly language “string instruction,” such as rep;movsb, is able to act on a whole block of data (string).