As already mentioned, init_module registers any facility offered by the module. By facility, we mean a new functionality, be it a whole driver or a new software abstraction, that can be accessed by an application.
Modules can register many different types of facilities; for each facility, there is a specific kernel function that accomplishes this registration. The arguments passed to the kernel registration functions are usually a pointer to a data structure describing the new facility and the name of the facility being registered. The data structure usually embeds pointers to module functions, which is how functions in the module body get called.
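The registration pattern described above can be sketched in user space. This is a minimal illustration, not kernel code: `struct facility_ops`, `register_facility`, `unregister_facility`, and `facility_open` are hypothetical names invented for the example. It shows a descriptor that embeds a pointer to a module function, and a caller that reaches that function only through the registered descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: the data structure that describes a facility.
 * It embeds pointers to functions that live in the "module". */
struct facility_ops {
    int (*open)(void);
};

struct facility_slot {
    const char *name;                /* NULL marks a free slot */
    const struct facility_ops *ops;
};

#define MAX_FACILITIES 4
static struct facility_slot registry[MAX_FACILITIES];

/* Hypothetical registration function: takes the descriptor and a name. */
int register_facility(const struct facility_ops *ops, const char *name)
{
    int i, free_slot = -1;

    for (i = 0; i < MAX_FACILITIES; i++) {
        if (!registry[i].name) {
            if (free_slot < 0)
                free_slot = i;
        } else if (strcmp(registry[i].name, name) == 0) {
            return -EBUSY;           /* name already registered */
        }
    }
    if (free_slot < 0)
        return -ENOMEM;              /* table is full */

    registry[free_slot].name = name;
    registry[free_slot].ops = ops;
    return 0;
}

/* Unregistration needs the same pointer and name used to register. */
int unregister_facility(const struct facility_ops *ops, const char *name)
{
    int i;

    for (i = 0; i < MAX_FACILITIES; i++) {
        if (registry[i].name && registry[i].ops == ops &&
            strcmp(registry[i].name, name) == 0) {
            registry[i].name = NULL;
            registry[i].ops = NULL;
            return 0;
        }
    }
    return -ENOENT;
}

/* "Kernel" side: find the facility by name and call into the module. */
int facility_open(const char *name)
{
    int i;

    for (i = 0; i < MAX_FACILITIES; i++)
        if (registry[i].name && strcmp(registry[i].name, name) == 0)
            return registry[i].ops->open();
    return -ENODEV;
}

/* "Module" side: a function in the module body plus its descriptor. */
static int skull_open_calls;
static int skull_open(void) { skull_open_calls++; return 0; }
static const struct facility_ops skull_ops = { .open = skull_open };
```

Calling `register_facility(&skull_ops, "skull")` and then `facility_open("skull")` ends up running `skull_open`, even though the caller never names that function directly.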
The items that can be registered exceed the list of device types mentioned in Chapter 1. They include serial ports, miscellaneous devices, /proc files, executable domains, and line disciplines. Many of those registrable items support functions that aren’t directly related to hardware but remain in the “software abstractions” field. Those items can be registered because they are integrated into the driver’s functionality anyway (like /proc files and line disciplines, for example).
There are other facilities that can be registered as add-ons for certain drivers, but their use is so specific that it’s not worth talking about them; they use the stacking technique, as described earlier in Section 2.3. If you want to probe further, you can grep for EXPORT_SYMBOL in the kernel sources and find the entry points offered by different drivers. Most registration functions are prefixed with register_, so another possible way to find them is to grep for register_ in /proc/ksyms.
If any errors occur when you register utilities, you must undo any registration activities performed before the failure. An error can happen, for example, if there isn’t enough memory in the system to allocate a new data structure or because a resource being requested is already being used by other drivers. Though unlikely, it might happen, and good program code must be prepared to handle this event.
Linux doesn’t keep a per-module registry of facilities that have been registered, so the module must back out of everything itself if init_module fails at some point. If you ever fail to unregister what you obtained, the kernel is left in an unstable state: you can’t register your facilities again by reloading the module because they will appear to be busy, and you can’t unregister them because you’d need the same pointer you used to register and you’re not likely to be able to figure out the address. Recovery from such situations is tricky, and you’ll often be forced to reboot in order to load a newer revision of your module.
Error recovery is sometimes best handled with the goto statement. We normally hate to use goto, but in our opinion this is one situation (well, the only situation) where it is useful. In the kernel, goto is often used as shown here to deal with errors.
The following sample code (using fictitious registration and unregistration functions) behaves correctly if initialization fails at any point.
int init_module(void)
{
    int err;

    /* registration takes a pointer and a name */
    err = register_this(ptr1, "skull");
    if (err) goto fail_this;
    err = register_that(ptr2, "skull");
    if (err) goto fail_that;
    err = register_those(ptr3, "skull");
    if (err) goto fail_those;

    return 0; /* success */

fail_those:
    unregister_that(ptr2, "skull");
fail_that:
    unregister_this(ptr1, "skull");
fail_this:
    return err; /* propagate the error */
}
This code attempts to register three (fictitious) facilities. The goto statement is used in case of failure to cause the unregistration of only the facilities that had been successfully registered before things went bad.
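To see which unregistrations actually run, the same label layout can be exercised in user space. The following is a sketch under stated assumptions: `register_step`, `unregister_step`, `sim_init`, and the `fail_at` knob are hypothetical stand-ins for the fictitious functions above, and every call is recorded in a trace string instead of touching any kernel facility.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

static char trace[64];   /* log of "R<n> " and "U<n> " events */
static int fail_at;      /* hypothetical knob: step that fails, 0 for none */

static void trace_add(const char *s)
{
    strncat(trace, s, sizeof(trace) - strlen(trace) - 1);
}

/* Stand-in for register_this/that/those: fails at the chosen step. */
static int register_step(int n)
{
    char buf[16];

    if (n == fail_at)
        return -EBUSY;
    snprintf(buf, sizeof(buf), "R%d ", n);
    trace_add(buf);
    return 0;
}

/* Stand-in for the matching unregistration function. */
static void unregister_step(int n)
{
    char buf[16];

    snprintf(buf, sizeof(buf), "U%d ", n);
    trace_add(buf);
}

/* Same shape as init_module above: each label undoes the earlier steps. */
int sim_init(int fail_step)
{
    int err;

    trace[0] = '\0';
    fail_at = fail_step;

    err = register_step(1);
    if (err) goto fail_1;
    err = register_step(2);
    if (err) goto fail_2;
    err = register_step(3);
    if (err) goto fail_3;
    return 0;

fail_3:
    unregister_step(2);
fail_2:
    unregister_step(1);
fail_1:
    return err;
}

const char *sim_trace(void)
{
    return trace;
}
```

When the third step fails, the trace shows steps 1 and 2 being registered and then undone in reverse order; when the first step fails, nothing is undone because nothing was registered.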
Another option, requiring no hairy goto statements, is keeping track of what has been successfully registered and calling cleanup_module in case of any error. The cleanup function will only unroll the steps that have been successfully accomplished. This alternative, however, requires more code and more CPU time, so in fast paths you’ll still resort to goto as the best error-recovery tool.
The return value of init_module, err, is an error code. In the Linux kernel, error codes are negative numbers belonging to the set defined in <linux/errno.h>. If you want to generate your own error codes instead of returning what you get from other functions, you should include <linux/errno.h> in order to use symbolic values such as -ENODEV, -ENOMEM, and so on. It is always good practice to return appropriate error codes, because user programs can turn them into meaningful strings using perror or similar means. (However, it’s interesting to note that several versions of modutils returned a “Device busy” message for any error returned by init_module; the problem has only been fixed in recent releases.)
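The negative-error-code convention and its translation into a readable message can be sketched in user space, where the same symbolic names come from <errno.h> rather than <linux/errno.h>. Both `probe_device` and `describe_error` are hypothetical helpers written for this illustration.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical initialization step that generates its own error code
 * instead of passing along one returned by another function. */
int probe_device(int present)
{
    return present ? 0 : -ENODEV;
}

/* Hypothetical helper: map a kernel-style return value (0 or a negative
 * errno value) to the message a user program would print. */
const char *describe_error(int ret)
{
    if (ret >= 0)
        return "success";
    return strerror(-ret);   /* negate to recover the positive errno value */
}
```

A user program that sees `errno` set to ENODEV would get the same text from perror that `describe_error(-ENODEV)` returns here.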
Obviously, cleanup_module must undo any registration performed by init_module, and it is customary (but not mandatory) to unregister facilities in the reverse order used to register them:
void cleanup_module(void)
{
    unregister_those(ptr3, "skull");
    unregister_that(ptr2, "skull");
    unregister_this(ptr1, "skull");
    return;
}
If your initialization and cleanup are more complex than dealing with a few items, the goto approach may become difficult to manage, because all the cleanup code must be repeated within init_module, with several labels intermixed. Sometimes, therefore, a different layout of the code proves more successful.
What you’d do to minimize code duplication and keep everything streamlined is to call cleanup_module from within init_module whenever an error occurs. The cleanup function, then, must check the status of each item before undoing its registration. In its simplest form, the code looks like the following:
struct something *item1;
struct somethingelse *item2;
int stuff_ok;

void cleanup_module(void)
{
    /* release only what was actually obtained */
    if (item1)
        release_thing(item1);
    if (item2)
        release_thing2(item2);
    if (stuff_ok)
        unregister_stuff();
    return;
}

int init_module(void)
{
    int err = -ENOMEM;

    item1 = allocate_thing(arguments);
    item2 = allocate_thing2(arguments2);
    if (!item1 || !item2) /* either allocation failed */
        goto fail;
    err = register_stuff(item1, item2);
    if (!err)
        stuff_ok = 1;
    else
        goto fail;
    return 0; /* success */

fail:
    cleanup_module();
    return err;
}
As shown in this code, you may or may not need external flags to mark success of the initialization step, depending on the semantics of the registration/allocation function you call. Whether or not flags are needed, this kind of initialization scales well to a large number of items and is often better than the technique shown earlier.
The system keeps a usage count for every module in order to determine whether the module can be safely removed. The system needs this information because a module can’t be unloaded if it is busy: you can’t remove a filesystem type while the filesystem is mounted, and you can’t drop a char device while a process is using it, or you’ll experience some sort of segmentation fault or kernel panic when wild pointers get dereferenced.
In modern kernels, the system can automatically track the usage count for you, using a mechanism that we will see in the next chapter. There are still times, however, when you will need to adjust the usage count manually. Code that must be portable to older kernels must still use manual usage count maintenance as well. To work with the usage count, use these three macros:
MOD_INC_USE_COUNT: increments the count for the current module.
MOD_DEC_USE_COUNT: decrements the count.
MOD_IN_USE: evaluates to true if the count is not zero.
The macros are defined in <linux/module.h>, and they act on internal data structures that shouldn’t be accessed directly by the programmer. The internals of module management changed a lot during 2.1 development and were completely rewritten in 2.1.18, but the use of these macros did not change.
Note that there’s no need to check for MOD_IN_USE from within cleanup_module, because the check is performed by the system call sys_delete_module (defined in kernel/module.c) in advance.
Proper management of the module usage count is critical for system stability. Remember that the kernel can decide to try to unload your module at absolutely any time. A common module programming error is to start a series of operations (in response, say, to an open request) and increment the usage count at the end. If the kernel unloads the module halfway through those operations, chaos is ensured. To avoid this kind of problem, you should call MOD_INC_USE_COUNT before doing almost anything else in a module.
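The ordering can be illustrated with a user-space mock. This is only a sketch: the three macros below are stand-ins built on a plain counter, not the definitions from <linux/module.h>, and `sample_open` and `sample_release` are hypothetical driver methods. The point is that the count goes up before any other work, and comes back down on every path that fails.

```c
#include <errno.h>

/* Mock usage counter; a real module uses the macros from <linux/module.h>. */
static int use_count;
#define MOD_INC_USE_COUNT (use_count++)
#define MOD_DEC_USE_COUNT (use_count--)
#define MOD_IN_USE        (use_count != 0)

/* Hypothetical open method. */
int sample_open(int simulate_failure)
{
    MOD_INC_USE_COUNT;          /* first, before anything else */

    if (simulate_failure) {     /* e.g., an allocation that did not succeed */
        MOD_DEC_USE_COUNT;      /* undo the increment on the error path */
        return -ENOMEM;
    }
    return 0;
}

/* Hypothetical release method: balances a successful open. */
int sample_release(void)
{
    MOD_DEC_USE_COUNT;
    return 0;
}
```

With this layout the module is marked busy for the whole duration of the open call, and a failed open leaves the count exactly where it started.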
You won’t be able to unload a module if you lose track of the usage count. This situation may very well happen during development, so you should keep it in mind. For example, if a process gets destroyed because your driver dereferenced a NULL pointer, the driver won’t be able to close the device, and the usage count won’t fall back to zero. One possible solution is to completely disable the usage count during the debugging cycle by redefining both MOD_INC_USE_COUNT and MOD_DEC_USE_COUNT to no-ops. Another solution is to use some other method to force the counter to zero (you’ll see this done in Section 5.1.4 in Chapter 5). Sanity checks should never be circumvented in a production module. For debugging, however, sometimes a brute-force attitude helps save development time and is therefore acceptable.
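Redefining the two macros to no-ops might look like the following sketch. It is written as a user-space mock so that it can be compiled on its own: the counter-based macro definitions stand in for those of <linux/module.h>, and SKULL_DEBUG_NO_USECOUNT is a hypothetical switch invented for the example, not a kernel configuration option. In a real module, only the #if block would appear, placed after the include of <linux/module.h>.

```c
/* Mock definitions standing in for <linux/module.h>. */
static int use_count;
#define MOD_INC_USE_COUNT (use_count++)
#define MOD_DEC_USE_COUNT (use_count--)

/* Hypothetical debugging switch; leave it at 0 for production builds. */
#define SKULL_DEBUG_NO_USECOUNT 1

#if SKULL_DEBUG_NO_USECOUNT
#  undef MOD_INC_USE_COUNT
#  undef MOD_DEC_USE_COUNT
#  define MOD_INC_USE_COUNT ((void)0)   /* no-op */
#  define MOD_DEC_USE_COUNT ((void)0)   /* no-op */
#endif

/* Hypothetical driver methods, unchanged by the switch. */
int debug_open(void)      { MOD_INC_USE_COUNT; return 0; }
int debug_release(void)   { MOD_DEC_USE_COUNT; return 0; }
int debug_use_count(void) { return use_count; }
```

Because the driver methods keep using the macro names, turning the switch off restores normal counting without touching the rest of the code.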
The current value of the usage count is found in the third field of each entry in /proc/modules. This file shows the modules currently loaded in the system, with one entry for each module. The fields are the name of the module, the number of bytes of memory it uses, and the current usage count. This is a typical /proc/modules file:
parport_pc              7604   1 (autoclean)
lp                      4800   0 (unused)
parport                 8084   1 [parport_probe parport_pc lp]
lockd                  33256   1 (autoclean)
sunrpc                 56612   1 (autoclean) [lockd]
ds                      6252   1
i82365                 22304   1
pcmcia_core            41280   0 [ds i82365]
Here we see several modules in the system. Among other things, the parallel port modules have been loaded in a stacked manner, as we saw in Figure 2-2. The (autoclean) marker identifies modules managed by kmod or kerneld (see Chapter 11); the (unused) marker means exactly that. Other flags exist as well. In Linux 2.0, the second (size) field was expressed in pages (4 KB each on most platforms) rather than bytes.
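Reading the three leading fields of an entry is straightforward. The following user-space sketch assumes the line layout shown in the listing above (name, size, usage count, then optional markers); `struct module_entry` and `parse_modules_line` are hypothetical names, and anything after the third field is simply ignored.

```c
#include <stdio.h>

/* Hypothetical holder for the first three fields of one entry. */
struct module_entry {
    char name[64];
    unsigned long size;      /* bytes (pages in Linux 2.0) */
    long use_count;
};

/* Parse one line of /proc/modules. Returns 0 on success, or -1 if the
 * line does not begin with a name and two numbers. */
int parse_modules_line(const char *line, struct module_entry *out)
{
    if (sscanf(line, "%63s %lu %ld",
               out->name, &out->size, &out->use_count) != 3)
        return -1;
    return 0;
}
```

A caller would typically read /proc/modules line by line with fgets and pass each line to `parse_modules_line`.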
To unload a module, use the rmmod command. Its task is much simpler than loading, since no linking has to be performed. The command invokes the delete_module system call, which calls cleanup_module in the module itself if the usage count is zero or returns an error otherwise.
The cleanup_module implementation is in charge of unregistering every item that was registered by the module. Only the exported symbols are removed automatically.
As we have seen, the kernel calls init_module to initialize a newly loaded module, and calls cleanup_module just before module removal. In modern kernels, however, these functions often have different names. As of kernel 2.3.13, a facility exists for explicitly naming the module initialization and cleanup routines; using this facility is the preferred programming style.
Consider an example. If your module names its initialization routine my_init (instead of init_module) and its cleanup routine my_cleanup, you would mark them with the following two lines (usually at the end of the source file):
module_init(my_init);
module_exit(my_cleanup);
Note that your code must include <linux/init.h> to use module_init and module_exit.
The advantage of doing things this way is that each initialization and cleanup function in the kernel can have a unique name, which helps with debugging. These functions also make life easier for those writing drivers that work either as a module or built directly into the kernel. However, use of module_init and module_exit is not required if your initialization and cleanup functions use the old names. In fact, for modules, the only thing they do is define init_module and cleanup_module as new names for the given functions.
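The idea of giving an existing function a second name can be sketched in user space. The two macros below are mock stand-ins written for this illustration, not the kernel’s own definitions from <linux/init.h>, and they assume GCC or Clang on an ELF platform such as Linux, where the alias attribute is available.

```c
/* Mock macros: declare init_module and cleanup_module as aliases,
 * that is, new names for the functions passed in. */
#define module_init(fn) int init_module(void) __attribute__((alias(#fn)))
#define module_exit(fn) void cleanup_module(void) __attribute__((alias(#fn)))

static int loaded;   /* lets the example observe which function ran */

static int my_init(void)
{
    loaded = 1;
    return 0;        /* success */
}

static void my_cleanup(void)
{
    loaded = 0;
}

/* Usually placed at the end of the source file. */
module_init(my_init);
module_exit(my_cleanup);
```

After these two lines, a call to `init_module()` runs `my_init`, and a call to `cleanup_module()` runs `my_cleanup`, while the functions themselves keep their unique names.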
If you dig through the kernel source (in versions 2.2 and later), you will likely see a slightly different form of declaration for module initialization and cleanup functions, which looks like the following:
static int __init my_init(void)
{
    ....
}

static void __exit my_cleanup(void)
{
    ....
}
The attribute __init, when used in this way, will cause the initialization function to be discarded, and its memory reclaimed, after initialization is complete. It only works, however, for built-in drivers; it has no effect on modules. __exit, instead, causes the omission of the marked function when the driver is not built as a module; again, in modules, it has no effect.
The use of __init (and __initdata for data items) can reduce the amount of memory used by the kernel. There is no harm in marking module initialization functions with __init, even though currently there is no benefit either. Management of initialization sections has not been implemented yet for modules, but it’s a possible enhancement for the future.