If a module needs to allocate big chunks of memory, it is usually better to use a page-oriented technique. Requesting whole pages also has other advantages, which will be introduced later, in Section 13.2 in Chapter 13.
To allocate pages, the following functions are available:
- get_zeroed_page
Returns a pointer to a new page and fills the page with zeros.
- __get_free_page
Similar to get_zeroed_page, but doesn't clear the page.
- __get_free_pages
Allocates and returns a pointer to the first byte of a memory area that is several (physically contiguous) pages long, but doesn’t zero the area.
- __get_dma_pages
Similar to get_free_pages, but guarantees that the allocated memory is DMA capable. If you use version 2.2 or later of the kernel, you can simply use __get_free_pages and pass the __GFP_DMA flag; if you want backward compatibility with 2.0, you need to call this function instead.
The prototypes for the functions follow:
    unsigned long get_zeroed_page(int flags);
    unsigned long __get_free_page(int flags);
    unsigned long __get_free_pages(int flags, unsigned long order);
    unsigned long __get_dma_pages(int flags, unsigned long order);
The flags argument works in the same way as with kmalloc; usually either GFP_KERNEL or GFP_ATOMIC is used, perhaps with the addition of the __GFP_DMA flag (for memory that can be used for direct memory access operations) or __GFP_HIGHMEM when high memory can be used.
order is the base-two logarithm of the number of pages you are requesting or freeing (i.e., log2 N). For example, order is 0 if you want one page and 3 if you request eight pages. If order is too big (no contiguous area of that size is available), the page allocation will fail. The maximum value of order was 5 in Linux 2.0 (corresponding to 32 pages) and 9 with later versions (corresponding to 512 pages: 2 MB on most platforms). In any case, the bigger order is, the more likely it is that the allocation will fail.
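The relationship between a byte count and order can be sketched in ordinary user-space C. This is only an illustration: the helper name order_for_size and the SKETCH_PAGE_SIZE constant are hypothetical, and the 4-KB page size is an assumption (the kernel supplies the real PAGE_SIZE for each platform).

```c
#include <stddef.h>

/* Assumption: 4-KB pages. In kernel code, use the platform's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: return the smallest order such that
 * 2^order pages are enough to hold `size` bytes.
 */
static unsigned long order_for_size(size_t size)
{
    /* Number of whole pages needed, rounding any partial page up. */
    unsigned long pages = size / SKETCH_PAGE_SIZE
                        + (size % SKETCH_PAGE_SIZE != 0);
    unsigned long order = 0;

    /* Raise the order until 2^order pages cover the request. */
    while ((1UL << order) < pages)
        order++;
    return order;
}
```

With these assumptions, a request for eight pages' worth of bytes yields order 3, while a request one byte larger than a page yields order 1, so almost a full page of the two allocated goes unused.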
When a program is done with the pages, it can free them with one of the following functions. The first function is a macro that falls back on the second:
    void free_page(unsigned long addr);
    void free_pages(unsigned long addr, unsigned long order);
If you try to free a different number of pages than you allocated, the memory map will become corrupted and the system will get in trouble at a later time.
It’s worth stressing that get_free_pages and the other functions can be called at any time, subject to the same rules we saw for kmalloc. The functions can fail to allocate memory in certain circumstances, particularly when GFP_ATOMIC is used. Therefore, the program calling these allocation functions must be prepared to handle an allocation failure.
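The two rules just stated (check every allocation for failure, and free with the same order you allocated with) can be combined in one small pattern. The following is a minimal sketch, not code from any sample module: struct page_buf and its two helpers are hypothetical names, and the block under #else provides user-space stand-ins for __get_free_pages and free_pages (built on aligned_alloc, with an assumed 4-KB page) purely so that the pattern can be compiled and exercised outside the kernel.

```c
#ifdef __KERNEL__
#include <linux/mm.h>       /* __get_free_pages, free_pages */
#include <linux/string.h>   /* memset */
#include <linux/errno.h>    /* ENOMEM */
#else
/* User-space stand-ins: assumptions for illustration, not kernel code. */
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#ifndef PAGE_SIZE
#define PAGE_SIZE 4096UL    /* assumed page size */
#endif
#define GFP_KERNEL 0
static unsigned long __get_free_pages(int flags, unsigned long order)
{
    (void)flags;
    return (unsigned long)aligned_alloc(PAGE_SIZE, PAGE_SIZE << order);
}
static void free_pages(unsigned long addr, unsigned long order)
{
    (void)order;
    free((void *)addr);
}
#endif

/* Hypothetical bookkeeping structure: the order is stored with the pointer. */
struct page_buf {
    void *data;
    unsigned long order;
};

/* Allocate 2^order zeroed pages; return 0 on success, -ENOMEM on failure. */
static int page_buf_alloc(struct page_buf *buf, unsigned long order)
{
    unsigned long addr = __get_free_pages(GFP_KERNEL, order);

    if (!addr)
        return -ENOMEM;     /* the caller must cope with this */
    memset((void *)addr, 0, PAGE_SIZE << order);
    buf->data = (void *)addr;
    buf->order = order;
    return 0;
}

/* Free with the order recorded at allocation time, never a guessed one. */
static void page_buf_free(struct page_buf *buf)
{
    if (buf->data) {
        free_pages((unsigned long)buf->data, buf->order);
        buf->data = NULL;
    }
}
```

Keeping the order next to the pointer is one way to make a mismatched free_pages call hard to write by accident; other designs, such as a single module-wide order, work equally well as long as the value cannot change while pages are outstanding.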
It has been said that if you want to live dangerously, you can assume
that neither kmalloc nor the underlying
get_free_pages will ever fail when called with a
priority of GFP_KERNEL. This is
almost true, but not completely: small,
memory-limited systems can still run into trouble. A driver writer
ignores the possibility of allocation failures at his or her peril (or
that of his or her users).
Although kmalloc(GFP_KERNEL) sometimes fails when
there is no available memory, the kernel does its best to fulfill
allocation requests. Therefore, it’s easy to degrade system
responsiveness by allocating too much memory. For example, you can
bring the computer down by pushing too much data into a
scull device; the system will start
crawling while it tries to swap out as much as possible in order to
fulfill the kmalloc request. Since every resource
is being sucked up by the growing device, the computer is soon
rendered unusable; at that point you can no longer even start a new
process to try to deal with the problem. We don’t address this issue
in scull, since it is just a sample module
and not a real tool to put into a multiuser system. As a programmer,
you must nonetheless be careful, because a module is privileged code
and can open new security holes in the system (the most likely is a
denial-of-service hole like the one just outlined).
In order to test page allocation for real, the scullp module is released together with other sample code. It is a reduced scull, just like scullc introduced earlier.
Memory quanta allocated by scullp are whole pages or page sets: the scullp_order variable defaults to 0 and can be specified at either compile time or load time.
The following lines show how it allocates memory:
    /* Here's the allocation of a single quantum */
    if (!dptr->data[s_pos]) {
        dptr->data[s_pos] = (void *)__get_free_pages(GFP_KERNEL, dptr->order);
        if (!dptr->data[s_pos])
            goto nomem;
        memset(dptr->data[s_pos], 0, PAGE_SIZE << dptr->order);
    }
The code to deallocate memory in scullp, instead, looks like this:
    /* This code frees a whole quantum set */
    for (i = 0; i < qset; i++)
        if (dptr->data[i])
            free_pages((unsigned long)(dptr->data[i]), dptr->order);
At the user level, the perceived difference is primarily a speed improvement and better memory use because there is no internal fragmentation of memory. We ran some tests copying four megabytes from scull0 to scull1 and then from scullp0 to scullp1; the results showed a slight improvement in kernel-space processor usage.
The performance improvement is not dramatic, because kmalloc is designed to be fast. The main advantage of page-level allocation isn’t actually speed, but rather more efficient memory usage. Allocating by pages wastes no memory, whereas using kmalloc wastes an unpredictable amount of memory because of allocation granularity.
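To get a feel for the cost of allocation granularity, the arithmetic can be sketched in user-space C. The sketch assumes, purely for illustration, that requests are rounded up to power-of-two chunks starting at 32 bytes; the actual size classes depend on the kernel version and platform, and the names rounded_chunk, wasted_bytes, and SKETCH_MIN_CHUNK are hypothetical.

```c
#include <stddef.h>

/* Assumption: the smallest chunk handed out is 32 bytes. */
#define SKETCH_MIN_CHUNK 32UL

/*
 * Hypothetical model of allocation granularity: round a request up to
 * the next power-of-two chunk. Intended for small sizes only; there is
 * no guard against overflow for requests near the top of size_t.
 */
static size_t rounded_chunk(size_t size)
{
    size_t chunk = SKETCH_MIN_CHUNK;

    while (chunk < size)
        chunk <<= 1;
    return chunk;
}

/* Bytes allocated but never usable by the caller under this model. */
static size_t wasted_bytes(size_t size)
{
    return rounded_chunk(size) - size;
}
```

Under this model a 4000-byte request wastes 96 bytes, whereas a 4097-byte request wastes 4095 bytes, nearly half of the chunk. A request sized in whole pages and served by the page allocator has no such remainder.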
But the biggest advantage of __get_free_page is that the page is completely yours, and you could, in theory, assemble the pages into a linear area by appropriate tweaking of the page tables. For example, you can allow a user process to mmap memory areas obtained as single unrelated pages. We’ll discuss this kind of operation in Section 13.2 in Chapter 13, where we show how scullp offers memory mapping, something that scull cannot offer.