14.2. The Page Cache
The page cache, which is thankfully much simpler than the buffer cache, is a disk cache for the data accessed by page I/O operations. As we shall see in Chapter 15, all access to regular files made by read( ), write( ), and mmap( ) system calls is done through the page cache. The unit of information kept in the cache is a whole page, since page I/O operations transfer whole pages of data. A page does not necessarily contain physically adjacent disk blocks, so it cannot be identified by a device number and a block number. Instead, a page in the page cache is identified by a file's inode and by the offset within the file.
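To make the identification scheme concrete, here is a minimal user-space sketch of how a cached page could be keyed by inode and file offset. It is not kernel code: the names page_key, page_key_for, and page_key_equal are hypothetical, an inode number stands in for the inode object, and a 4 KB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/* Hypothetical key: which file, and which page-sized slice of it. */
struct page_key {
    unsigned long ino;   /* inode number, standing in for the inode object */
    uint64_t offset;     /* file offset rounded down to a page boundary */
};

/* Build the key of the page that contains the given byte offset. */
static struct page_key page_key_for(unsigned long ino, uint64_t file_offset)
{
    struct page_key key;

    key.ino = ino;
    key.offset = file_offset & ~(SKETCH_PAGE_SIZE - 1);
    return key;
}

/* Two keys name the same cached page only if both parts match. */
static int page_key_equal(struct page_key a, struct page_key b)
{
    return a.ino == b.ino && a.offset == b.offset;
}
```

Under these assumptions, byte offsets 4096 and 8191 of the same file map to the same key, while the same offset in two different files yields two distinct keys. No device or block number appears anywhere in the key.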
There are three main activities related to the page cache: adding a page when accessing a file portion not already in the cache, removing a page when the cache gets too big, and finding the page that contains a given file offset.
14.2.1. Page Cache Data Structures
The page cache makes use of two main data structures:
A page hash table
Lets the kernel quickly derive the page descriptor address for the page associated with a specified inode and file offset
An inode queue
A list of page descriptors corresponding to pages of data of a particular file (distinguished by a unique inode)
Manipulation of the page cache involves adding and removing entries from these data structures, as well as updating the fields in all inode objects referencing cached files.
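The interplay between the two structures can be illustrated with a small user-space sketch. It is only a model under stated assumptions, not the kernel's implementation: every identifier (sketch_page, sketch_inode, sketch_hash, and so on) is hypothetical, the hash function is arbitrary, the nrpages counter is an assumed example of an inode field to keep updated, and singly linked lists are used for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_HASH_BUCKETS 64   /* arbitrary size chosen for the sketch */

struct sketch_inode;

/* Hypothetical page descriptor: linked into both data structures. */
struct sketch_page {
    struct sketch_inode *inode;      /* file that owns the page */
    unsigned long offset;            /* page-aligned offset within the file */
    struct sketch_page *hash_next;   /* next page in the same hash bucket */
    struct sketch_page *queue_next;  /* next page in the inode queue */
};

/* Hypothetical inode object with the fields the cache must keep updated. */
struct sketch_inode {
    unsigned long ino;
    unsigned long nrpages;           /* number of cached pages of this file */
    struct sketch_page *pages;       /* head of the inode queue */
};

static struct sketch_page *sketch_hash_table[SKETCH_HASH_BUCKETS];

/* Arbitrary hash of (inode, offset); assumes 4 KB pages. */
static size_t sketch_hash(const struct sketch_inode *inode, unsigned long offset)
{
    return (size_t)(((inode->ino * 31UL) ^ (offset >> 12)) % SKETCH_HASH_BUCKETS);
}

/* Find: walk one hash bucket looking for a matching inode and offset. */
static struct sketch_page *sketch_find_page(struct sketch_inode *inode,
                                            unsigned long offset)
{
    struct sketch_page *p = sketch_hash_table[sketch_hash(inode, offset)];

    while (p != NULL && !(p->inode == inode && p->offset == offset))
        p = p->hash_next;
    return p;
}

/* Add: insert into the hash table and the inode queue, update the inode. */
static struct sketch_page *sketch_add_page(struct sketch_inode *inode,
                                           unsigned long offset)
{
    struct sketch_page *p = sketch_find_page(inode, offset);
    size_t bucket;

    if (p != NULL)
        return p;                    /* already cached */
    p = malloc(sizeof(*p));
    if (p == NULL)
        return NULL;
    bucket = sketch_hash(inode, offset);
    p->inode = inode;
    p->offset = offset;
    p->hash_next = sketch_hash_table[bucket];
    sketch_hash_table[bucket] = p;
    p->queue_next = inode->pages;
    inode->pages = p;
    inode->nrpages++;
    return p;
}

/* Remove: unlink from both structures; returns 1 if a page was removed. */
static int sketch_remove_page(struct sketch_inode *inode, unsigned long offset)
{
    struct sketch_page **pp = &sketch_hash_table[sketch_hash(inode, offset)];
    struct sketch_page *p;

    while (*pp != NULL && !((*pp)->inode == inode && (*pp)->offset == offset))
        pp = &(*pp)->hash_next;
    if (*pp == NULL)
        return 0;
    p = *pp;
    *pp = p->hash_next;              /* unlink from the hash bucket */
    for (pp = &inode->pages; *pp != p; pp = &(*pp)->queue_next)
        ;
    *pp = p->queue_next;             /* unlink from the inode queue */
    inode->nrpages--;
    free(p);
    return 1;
}
```

In this model, a lookup touches only the hash table, whereas adding or removing a page must keep the hash bucket, the inode queue, and the inode's counter consistent with one another.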
14.2.1.1. The page hash table
When a process reads a large file, the page ...