Understanding the Linux Kernel, 2nd Edition
by Marco Cesati, Daniel P. Bovet

The unconfirmed error reports are from readers. They have not yet been approved or disproved by the author or editor and represent solely the opinion of the reader.

Here's a key to the markup:
[page-number]: serious technical mistake
{page-number}: minor technical mistake
<page-number>: important language/formatting problem
(page-number): language change or minor formatting problem
?page-number?: reader question or request for clarification

This page was updated September 23, 2004.

UNCONFIRMED errors and comments from readers:

{39} 1st bullet of the section "Segmentation Unit";
Should "linear address", in the two places it appears, be replaced with "physical address"?

[43] Figure 2.5;
USER CS should be 0x23, not 0x20.
USER DATA/DS should be 0x2B, not 0x28.

{48-49} Last paragraph;
"The addresses start with a 2 followed by zeros, so the 10 bits all have the same value, namely 0x080 or 128 decimal."
It seems to me that the value extracted from the Directory field should be 0x200 or 512 decimal instead of 0x080.

[60] pgd_alloc(m) list item;
The get_pgd_slow function is not used by the x86 architecture.

(65) 2nd line from the top;
Replace "PWD" with "PWT" (Page Write-Through).

{65} Second paragraph under the heading "Final kernel Page Table when RAM is less than 896 MB";
"_pa" should be "_pa()" and "_va" should be "_va()".

(71) Second bullet from the bottom;
The word "Until" should be changed to "While".

[89] top;
The picture is not right. According to the code in Linux kernel 2.2.18, which I have been reading, the next pointer of the last wait_queue entry should point to (struct wait_queue *)(q-1), not to (struct wait_queue *)(*q-1) as shown in the picture. Please check the code:
linux2.2.18/include/linux/wait.h
line 21 : #define WAIT_QUEUE_HEAD(x) ((struct wait_queue *)((x)-1))
line 27 : static inline void init_waitqueue(struct wait_queue **q) { *q = WAIT_QUEUE_HEAD(q); }

[97] 5th paragraph;
In the explanation of the saving and restoring FPU/MMX/SSE registers on a process switch, the paragraph says that the registers are saved/restored using the TSS of the process. In 2.4, there is only a TSS for each CPU, not every process. Replace the two mentions of "TSS" with "process descriptor".

(101) 3rd paragraph from the bottom;
IN PRINT: "whose first parameter specifies a SIGCHLD signal... ...and whose second parameter is equal to 0."
SHOULD BE: "whose flags parameter specifies a SIGCHLD signal... ... and whose child_stack parameter is equal to 0."

(127) 5th paragraph from bottom (source snippet);
the text in the Book reads
    current->tss.error_code = error_code;
    current->tss.trap_no = 13;
    force_sig(sig_number, current);
but it must be
    current->thread.error_code = error_code;
    current->thread.trap_no = 13;
    force_sig(sig_number, current);
(word "tss" replaced by "thread")
(see file arch/i386/kernel/traps.c, Line 408)

(148) 3rd bullet item;
IN PRINT: "ksoftiqd_CPUn"
SHOULD BE: "ksoftirqd_CPUn"

{169} 3rd paragraph;
I think the phrase:
"If it is clear, it means that the spin lock was set to 1 (unlocked), so normal execution continues at label 3...."
should read:
"If it is clear, it means that the spin lock was set to 0 (locked), so normal execution continues at label 3...."
The tricky thing is that the authors have rewritten the actual assembly language code and changed the instruction "js 2f" to "jns 3f", and they also refer to the value of 0 as the "locked" setting, whereas in fact any value less than 1 functions as a "locked" setting for the spin lock. In the actual implementation (from the 2.4.23 sources), the lock value may be less than 0, if several CPUs attempt to obtain the lock at the same time.
The "lock" prefix forces them to execute the "decb" instruction in sequence; the first one in line will have its sign flag cleared because it decremented the value to 0, while any others will have their sign flag set because they decremented the value below 0. Thus the subsequent instructions must take this into account, i.e., the "js" (or "jns" in the authors' version), but also "jle" rather than "jz".
For a good explanation, see: http://mail.nl.linux.org/kernelnewbies/2002-07/msg00048.html

{180} last sentence of "Local Interrupt Disabling";
The sentence states that "Recent kernel versions also define the local_irq_save() [macro, which is] essentially equivalent to __save_flags() [...]"
At least in linux-2.4.25 it is not (and I suspect it never was). Instead it does both __save_flags() _and_ __cli(). To quote from include/asm-i386/system.h:
| #define __save_and_cli(x) do { __save_flags(x); __cli(); } while(0);
| #define __save_and_sti(x) do { __save_flags(x); __sti(); } while(0);
[...]
| #define local_irq_save(x) __save_and_cli(x)
| #define local_irq_set(x) __save_and_sti(x)

{181} code snippet;
replace "if (!local_irq_count[smp_processor_id()])" with "if (!local_irq_count(smp_processor_id()))"

{272} 4th paragraph;
Replace "3, 4, and 7" with "3, 5, and 7".
The color of node 10 is also changed, although the authors probably count that as part of the tree rotation, since that node becomes the new root and rule #2 says the root must be black.

[286] Figure 8-5;
replace
    YES <- Write access -> NO
with
    NO <- Write access -> YES
* reference: do_page_fault() function.

{363} 15th line from the bottom;
The sentence "The weight is given by p->counter + 1000" should be replaced by "The weight is given by p->rt_priority + 1000".
Actually this is the right expression in the source code of Linux kernel ver. 2.4.18. The p->rt_priority field is statically defined, whereas p->counter is dynamically decremented.
In this way a real-time process' priority would be calculated dynamically instead of statically.

{382} Table 12-3, line 14;
In Table 12-3 (a description of the fields of the inode object), the i_size field is incorrectly reported as type off_t. In the kernel sources for 2.4.18 (under include/linux/fs.h), the i_size field of struct inode is actually of type loff_t (which makes a huge difference, for example, on x86).

(398) Table 12-11;
in description for mnt_mounts and mnt_child
"parent list of descriptors" should be "child list of descriptors"

{404} 28th line from bottom;
"0xce0d" should be "0xc0ed"

{408} Table 12-14;
"struct vfs_mount *" should be "struct vfsmount *"

{422} 4th paragraph;
There are no members named 'fl_nextlink' or 'fl_prevlink' in the file_lock structure. There is 'fl_next', which points to the next file_lock in a singly-linked list, and 'fl_link', which is a list_head of file_locks.

{452} 3rd line from the top;
do_open(inode->i_bdev, filp);
should be:
do_open(inode->i_bdev, inode, filp);

{502} 7th line from the top;
"finds the buffers" should be "finds the blocks"

{503} 5th line from the bottom;
"the number of the first block" should be "the file block number of the first block"

(505) 16th line from top;
"Maximum number of characters..." should be: "Maximum number of pages..."
{510} entry 6 line 2:
inode->mtime should be: inode->ctime

{510} 14th line from the top;
"inode->mtime" should be "inode->i_mtime"

{510} 15th line from the top;
"inode->mtime" should be "inode->i_ctime"

{512} 7th line from the bottom;
"block_commit_write" should be "__block_commit_write"

{513} 11th~12th line from the bottom;
"block_commit_write" should be "generic_commit_write"

{578} 2nd paragraph from top;
formula "s/(8xb)" should be "s/(8xbxb)"

{586} 3rd line from the bottom;
"shrink_mmap" should be "shrink_caches"

{587} 7th line from the top;
"s_inode_bitmap_number" should be "s_inode_bitmap"

{587} 6th line from the top;
"s_inode_bitmap" should be "s_inode_bitmap_number"

{598} 10th line from the bottom;
"ext2_getblk()" should be "ext2_get_block()"

[625] Paragraph after numeral "c.", third line;
inet_sendmsg() should be udp_sendmsg()

{654} par "To avoid" line 5:
msgmnb should be msgmax

{654} par "To avoid" line 6:
msgmax should be msgmnb

(691) 7th line;
"kmem_init()" should be "kmap_init()"

[751] "current working directory";
The index entry for "current working directory" mentions page 13. It should also mention page 392.
Actually something seems to be missing from pages 392 to 393 as well, and I'm not sure whether something is missing from the actual kernel too. I haven't read the source code associated with this yet. But after an episode where a Linux system hung, I am guessing that there is some inadequacy in the way the kernel keeps track of the current working directory of a process, likely in the chdir() system call. Even if the book's index entry gets fixed to point to page 392, I wonder if something is missing from table 12-7.
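
Two of the numeric reports above, [43] and {48-49}, can be checked with a few lines of arithmetic. The following is a minimal sketch, not taken from the book or the kernel sources. It assumes the usual x86 selector layout (descriptor index in the upper 13 bits, then a table-indicator bit, then a 2-bit RPL) and a 10-bit Directory field in bits 31..22 of a 32-bit linear address. The function names and the two sample addresses are chosen here for illustration only.

```python
# Sketch: check the selector values from report [43] and the Directory
# index from report {48-49}. Assumed layouts (not stated in the reports):
#   selector = index << 3 | TI << 2 | RPL
#   Directory field = bits 31..22 of a 32-bit linear address


def selector(index, ti, rpl):
    """Build an x86 segment selector from its three fields."""
    return (index << 3) | (ti << 2) | rpl


def split_selector(sel):
    """Return (index, TI, RPL) for a segment selector."""
    return sel >> 3, (sel >> 2) & 1, sel & 3


def directory_index(linear_address):
    """Return the top 10 bits of a 32-bit linear address."""
    return (linear_address >> 22) & 0x3FF


if __name__ == "__main__":
    # Selector values mentioned in report [43].
    for sel in (0x20, 0x23, 0x28, 0x2B):
        print(hex(sel), split_selector(sel))
    # Two hypothetical addresses for report {48-49}.
    for address in (0x20000000, 0x80000000):
        print(hex(address), hex(directory_index(address)))
```

Under these assumptions, 0x20 and 0x23 select the same descriptor index and differ only in the RPL (0 versus 3), and likewise for 0x28 and 0x2B. A Directory value of 0x080 corresponds to an address of 0x20000000, while 0x200 would correspond to 0x80000000, so the {48-49} report depends on which address the book's paragraph actually uses.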
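
The behaviour described in report {169} can be modelled in a few lines. This is only a sketch of the first "lock; decb" that each CPU executes, assuming a signed 8-bit lock byte that starts at 1 (unlocked). It does not model the spin loop or the unlock path, and the helper names are hypothetical.

```python
# Sketch: model several CPUs executing "lock; decb" on a spin lock byte.
# Assumption: the byte is a signed 8-bit value, 1 means unlocked, and the
# lock prefix serializes the decrements so they happen one after another.


def to_signed8(value):
    """Interpret the low 8 bits of value as a signed byte."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def locked_decb(lock_byte):
    """Return (new signed value, sign flag) after one decrement."""
    new_value = to_signed8(lock_byte - 1)
    return new_value, new_value < 0


def contend(cpus, initial=1):
    """Let `cpus` CPUs decrement the lock byte in sequence.

    Returns one (value_after, acquired) pair per CPU. A CPU acquires the
    lock only if its own decrement left the sign flag clear.
    """
    lock = initial
    results = []
    for _ in range(cpus):
        lock, sign = locked_decb(lock)
        results.append((lock, not sign))
    return results


if __name__ == "__main__":
    for cpu, (value, acquired) in enumerate(contend(3)):
        print(f"CPU{cpu}: lock={value} acquired={acquired}")
```

In this model only the first CPU sees the sign flag clear. The later ones leave the byte at -1 and -2, which is consistent with the report's point that any value below 1 means "locked" and that a waiter has to test for "less than or equal to zero" rather than "equal to zero".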
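
Report {382} says the difference between off_t and loff_t is large on x86. A rough sketch of the size limits involved, assuming off_t is a signed 32-bit integer and loff_t a signed 64-bit integer on that architecture (the widths are an assumption stated here, not something the report spells out):

```python
# Sketch: largest file size representable by a signed integer of a given
# width. The 32-bit and 64-bit widths are assumptions for x86.


def max_signed(bits):
    """Return the largest value of a signed two's-complement integer."""
    return (1 << (bits - 1)) - 1


OFF_T_BITS = 32    # assumed width of off_t on x86
LOFF_T_BITS = 64   # assumed width of loff_t on x86

if __name__ == "__main__":
    print("off_t  max:", max_signed(OFF_T_BITS))
    print("loff_t max:", max_signed(LOFF_T_BITS))
```

With these widths, an off_t i_size would cap a file at just under 2 GiB, whereas a loff_t i_size does not impose that limit.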