Linux kernel 笔记 (3) ——page和zone

MMU(Memory Management Unit)用来管理内存的最小单位是一个物理页面(physical page,在32-bit系统上通常每个页面为4k64-bit系统上为8k)。Linux kernel使用struct page结构体(定义在<linux/mm_types.h>)来代表每个物理页面:

struct page {
    /* First double word block */
    unsigned long flags;        /* Atomic flags, some possibly
                     * updated asynchronously */
    struct address_space *mapping;  /* If low bit clear, points to
                     * inode address_space, or NULL.
                     * If page mapped as anonymous
                     * memory, low bit is set, and
                     * it points to anon_vma object:
                     * see PAGE_MAPPING_ANON below.
                     */
    ......
}  

受制于一些硬件限制,有些物理页面不能用来进行某种操作。所以kernel把页面分成不同的zone
ZONE_DMA:用来进行DMA操作的页面;
ZONE_DMA32:也是用来进行DMA操作的页面,不过仅针对32-bit设备;
ZONE_NORMAL:包含常规的,用来映射的页面;
ZONE_HIGHMEM:包含不能永久地映射到kernel的地址空间(address space)的页面。
还有其它的zone定义(例如:ZONE_MOVABLE)。zone_type定义在 <linux/mmzone.h>文件里:

enum zone_type {
#ifdef CONFIG_ZONE_DMA
    /*
     * ZONE_DMA is used when there are devices that are not able
     * to do DMA to all of addressable memory (ZONE_NORMAL). Then we
     * carve out the portion of memory that is needed for these devices.
     * The range is arch specific.
     *
     * Some examples
     *
     * Architecture     Limit
     * ---------------------------
     * parisc, ia64, sparc  <4G
     * s390         <2G
     * arm          Various
     * alpha        Unlimited or 0-16MB.
     *
     * i386, x86_64 and multiple other arches
     *          <16M.
     */
    ZONE_DMA,
#endif
#ifdef CONFIG_ZONE_DMA32
    /*
     * x86_64 needs two ZONE_DMAs because it supports devices that are
     * only able to do DMA to the lower 16M but also 32 bit devices that
     * can only do DMA areas below 4G.
     */
    ZONE_DMA32,
#endif
    /*
     * Normal addressable memory is in ZONE_NORMAL. DMA operations can be
     * performed on pages in ZONE_NORMAL if the DMA devices support
     * transfers to all addressable memory.
     */
    ZONE_NORMAL,
#ifdef CONFIG_HIGHMEM
    /*
     * A memory area that is only addressable by the kernel through
     * mapping portions into its own address space. This is for example
     * used by i386 to allow the kernel to address the memory beyond
     * 900MB. The kernel will set up special mappings (page
     * table entries on i386) for each page that the kernel needs to
     * access.
     */
    ZONE_HIGHMEM,
#endif
    ZONE_MOVABLE,
    __MAX_NR_ZONES
};  

如果CPU体系结构允许DMA操作访问任何地址空间,那就没有ZONE_DMA。同理,Intel X86_64处理器可以访问所有的内存空间,也就不存在ZONE_HIGHMEM。 关于“High memory”,可以参考这篇文章

Linux kernel 笔记 (2) ——kernel代码的一些特性

Linux kernel代码相对于user-space程序的一些特性:
(1)kernel代码不能访问标准C程序库(libc)和头文件(例如打印使用printk,而不是printf);
(2)kernel代码是用GNU C实现的,不是ANSI C(比如:内联函数,内联汇编,分支预测等等);
(3)kernel程序没有内存保护;
(4)避免在kernel代码中引入浮点运算;
(5)kernel程序为每个进程预分配的堆栈空间很小;
(6)因为kernel程序是可抢占的(preemptive),支持symmetrical multiprocessing (SMP)和异步中断(asynchronous interrupt),所以开发kernel程序时要时刻考虑并发和同步问题;
(7)可移植性对于kernel代码非常重要。

Linux kernel 笔记 (1) ——CPU在做什么?

In fact, in Linux, we can generalize that each processor is doing exactly one of three things at any given moment:
a) In user-space, executing user code in a process
b) In kernel-space, in process context, executing on behalf of a specific process
c) In kernel-space, in interrupt context, not associated with a process, handling an
interrupt

Linux中,任何时候,CPU都在做下面三件事中的一件:

a)运行进程的用户空间代码;
b)运行进程的内核空间代码;
c)处理中断(也是工作在内核空间,但不与任何进程关联)。