Getting CPU Information on FreeBSD

FreeBSD has neither the /proc/cpuinfo file of GNU/Linux nor an lscpu command (lscpu itself just reads /proc/cpuinfo), so learning about the CPU on a FreeBSD machine takes a little extra work:

(1) Use the sysctl command (a programmatic sysctlbyname(3) sketch follows item (3)):

# sysctl hw.model hw.machine hw.ncpu
hw.model: Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz
hw.machine: amd64
hw.ncpu: 2

(2) Read the /var/run/dmesg.boot file:

# grep -i cpu /var/run/dmesg.boot
CPU: Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz (2400.05-MHz K8-class CPU)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
SMP: AP CPU #1 Launched!

(3) Use the dmidecode command to get CPU and cache details:

# dmidecode -t processor -t cache
# dmidecode 3.0
Scanning /dev/mem for entry point.
SMBIOS 2.4 present.

Handle 0x0004, DMI type 4, 35 bytes
Processor Information
        Socket Designation: LGA 775
        Type: Central Processor
        Family: Pentium 4
        Manufacturer: Intel
        ID: F6 06 00 00 FF FB EB BF
        Signature: Type 0, Family 6, Model 15, Stepping 6
        Flags:
                FPU (Floating-point unit on-chip)
                VME (Virtual mode extension)
                DE (Debugging extension)
                PSE (Page size extension)
......
Handle 0x0005, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L1-Cache
        Configuration: Enabled, Not Socketed, Level 1
        Operational Mode: Write Back
        Location: Internal
        Installed Size: 32 kB
        Maximum Size: 32 kB
......
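
The same values can also be read programmatically through the sysctlbyname(3) interface. Below is a minimal userland sketch; the buffer sizes and error handling are my own choices, not taken from the sources above:

#include <sys/types.h>
#include <sys/sysctl.h>
#include <stdio.h>

int
main(void)
{
    char model[128];
    int ncpu;
    size_t len;

    /* hw.model is a string; hw.ncpu is an int. */
    len = sizeof(model);
    if (sysctlbyname("hw.model", model, &len, NULL, 0) == 0)
        printf("hw.model: %s\n", model);

    len = sizeof(ncpu);
    if (sysctlbyname("hw.ncpu", &ncpu, &len, NULL, 0) == 0)
        printf("hw.ncpu: %d\n", ncpu);

    return (0);
}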

References:
FreeBSD CPU Information Command
What is the equivalent of /proc/cpuinfo on FreeBSD v8.1?

FreeBSD Kernel Notes (13): Delaying Execution

There are four methods of delaying execution (from FreeBSD Device Drivers; a callout sketch follows the list):

Sleeping: Sleeping is done when you must wait for something to occur before you can proceed.
Event Handlers: Event handlers let you register one or more functions to be executed when an event occurs.
Callouts: Callouts let you perform asynchronous code execution. Callouts are used to execute your functions at a specific time.
Taskqueues: Taskqueues also let you perform asynchronous code execution. Taskqueues are used for deferred work.
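
As an illustration of the callout method, here is a minimal kernel sketch (the my_* names are mine): my_tick() is scheduled to run once, roughly one second (hz ticks) after my_start() is called:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/callout.h>

static struct callout my_callout;

static void
my_tick(void *arg)
{
    printf("callout fired\n");
    /* Re-arm with callout_reset() here for a periodic timer. */
}

static void
my_start(void)
{
    callout_init(&my_callout, 1);                  /* 1 = MP-safe callback */
    callout_reset(&my_callout, hz, my_tick, NULL); /* fire in ~1 second */
}

static void
my_stop(void)
{
    callout_drain(&my_callout); /* wait until any running callback finishes */
}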

FreeBSD Kernel Notes (11): Condition Variables

Besides mutexes, threads can also be synchronized with condition variables (the following is quoted from FreeBSD Device Drivers):

Condition variables synchronize the execution of two or more threads based upon the value of an object. In contrast, locks synchronize threads by controlling their access to objects.

Condition variables are used in conjunction with locks to “block” threads until a condition is true. It works like this: A thread first acquires the foo lock. Then it examines the condition. If the condition is false, it sleeps on the bar condition variable. While asleep on bar, threads relinquish foo. A thread that causes the condition to be true wakes up the threads sleeping on bar. Threads woken up in this manner reacquire foo before proceeding.

Moreover, using condition variables necessarily involves a lock; the rules for that lock are as follows (quoted from the FreeBSD Kernel Developer's Manual):

The lock argument is a pointer to either a mutex(9), rwlock(9), or sx(9) lock. A mutex(9) argument must be initialized with MTX_DEF and not MTX_SPIN. A thread must hold lock before calling cv_wait(), cv_wait_sig(), cv_wait_unlock(), cv_timedwait(), or cv_timedwait_sig(). When a thread waits on a condition, lock is atomically released before the thread is blocked, then reacquired before the function call returns. In addition, the thread will fully drop the Giant mutex (even if recursed) while it is suspended and will reacquire the Giant mutex before the function returns. The cv_wait_unlock() function does not reacquire the lock before returning. Note that the Giant mutex may be specified as lock. However, Giant may not be used as lock for the cv_wait_unlock() function. All waiters must pass the same lock in conjunction with cvp.

In short, a thread must already hold lock when it calls cv_wait() or one of its variants to check whether the condition has become true. Inside cv_wait(), the thread first releases lock and blocks until the condition becomes true; by the time cv_wait() returns, it has reacquired lock. Note that cv_wait_unlock() does not reacquire lock when it returns.
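
A minimal sketch of this pattern (the names are mine; assume the mutex and condition variable were initialized with mtx_init() and cv_init() as shown in the comments):

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/condvar.h>

static struct mtx my_mtx;   /* mtx_init(&my_mtx, "mymtx", NULL, MTX_DEF); */
static struct cv my_cv;     /* cv_init(&my_cv, "mycv"); */
static int ready = 0;

static void
consumer(void)
{
    mtx_lock(&my_mtx);
    while (!ready)                  /* re-check: wakeups may be spurious */
        cv_wait(&my_cv, &my_mtx);   /* drops my_mtx while asleep */
    /* Condition is true and my_mtx is held again. */
    mtx_unlock(&my_mtx);
}

static void
producer(void)
{
    mtx_lock(&my_mtx);
    ready = 1;
    cv_signal(&my_cv);              /* or cv_broadcast() for all waiters */
    mtx_unlock(&my_mtx);
}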

FreeBSD Kernel Notes (10): Mutexes

The FreeBSD kernel provides two kinds of mutexes, spin mutexes and sleep mutexes (the following is quoted from FreeBSD Device Drivers):

Spin Mutexes
Spin mutexes are simple spin locks. If a thread attempts to acquire a spin lock that is being held by another thread, it will “spin” and wait for the lock to be released. Spin, in this case, means to loop infinitely on the CPU. This spinning can result in deadlock if a thread that is holding a spin lock is interrupted or if it context switches, and all subsequent threads attempt to acquire that lock. Consequently, while holding a spin mutex all interrupts are blocked on the local processor and a context switch cannot be performed.

Spin mutexes should be held only for short periods of time and should be used only to protect objects related to nonpreemptive interrupts and low-level scheduling code (McKusick and Neville-Neil, 2005). Ordinarily, you’ll never use spin mutexes.

Sleep Mutexes
Sleep mutexes are the most commonly used lock. If a thread attempts to acquire a sleep mutex that is being held by another thread, it will context switch (that is, sleep) and wait for the mutex to be released. Because of this behavior, sleep mutexes are not susceptible to the deadlock described above.

Sleep mutexes support priority propagation. When a thread sleeps on a sleep mutex and its priority is higher than the sleep mutex’s current owner, the current owner will inherit the priority of this thread (Baldwin, 2002). This characteristic prevents a lower priority thread from blocking a higher priority thread.

NOTE Sleeping (for example, calling a *sleep function) while holding a mutex is never safe and must be avoided; otherwise, there are numerous assertions that will fail and the kernel will panic.

While a spin mutex is held, all interrupts on the local CPU are blocked and no context switch may occur, precisely to prevent the deadlock described above. In most cases you should use sleep mutexes. Also note that a thread holding a mutex must not sleep; otherwise the kernel will panic.
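
A minimal sleep-mutex sketch (the names are mine): a counter protected by a default (MTX_DEF) mutex:

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/mutex.h>

static struct mtx counter_mtx;
static int counter;

static void
counter_setup(void)
{
    mtx_init(&counter_mtx, "counter lock", NULL, MTX_DEF);
}

static void
counter_bump(void)
{
    mtx_lock(&counter_mtx);
    counter++;          /* critical section: no *sleep calls allowed here */
    mtx_unlock(&counter_mtx);
}

static void
counter_teardown(void)
{
    mtx_destroy(&counter_mtx);
}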

In addition, there are shared/exclusive locks:

Shared/exclusive locks (sx locks) are locks that threads can hold while asleep. As the name implies, multiple threads can have a shared hold on an sx lock, but only one thread can have an exclusive hold on an sx lock. When a thread has an exclusive hold on an sx lock, other threads cannot have a shared hold on that lock.

sx locks do not support priority propagation and are inefficient compared to mutexes. The main reason for using sx locks is that threads can sleep while holding one.

And there are reader/writer locks:

Reader/writer locks (rw locks) are basically mutexes with sx lock semantics. Like sx locks, threads can hold rw locks as a reader, which is identical to a shared hold, or as a writer, which is identical to an exclusive hold. Like mutexes, rw locks support priority propagation and threads cannot hold them while sleeping (or the kernel will panic).

rw locks are used when you need to protect an object that is mostly going to be read from instead of written to.

Shared/exclusive locks and reader/writer locks have similar semantics, but differ as follows: a thread holding a shared/exclusive lock may sleep, but such locks do not support priority propagation; a thread holding a reader/writer lock must not sleep, but these locks do support priority propagation.
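
A minimal sketch contrasting the two (the names are mine; assume the locks were initialized with rw_init() and sx_init() as shown in the comments):

#include <sys/param.h>
#include <sys/lock.h>
#include <sys/rwlock.h>
#include <sys/sx.h>

static struct rwlock table_rw;  /* rw_init(&table_rw, "table"); */
static struct sx cfg_sx;        /* sx_init(&cfg_sx, "config"); */

static void
table_lookup(void)
{
    rw_rlock(&table_rw);    /* many readers may hold this concurrently */
    /* Read the table; sleeping here would panic the kernel. */
    rw_runlock(&table_rw);
}

static void
cfg_update(void)
{
    sx_xlock(&cfg_sx);      /* exclusive hold; sleeping is allowed */
    /* E.g., allocate with M_WAITOK while holding the lock. */
    sx_xunlock(&cfg_sx);
}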

FreeBSD Kernel Notes (9): The modeventtype_t Definition

modeventtype_t is defined as follows:

typedef enum modeventtype {
    MOD_LOAD,
    MOD_UNLOAD,
    MOD_SHUTDOWN,
    MOD_QUIESCE
} modeventtype_t;
typedef int (*modeventhand_t)(module_t, int /* modeventtype_t */, void *);

MOD_LOAD, MOD_UNLOAD, and MOD_SHUTDOWN are easy to understand: they are the values passed to the module's event handler at load, unload, and shutdown time, respectively. For MOD_QUIESCE, see FreeBSD Device Drivers:

When one issues the kldunload(8) command, MOD_QUIESCE is run before MOD_UNLOAD. If MOD_QUIESCE returns an error, MOD_UNLOAD does not get executed. In other words, MOD_QUIESCE verifies that it is safe to unload your module.

NOTE The kldunload -f command ignores every error returned by MOD_QUIESCE. So you can always unload a module, but it may not be the best idea.

Also, for the difference between MOD_QUIESCE and MOD_UNLOAD, see the FreeBSD Kernel Developer's Manual:

The difference between MOD_QUIESCE and MOD_UNLOAD is that the module should fail MOD_QUIESCE if it is currently in use, whereas MOD_UNLOAD should only fail if it is impossible to unload the module, for instance because there are memory references to the module which cannot be revoked.
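
Putting this together, here is a minimal module event handler sketch (the foo names and the refcount check are mine): MOD_QUIESCE returns EBUSY while the module is in use, so a plain kldunload(8) is refused, whereas kldunload -f still proceeds:

#include <sys/param.h>
#include <sys/module.h>
#include <sys/kernel.h>
#include <sys/systm.h>

static int refcount;    /* hypothetical in-use counter */

static int
foo_modevent(module_t mod, int type, void *arg)
{
    switch (type) {
    case MOD_LOAD:
        printf("foo loaded\n");
        return (0);
    case MOD_QUIESCE:
        return (refcount > 0 ? EBUSY : 0);  /* fail if still in use */
    case MOD_UNLOAD:
        printf("foo unloaded\n");
        return (0);
    case MOD_SHUTDOWN:
        return (0);
    default:
        return (EOPNOTSUPP);
    }
}

static moduledata_t foo_mod = { "foo", foo_modevent, NULL };
DECLARE_MODULE(foo, foo_mod, SI_SUB_DRIVERS, SI_ORDER_MIDDLE);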

FreeBSD Kernel Notes (8): Doubly-Linked Lists

The FreeBSD kernel supports doubly-linked lists (defined in sys/sys/queue.h):

/*
 * List declarations.
 */
#define LIST_HEAD(name, type)                       \
struct name {                               \
    struct type *lh_first;  /* first element */         \
}

#define LIST_CLASS_HEAD(name, type)                 \
struct name {                               \
    class type *lh_first;   /* first element */         \
}

#define LIST_HEAD_INITIALIZER(head)                 \
    { NULL }

#define LIST_ENTRY(type)                        \
struct {                                \
    struct type *le_next;   /* next element */          \
    struct type **le_prev;  /* address of previous next element */  \
}

#define LIST_CLASS_ENTRY(type)                      \
struct {                                \
    class type *le_next;    /* next element */          \
    class type **le_prev;   /* address of previous next element */  \
}

#define LIST_EMPTY(head)    ((head)->lh_first == NULL)

#define LIST_FIRST(head)    ((head)->lh_first)

#define LIST_FOREACH(var, head, field)                  \
    for ((var) = LIST_FIRST((head));                \
        (var);                          \
        (var) = LIST_NEXT((var), field))

#define LIST_NEXT(elm, field)   ((elm)->field.le_next)

#define LIST_INSERT_HEAD(head, elm, field) do {             \
    QMD_LIST_CHECK_HEAD((head), field);             \
    if ((LIST_NEXT((elm), field) = LIST_FIRST((head))) != NULL) \
        LIST_FIRST((head))->field.le_prev = &LIST_NEXT((elm), field);\
    LIST_FIRST((head)) = (elm);                 \
    (elm)->field.le_prev = &LIST_FIRST((head));         \
} while (0)

......

Take the code in FreeBSD Device Drivers as an example:

(1) The race_softc structure definition:

struct race_softc {
    LIST_ENTRY(race_softc) list;
    int unit;
};

After macro expansion this becomes:

struct race_softc {
    struct {
        struct race_softc *le_next;  /* next element */
        struct race_softc **le_prev; /* address of previous next element */
    } list;
    int unit;
};

(2) The list head definition:

static LIST_HEAD(, race_softc) race_list = LIST_HEAD_INITIALIZER(&race_list);

After macro expansion this becomes:

struct {struct race_softc *lh_first;} race_list = {NULL};

(3) Inserting an element:

sc = (struct race_softc *)malloc(sizeof(struct race_softc), M_RACE, M_WAITOK | M_ZERO);
sc->unit = unit;    
LIST_INSERT_HEAD(&race_list, sc, list);

Expanding LIST_INSERT_HEAD one level gives:

sc = (struct race_softc *)malloc(sizeof(struct race_softc), M_RACE, M_WAITOK | M_ZERO);
sc->unit = unit;
do {                \
    QMD_LIST_CHECK_HEAD((race_list), list);             \
    if ((LIST_NEXT((sc), list) = LIST_FIRST((race_list))) != NULL)  \
        LIST_FIRST((race_list))->list.le_prev = &LIST_NEXT((sc), list);\
    LIST_FIRST((race_list)) = (sc);                 \
    (sc)->list.le_prev = &LIST_FIRST((race_list));          \
} while (0)

And after full expansion (QMD_LIST_CHECK_HEAD expands to nothing unless queue debugging is enabled):

do { 
    if (((((sc))->list.le_next) = (((&race_list))->lh_first)) != ((void *)0)) (((&race_list))->lh_first)->list.le_prev = &(((sc))->list.le_next); 
    (((&race_list))->lh_first) = (sc); 
    (sc)->list.le_prev = &(((&race_list))->lh_first); 
} while (0);

That is, the element is inserted at the head of the list. Because sc is now the first element, its list.le_prev holds the address of the head's lh_first pointer, which in turn points back to sc.
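
For completeness, sys/queue.h also works in userland, so the list macros can be tried outside the kernel. A minimal sketch of my own (not from the book) that builds the same list, walks it, and then empties it:

#include <sys/queue.h>
#include <stdio.h>
#include <stdlib.h>

struct race_softc {
    LIST_ENTRY(race_softc) list;
    int unit;
};

static LIST_HEAD(, race_softc) race_list = LIST_HEAD_INITIALIZER(race_list);

int
main(void)
{
    struct race_softc *sc, *tmp;
    int unit;

    for (unit = 0; unit < 3; unit++) {
        sc = calloc(1, sizeof(*sc));
        sc->unit = unit;
        LIST_INSERT_HEAD(&race_list, sc, list);
    }

    LIST_FOREACH(sc, &race_list, list)  /* prints 2, 1, 0 */
        printf("unit %d\n", sc->unit);

    /* The _SAFE variant allows removing the current element. */
    LIST_FOREACH_SAFE(sc, &race_list, list, tmp) {
        LIST_REMOVE(sc, list);
        free(sc);
    }
    return (0);
}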

FreeBSD Kernel Notes (7): Declaring Unsupported Operations in cdevsw

The following is from FreeBSD Device Drivers:

If a d_foo function is undefined the corresponding operation is unsupported. However, d_open and d_close are unique; when they’re undefined the kernel will automatically define them as follows:

int
nullop(void)
{
    return (0);
}

This ensures that every registered character device can be opened and closed.

In other words, in the cdevsw structure, d_open and d_close are never NULL. For reference, here is the structure:

/*
 * Character device switch table
 */
struct cdevsw {
    int         d_version;
    u_int           d_flags;
    const char      *d_name;
    d_open_t        *d_open;
    d_fdopen_t      *d_fdopen;
    d_close_t       *d_close;
    d_read_t        *d_read;
    d_write_t       *d_write;
    d_ioctl_t       *d_ioctl;
    d_poll_t        *d_poll;
    d_mmap_t        *d_mmap;
    d_strategy_t        *d_strategy;
    dumper_t        *d_dump;
    d_kqfilter_t        *d_kqfilter;
    d_purge_t       *d_purge;
    d_mmap_single_t     *d_mmap_single;

    int32_t         d_spare0[3];
    void            *d_spare1[3];

    /* These fields should not be messed with by drivers */
    LIST_HEAD(, cdev)   d_devs;
    int         d_spare2;
    union {
        struct cdevsw       *gianttrick;
        SLIST_ENTRY(cdevsw) postfree_list;
    } __d_giant;
};
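
A minimal sketch of the convention (the echo names are mine): only d_read is filled in, d_open and d_close are left undefined and fall back to nullop(), and every other operation is unsupported:

#include <sys/param.h>
#include <sys/conf.h>
#include <sys/uio.h>

static d_read_t echo_read;  /* hypothetical handler, defined below */

static struct cdevsw echo_cdevsw = {
    .d_version = D_VERSION, /* mandatory */
    .d_name =    "echo",
    .d_read =    echo_read,
    /* d_write, d_ioctl, etc. omitted: those operations are unsupported */
};

static int
echo_read(struct cdev *dev, struct uio *uio, int ioflag)
{
    return (0);             /* always report EOF */
}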

FreeBSD Kernel Notes (6): Device Communication and Control

On FreeBSD, device communication and control go mainly through the sysctl and ioctl interfaces:

Generally, sysctls are employed to adjust parameters, and ioctls are used for everything else—that’s why ioctls are the catchall of I/O operations.

ioctl is straightforward and will not be covered here.

To add sysctl support to a kernel module, first call sysctl_ctx_init to initialize a sysctl_ctx_list structure (release it with sysctl_ctx_free when you are done); then use the SYSCTL_ADD_* family of functions to register your parameters. Note that the second argument of every SYSCTL_ADD_* function names the parent node under which the new parameter lives; the macros SYSCTL_STATIC_CHILDREN and SYSCTL_CHILDREN are used to specify that location (if SYSCTL_STATIC_CHILDREN is given no argument, a new top-level category is created).

In addition, SYSCTL_ADD_PROC registers a handler function whose parameters are SYSCTL_HANDLER_ARGS:

#define SYSCTL_HANDLER_ARGS struct sysctl_oid *oidp, void *arg1,    \
    intptr_t arg2, struct sysctl_req *req

arg1 points to the data the sysctl operates on, and arg2 is the length of that data.
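
A minimal sketch tying these pieces together (the foo names are mine; the exact flags can vary between FreeBSD versions): create a debug.foo node with an integer knob under it, and tear everything down with sysctl_ctx_free():

#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/sysctl.h>

static struct sysctl_ctx_list foo_ctx;
static int foo_enable = 0;

static int
foo_sysctl_init(void)
{
    struct sysctl_oid *root;

    sysctl_ctx_init(&foo_ctx);
    /* New node under the existing top-level "debug" category. */
    root = SYSCTL_ADD_NODE(&foo_ctx, SYSCTL_STATIC_CHILDREN(_debug),
        OID_AUTO, "foo", CTLFLAG_RW, 0, "foo driver knobs");
    if (root == NULL)
        return (EINVAL);
    /* Read-write integer: debug.foo.enable */
    SYSCTL_ADD_INT(&foo_ctx, SYSCTL_CHILDREN(root), OID_AUTO,
        "enable", CTLFLAG_RW, &foo_enable, 0, "enable foo");
    return (0);
}

static void
foo_sysctl_fini(void)
{
    sysctl_ctx_free(&foo_ctx);  /* removes every OID added above */
}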

References:
FreeBSD Device Drivers

A Brief Introduction to the Freecolor Tool

FreeBSD has no free command, but it provides freecolor, which shows how much memory is still available. The Freecolor homepage is here, and its logic is simple: if the system provides /proc/meminfo, it reads that virtual file directly; otherwise it goes through the third-party libstatgrab library. Since FreeBSD does not mount /proc by default, libstatgrab is what gets used.

The libstatgrab code that obtains FreeBSD memory usage:

#elif defined(FREEBSD) || defined(DFBSD)
    /*returns pages*/
    size = sizeof(total_count);
    if (sysctlbyname("vm.stats.vm.v_page_count", &total_count, &size, NULL, 0) < 0) {
        RETURN_WITH_SET_ERROR_WITH_ERRNO("mem", SG_ERROR_SYSCTLBYNAME, "vm.stats.vm.v_page_count");
    }

    /*returns pages*/
    size = sizeof(free_count);
    if (sysctlbyname("vm.stats.vm.v_free_count", &free_count, &size, NULL, 0) < 0) {
        RETURN_WITH_SET_ERROR_WITH_ERRNO("mem", SG_ERROR_SYSCTLBYNAME, "vm.stats.vm.v_free_count");
    }

    size = sizeof(inactive_count);
    if (sysctlbyname("vm.stats.vm.v_inactive_count", &inactive_count , &size, NULL, 0) < 0) {
        RETURN_WITH_SET_ERROR_WITH_ERRNO("mem", SG_ERROR_SYSCTLBYNAME, "vm.stats.vm.v_inactive_count");
    }

    size = sizeof(cache_count);
    if (sysctlbyname("vm.stats.vm.v_cache_count", &cache_count, &size, NULL, 0) < 0) {
        RETURN_WITH_SET_ERROR_WITH_ERRNO("mem", SG_ERROR_SYSCTLBYNAME, "vm.stats.vm.v_cache_count");
    }

    /* Of couse nothing is ever that simple :) And I have inactive pages to
     * deal with too. So I'm going to add them to free memory :)
     */
    mem_stats_buf->cache = (size_t)cache_count;
    mem_stats_buf->cache *= (size_t)sys_page_size;
    mem_stats_buf->total = (size_t)total_count;
    mem_stats_buf->total *= (size_t)sys_page_size;
    mem_stats_buf->free = (size_t)free_count + inactive_count + cache_count;
    mem_stats_buf->free *= (size_t)sys_page_size;
    mem_stats_buf->used = mem_stats_buf->total - mem_stats_buf->free;
#elif defined(WIN32)

As you can see, free_count, inactive_count, and cache_count are all counted as free, i.e., available memory.

Swap usage, by contrast, is obtained through the kvm interface:

#elif defined(ALLBSD)
    /* XXX probably not mt-safe! */
    kvmd = kvm_openfiles(NULL, NULL, NULL, O_RDONLY, NULL);
    if(kvmd == NULL) {
        RETURN_WITH_SET_ERROR("swap", SG_ERROR_KVM_OPENFILES, NULL);
    }

    if ((kvm_getswapinfo(kvmd, &swapinfo, 1,0)) == -1) {
        kvm_close( kvmd );
        RETURN_WITH_SET_ERROR("swap", SG_ERROR_KVM_GETSWAPINFO, NULL);
    }

    swap_stats_buf->total = (long long)swapinfo.ksw_total;
    swap_stats_buf->used = (long long)swapinfo.ksw_used;
    kvm_close( kvmd );

    swap_stats_buf->total *= sys_page_size;
    swap_stats_buf->used *= sys_page_size;
    swap_stats_buf->free = swap_stats_buf->total - swap_stats_buf->used;
#elif defined(WIN32)