FreeBSD操作系统上获取CPU信息

FreeBSD既没有GNU/Linux操作系统上的/proc/cpuinfo文件,也不提供lscpu命令(其实lscpu也是访问的/proc/cpuinfo文件)。因此在FreeBSD上想了解当前机器的CPU信息,需要费点小周折:

(1)使用sysctl命令:

# sysctl hw.model hw.machine hw.ncpu
hw.model: Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz
hw.machine: amd64
hw.ncpu: 2

(2)读取/var/run/dmesg.boot文件:

# grep -i cpu /var/run/dmesg.boot
CPU: Intel(R) Core(TM)2 CPU          6600  @ 2.40GHz (2400.05-MHz K8-class CPU)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0: <ACPI CPU> on acpi0
cpu1: <ACPI CPU> on acpi0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
SMP: AP CPU #1 Launched!

(3)通过dmidecode命令获得CPUcache信息:

# dmidecode -t processor -t cache
# dmidecode 3.0
Scanning /dev/mem for entry point.
SMBIOS 2.4 present.

Handle 0x0004, DMI type 4, 35 bytes
Processor Information
        Socket Designation: LGA 775
        Type: Central Processor
        Family: Pentium 4
        Manufacturer: Intel
        ID: F6 06 00 00 FF FB EB BF
        Signature: Type 0, Family 6, Model 15, Stepping 6
        Flags:
                FPU (Floating-point unit on-chip)
                VME (Virtual mode extension)
                DE (Debugging extension)
                PSE (Page size extension)
......
Handle 0x0005, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L1-Cache
        Configuration: Enabled, Not Socketed, Level 1
        Operational Mode: Write Back
        Location: Internal
        Installed Size: 32 kB
        Maximum Size: 32 kB
......

参考资料:
FreeBSD CPU Information Command
What is the equivalent of /proc/cpuinfo on FreeBSD v8.1?

Perf笔记(七)——perf trace

perf trace有类似于strace功能,可以实时监控程序的系统调用:

# perf trace ./a.out
 0.032 ( 0.002 ms): a.out/7673 brk(                                                                  ) = 0x1e6b000
 0.051 ( 0.005 ms): a.out/7673 access(filename: 0xb7c1cb00, mode: R                                  ) = -1 ENOENT No such file or directory
 0.063 ( 0.005 ms): a.out/7673 open(filename: 0xb7c1a7b7, flags: CLOEXEC                             ) = 3
 0.070 ( 0.002 ms): a.out/7673 fstat(fd: 3, statbuf: 0x7ffffb72bc80                                  ) = 0
 0.073 ( 0.004 ms): a.out/7673 mmap(len: 38436, prot: READ, flags: PRIVATE, fd: 3                    ) = 0x7f18b7e15000
 0.079 ( 0.001 ms): a.out/7673 close(fd: 3                                                           ) = 0
 0.087 ( 0.005 ms): a.out/7673 open(filename: 0xb7e21ec0, flags: CLOEXEC                             ) = 3
 0.093 ( 0.003 ms): a.out/7673 read(fd: 3, buf: 0x7ffffb72be28, count: 832                           ) = 832
 0.099 ( 0.002 ms): a.out/7673 fstat(fd: 3, statbuf: 0x7ffffb72bcc0                                  ) = 0
 0.102 ( 0.003 ms): a.out/7673 mmap(len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1    ) = 0x7f18b7e13000
 0.110 ( 0.004 ms): a.out/7673 mmap(len: 2283024, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3   ) = 0x7f18b79cf000
 0.116 ( 0.007 ms): a.out/7673 mprotect(start: 0x7f18b79fc000, len: 2093056                          ) = 0
 0.125 ( 0.005 ms): a.out/7673 mmap(addr: 0x7f18b7bfb000, len: 8192, prot: READ|WRITE, flags: PRIVATE|DENYWRITE|FIXED, fd: 3, off: 180224) = 0x7f18b7bfb000
 0.142 ( 0.002 ms): a.out/7673 close(fd: 3                                                           ) = 0
 0.153 ( 0.006 ms): a.out/7673 open(filename: 0xb7e134c0, flags: CLOEXEC                             ) = 3
 0.161 ( 0.003 ms): a.out/7673 read(fd: 3, buf: 0x7ffffb72bdf8, count: 832                           ) = 832
 0.165 ( 0.002 ms): a.out/7673 fstat(fd: 3, statbuf: 0x7ffffb72bc90                                  ) = 0
 0.169 ( 0.005 ms): a.out/7673 mmap(len: 2216432, prot: EXEC|READ, flags: PRIVATE|DENYWRITE, fd: 3   ) = 0x7f18b77b1000
......

Perf笔记(四)——profile应用程序

使用perf record命令可以profile应用程序  (编译程序要使用-g,推荐使用-g -O2

perf record program [args]

或者在程序启动以后,使用-p pid选项:

perf record -p pid

默认情况下,信息会存在perf.data文件里,使用perf report命令可以解析这个文件:

Capture

哪些函数占用CPU比较多,一目了然。

另外,在使用perf record时可以加--call-graph dwarf选项:

--call-graph
 Setup and enable call-graph (stack chain/backtrace) recording, implies -g.
 Default is "fp".

采样结果如下:

Capture

关于ChildrenSelf的含义,perf wiki给了一个详尽的解释。以下列代码为例:

void foo(void) {
    /* do something */
}

void bar(void) {
    /* do something */
    foo();
}

int main(void) {
    bar()
    return 0;
}

Self表示函数本身的overhead:如果foo函数的overhead60%,那么barSelf overhead就是40%(刨除foo所占部分)。因为foobar都是main的子函数,所以二者的overhead都要计算入mainChildren overhead

对于采用--call-graph dwarf选项生成的perf.data做出的火焰图如下:

Capture可以看到显示了完整的函数调用栈。