SystemTap Notes (6): Printing User-Space Stack Backtraces


-d MODULE  Add symbol/unwind information for the given module into the kernel object module. This may enable symbolic tracebacks
          from those modules/programs, even if they do not have an explicit probe placed into them.

--ldd  Add symbol/unwind information for all shared libraries suspected by ldd to be necessary for user-space binaries being
          probed or listed with the -d option. Caution: this can make the probe modules considerably larger.

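The -d option loads symbol table information for a module or executable, while --ldd loads symbol table information for the shared libraries needed by the binaries named with -d or by the probes themselves. See the following example: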

 # stap -d /usr/lib/systemd/systemd-udevd --ldd -e 'probe kprocess.create {print_ubacktrace()}'
<no user backtrace at kernel.function("[email protected]/kernel/fork.c:1146").return>
 0x7fec1d14f011 : clone+0x31/0x90 [/lib64/]
 0x7f6feb135011 : clone+0x31/0x90 [/lib64/]
WARNING: Missing unwind data for module, rerun with 'stap -d /usr/lib64/'
 0x7f22c3026011 : clone+0x31/0x90 [/lib64/]
 0x7f22c2ff7ed4 : __fork+0xb4/0x320 [/lib64/]
 0x7f22c3a01c35 [/usr/lib64/]
 0x7f20966a5011 : clone+0x31/0x90 [/lib64/]
 0x7f22c3026011 : clone+0x31/0x90 [/lib64/]
 0x7f20966a5011 : clone+0x31/0x90 [/lib64/]
WARNING: Missing unwind data for module, rerun with 'stap -d /usr/lib/systemd/systemd'
 0x7f4e59945ed4 : __fork+0xb4/0x320 [/lib64/]
 0x4364f3 [/usr/lib/systemd/systemd+0x364f3/0x113000]
 0x7f22c2ff7ed4 : __fork+0xb4/0x320 [/lib64/]
 0x7f22c3a01c35 [/usr/lib64/]
 0x7fb1bdfb6011 : clone+0x31/0x90 [/lib64/]
 0x7f22c3026011 : clone+0x31/0x90 [/lib64/]
 0x7fb1bdfb6011 : clone+0x31/0x90 [/lib64/]
 0x7f3bb6e94011 : clone+0x31/0x90 [/lib64/]
 0x7f3bb6e94011 : clone+0x31/0x90 [/lib64/]
 0x7f783f704ed4 : __fork+0xb4/0x320 [/lib64/]
 0x7f783fd2169b [/usr/lib64/]
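
To make the output above easier to read, here is a minimal Python sketch that splits one symbolized frame into its parts. It assumes the frame layout is `address : symbol+offset/size [module]`, which is inferred from the sample output rather than taken from a formal specification; `parse_frame` and `FRAME_RE` are hypothetical names introduced only for this illustration. Frames that carry no symbol (the ones SystemTap warns about with "Missing unwind data") simply do not match.

```python
import re

# Assumed layout of one symbolized frame, inferred from the sample output:
#   <address> : <symbol>+<offset>/<size> [<module>]
FRAME_RE = re.compile(
    r"^\s*(0x[0-9a-f]+) : (\w+)\+(0x[0-9a-f]+)/(0x[0-9a-f]+) \[(.*)\]$"
)


def parse_frame(line):
    """Return the parts of a symbolized frame, or None if there is no symbol."""
    match = FRAME_RE.match(line)
    if match is None:
        return None
    address, symbol, offset, size, module = match.groups()
    return {
        "address": int(address, 16),
        "symbol": symbol,
        "offset": int(offset, 16),  # distance from the start of the symbol
        "size": int(size, 16),      # size of the symbol
        "module": module,           # path exactly as printed in the output
    }


# A frame resolved through the symbol information that was loaded.
frame = parse_frame(" 0x7f22c2ff7ed4 : __fork+0xb4/0x320 [/lib64/]")
print(frame["symbol"], hex(frame["offset"]))

# A frame from a module without unwind data: only address and module appear.
print(parse_frame(" 0x7f22c3a01c35 [/usr/lib64/]"))
```

The second call returns None, which mirrors the situation in which SystemTap asks you to rerun with an additional -d option.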

Is there any better method to pass “-d OBJECT” options in command line?
User-Space Stack Backtraces



SystemTap Notes (5): Target Variables (1)

An explanation of target variables:

The probe events that map to actual locations in the code (for example kernel.function("function") and kernel.statement("statement")) allow the use of target variables to obtain the value of variables visible at that location in the code. You can use the -L option to list the target variables available at a probe point.

In fact, the name "context variable" is now preferred over "target variable" (see this email). Using target variables requires kernel debuginfo. Consider the following example:

# stap -L 'kernel.function("vfs_read")'
kernel.function("[email protected]/fs/read_write.c:381") $file:struct file* $buf:char* $count:size_t $pos:loff_t*

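Each target variable is prefixed with $ and followed by a colon and the variable's type. For example, the type of the file variable is struct file*. Compare this with the definition of vfs_read: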

ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
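
The `$name:type` layout can be illustrated with a small Python sketch that splits the variable part of a `stap -L` line into name/type pairs. This is only an illustration of how to read the listing, not a SystemTap tool; `parse_target_variables` is a hypothetical helper, and the input is the variable portion of the output shown above.

```python
import re


def parse_target_variables(listing):
    """Split the variable part of a `stap -L` line into (name, type) pairs."""
    # A type may contain spaces ("struct file*"), so each type runs up to the
    # next "$name:" token or to the end of the string.
    pattern = re.compile(r"\$(\w+):(.*?)(?=\s+\$\w+:|$)")
    return [(name, ctype.strip()) for name, ctype in pattern.findall(listing)]


variables = parse_target_variables(
    "$file:struct file* $buf:char* $count:size_t $pos:loff_t*"
)
for name, ctype in variables:
    print(f"${name} has type {ctype}")
```

Each pair lines up with one parameter of the vfs_read prototype above.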

In addition, a target variable that is not local to the current probe point can be accessed with @var("[email protected]/file.c"):

When a target variable is not local to the probe point, like a global external variable or a file local static variable defined in another file, then it can be referenced through @var("[email protected]/file.c").


# stap -e 'probe kernel.function("vfs_read") {
           printf ("current files_stat max_files: %d\n",
                   @var("[email protected]/file_table.c")->max_files);
           exit(); }'
current files_stat max_files: 82002


kernel_char(address)
Obtain the character at address from kernel memory.
kernel_short(address)
Obtain the short at address from kernel memory.
kernel_int(address)
Obtain the int at address from kernel memory.
kernel_long(address)
Obtain the long at address from kernel memory.
kernel_string(address)
Obtain the string at address from kernel memory.
kernel_string_n(address, n)
Obtain the string at address from kernel memory and limit the string to n bytes.


SystemTap Notes (4): Timer Events

A timer event runs its handler periodically. For example:

# stap -e 'probe timer.s(1) { printf("Hello world!\n");}'
Hello world!
Hello world!
Hello world!
Hello world!

The script above prints Hello world! once every second.

Timer events are defined as follows:

timer.jiffies(N)
timer.jiffies(N).randomize(M)



The probe handler is run every N jiffies (a kernel-defined unit of time, typically between 1 and 60 ms). If the “randomize” component is given, a linearly distributed random value in the range [-M..+M] is added to N every time the handler is run. N is restricted to a reasonable range (1 to around a million), and M is restricted to be smaller than N.

Alternatively, intervals may be specified in units of time. There are two probe point variants similar to the jiffies timer:

timer.ms(N)
timer.ms(N).randomize(M)

Here, N and M are specified in milliseconds, but the full options for units are seconds (s/sec), milliseconds (ms/msec), microseconds (us/usec), nanoseconds (ns/nsec), and hertz (hz). Randomization is not supported for hertz timers.
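
The effect of the randomize component can be modeled with a short Python sketch. It assumes that "linearly distributed" means a uniform integer offset in [-M, +M], which is an interpretation of the quoted text rather than a statement about SystemTap's implementation; `next_interval` is a hypothetical name used only here.

```python
import random


def next_interval(n, m=0, rng=random):
    """Return one timer interval: N plus a random offset in [-M, +M]."""
    # The quoted definition restricts M to be smaller than N.
    if not 0 <= m < n:
        raise ValueError("M must be non-negative and smaller than N")
    # Assumption: the offset is drawn uniformly from the integers -M..+M.
    return n + rng.randint(-m, m)


rng = random.Random(0)  # fixed seed so repeated runs give the same sequence
intervals = [next_interval(100, 20, rng) for _ in range(5)]
print(intervals)  # every value lies between 80 and 120
```

With m left at 0 the interval is always exactly N, which corresponds to the plain timer form without randomize.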

Finally, here is an example of how to use timer events (taken from here):

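global count_jiffies, count_ms
# Count one tick for every 100 jiffies.
probe timer.jiffies(100) { count_jiffies ++ }
# Count one tick for every 100 ms.
probe timer.ms(100) { count_ms ++ }
# After 12345 ms, derive CONFIG_HZ from the two counters and exit.
probe timer.ms(12345)
{
  hz=(1000*count_jiffies) / count_ms
  printf ("jiffies:ms ratio %d:%d => CONFIG_HZ=%d\n",
    count_jiffies, count_ms, hz)
  exit ()
}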


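Next, count_jiffies is incremented by 1 every 100 jiffies, so by the time the script exits, a total of 100 * count_jiffies jiffies have elapsed. Since count_ms is incremented every 100 ms, the elapsed time is count_ms / 10 seconds.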

Finally, CONFIG_HZ is computed as: (100 * count_jiffies) / (count_ms / 10) = (1000 * count_jiffies) / count_ms
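
The arithmetic can be checked with a few lines of Python. The counter values below are hypothetical sample numbers chosen for a clean result, not measurements; `estimate_config_hz` is a name introduced only for this sketch.

```python
def estimate_config_hz(count_jiffies, count_ms):
    """Compute CONFIG_HZ in both forms of the derivation."""
    total_jiffies = 100 * count_jiffies   # one tick per 100 jiffies
    elapsed_seconds = count_ms / 10       # one tick per 100 ms
    long_form = total_jiffies / elapsed_seconds
    # The script uses the simplified form with integer division.
    short_form = (1000 * count_jiffies) // count_ms
    return long_form, short_form


# Hypothetical run: 25 jiffies ticks and 100 ms ticks, i.e. 10 seconds.
long_form, short_form = estimate_config_hz(25, 100)
print(long_form, short_form)  # 250.0 250
```

Because the script works with integers, the result is truncated when the counters do not divide evenly, so a real run may report a value slightly below the configured HZ.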


SystemTap Notes (2): Function Probes



kernel refers to the kernel image file (vmlinux), while module refers to the modules under /lib/modules/`uname -r`, i.e. the .ko files.


.call attaches to the entry point of a non-inlined function, while .inline attaches to the first instruction of an inlined function;

.maxactive specifies how many instances of the specified function can be probed simultaneously. You can leave off .maxactive in most cases, as the default (KRETACTIVE) should be sufficient. However, if you notice an excessive number of skipped probes, try setting .maxactive to incrementally higher values to see if the number of skipped probes decreases.

.return is used for return points of non-inlined functions;

An empty suffix is treated as a combination of the .call and .inline suffixes.
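
How the suffixes combine into a probe point can be sketched with a small Python string builder. It is only a model of the syntax described above: `function_probe` is a hypothetical helper, and the rule that .maxactive follows .return is an assumption based on the documented form kernel.function(PATTERN).return.maxactive(VALUE).

```python
def function_probe(function, module=None, suffix=None, maxactive=None):
    """Build a function probe point string from its parts."""
    # "kernel" targets the kernel image; module("name") targets a .ko module.
    target = f'module("{module}")' if module else "kernel"
    point = f'{target}.function("{function}")'
    if suffix is not None:
        if suffix not in ("call", "inline", "return"):
            raise ValueError(f"unknown suffix: {suffix}")
        point += f".{suffix}"
    if maxactive is not None:
        # Assumption: .maxactive is only meaningful on a .return probe.
        if suffix != "return":
            raise ValueError("maxactive requires the return suffix")
        point += f".maxactive({maxactive})"
    return point


print(function_probe("vfs_read"))
print(function_probe("vfs_read", suffix="return", maxactive=20))
print(function_probe("*", module="ahci"))
```

The first line printed has no suffix, so by the rule above it covers both the .call and .inline cases.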



stap -l 'kernel.function("*")' lists all kernel function probes currently available:

linux: # stap -l 'kernel.function("*")'
kernel.function("[email protected]/security/apparmor/include/policy.h:401")
kernel.function("[email protected]/crypto/sha256_generic.c:48")
kernel.function("[email protected]/fs/eventpoll.c:2051")
kernel.function("[email protected]/fs/notify/fanotify/fanotify_user.c:912")
kernel.function("[email protected]/fs/open.c:205")
kernel.function("[email protected]/kernel/futex_compat.c:174")
kernel.function("[email protected]/kernel/futex_compat.c:135")
kernel.function("[email protected]/kernel/compat.c:293")

stap -l 'module("ahci").function("*")' lists all function probes currently available in the ahci module:

linux: # stap -l 'module("ahci").function("*")'
module("ahci").function("[email protected]/drivers/ata/ahci.h:372")
module("ahci").function("[email protected]/drivers/ata/ahci.c:1024")
module("ahci").function("[email protected]/drivers/ata/ahci.c:940")
module("ahci").function("[email protected]/drivers/ata/ahci.c:905")
module("ahci").function("[email protected]/drivers/ata/ahci.c:700")
module("ahci").function("[email protected]/drivers/ata/ahci.c:1075")
module("ahci").function("[email protected]/drivers/ata/ahci.c:1164")
module("ahci").function("[email protected]/drivers/ata/ahci.c:1122")
module("ahci").function("[email protected]/drivers/ata/ahci.c:1211")
module("ahci").function("[email protected]/drivers/ata/ahci.h:386")
module("ahci").function("[email protected]/drivers/ata/ahci.c:604")
module("ahci").function("[email protected]/drivers/ata/ahci.c:775")
module("ahci").function("[email protected]/drivers/ata/ahci.c:677")
module("ahci").function("[email protected]/drivers/ata/ahci.c:649")


SystemTap Notes (1): Probe Definition





A synchronous event occurs when any process executes an instruction at a particular location in kernel code. This gives other events a reference point from which more contextual data may be available.



Asynchronous events are not tied to a particular instruction or location in code. This family of probe points consists mainly of counters, timers, and similar constructs.


SystemTap Scripts