SystemTap needs debug-info package when user-space probing

SystemTap needs debug-info package when user-space probing.E.g.:

 # stap -d /bin/ls --ldd -e 'probe process("ls").function("xmalloc") {print_usyms(ubacktrace())}' -c "ls /"
semantic error: while resolving probe point: identifier 'process' at <input>:1:7
        source: probe process("ls").function("xmalloc") {print_usyms(ubacktrace())}
                      ^

semantic error: no match (similar functions: malloc, calloc, realloc, close, mbrtowc)
Pass 2: analysis failed.  [man error::pass2]

Install coreutils-debuginfo package:

# zypper in coreutils-debuginfo
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following NEW package is going to be installed:
  coreutils-debuginfo

The following package is not supported by its vendor:
  coreutils-debuginfo

1 new package to install.
Overall download size: 2.1 MiB. Already cached: 0 B. After the operation, additional 18.6 MiB will be used.
Continue? [y/n/? shows all options] (y): y
Retrieving package coreutils-debuginfo-8.22-9.1.x86_64                                                (1/1),   2.1 MiB ( 18.6 MiB unpacked)
Retrieving: coreutils-debuginfo-8.22-9.1.x86_64.rpm ...................................................................[done (105.3 KiB/s)]
Checking for file conflicts: ........................................................................................................[done]
(1/1) Installing: coreutils-debuginfo-8.22-9.1 ......................................................................................[done]

Then execute the stap command again:

# stap -d /bin/ls --ldd -e 'probe process("ls").function("xmalloc") {print_usyms(ubacktrace())}' -c "ls /"
bin  boot  dev  etc  home  lib  lib64  lost+found  mnt  opt  proc  root  run  sbin  selinux  srv  sys  tmp  usr  var
 0x4114a0 : xmalloc+0x0/0x20 [/usr/bin/ls]
 0x411674 : xmemdup+0x14/0x30 [/usr/bin/ls]
 0x40ee4a : clone_quoting_options+0x2a/0x40 [/usr/bin/ls]
 0x403828 : main+0xa58/0x2140 [/usr/bin/ls]
 0x7fad37eefb05 : __libc_start_main+0xf5/0x1c0 [/lib64/libc-2.19.so]
 0x404f39 : _start+0x29/0x30 [/usr/bin/ls]
 0x4114a0 : xmalloc+0x0/0x20 [/usr/bin/ls]
 0x411674 : xmemdup+0x14/0x30 [/usr/bin/ls]
 0x40ee4a : clone_quoting_options+0x2a/0x40 [/usr/bin/ls]
 0x403887 : main+0xab7/0x2140 [/usr/bin/ls]
 0x7fad37eefb05 : __libc_start_main+0xf5/0x1c0 [/lib64/libc-2.19.so]
 0x404f39 : _start+0x29/0x30 [/usr/bin/ls]
 0x4114a0 : xmalloc+0x0/0x20 [/usr/bin/ls]
 0x4039e4 : main+0xc14/0x2140 [/usr/bin/ls]
 0x7fad37eefb05 : __libc_start_main+0xf5/0x1c0 [/lib64/libc-2.19.so]
.....

It works now!

 

Use SystemTap to track forking process

The SystemTap website provides a forktracker.stp script to track the forking process flow, and the original script is like this (P.S.: now, the script has been modified):

probe kprocess.create
{
  printf("%-25s: %s (%d) created %d\n",
         ctime(gettimeofday_s()), execname(), pid(), new_pid)
}

probe kprocess.exec
{
  printf("%-25s: %s (%d) is exec'ing %s\n",
         ctime(gettimeofday_s()), execname(), pid(), filename)
}

After executing it, the output confused me:

......
Thu Oct 22 05:09:42 2015 : virt-manager (8713) created 8713
Thu Oct 22 05:09:42 2015 : virt-manager (8713) created 8713
Thu Oct 22 05:09:42 2015 : virt-manager (8713) created 8713
Thu Oct 22 05:09:43 2015 : virt-manager (8713) created 8713
......

Why the father and children processes had the same process ID: 8713. At first, I thought it was because the speciality of fork: call once, return twice. So I wrote a simple program to test whether it was due to fork:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;

    pid = fork();
    if (pid < 0) {
        exit(1);
    } else if (pid > 0) {
        printf("Parent exits!\n");
        exit(0);
    }

    printf("hello world\n");
    return 0;
}   

This time, the script outputed the following:

......
Thu Oct 22 05:27:10 2015 : bash (3855) created 8955
Thu Oct 22 05:27:10 2015 : bash (8955) is exec'ing "./test"
Thu Oct 22 05:27:10 2015 : test (8955) created 8956
......

The father and child had different process IDs, so it is not fork system call’s fault.

After resorting to SystemTap mailing list, Josh Stone gave me the answer, and it is related to Linux thread implementation: In Linux, the thread is actually also a process, so for a multi-thread program in Linux, you can think it as a thread-group. The whole thread-group has a thread-group-id(In SystemTap, pid() and new_pid‘ value), and every thread has a unique ID (In SystemTap, tid() and new_tid‘ value). Josh Stone also modified the script like this:

probe kprocess.create {
  printf("%-25s: %s (%d:%d) created %d:%d\n",
         ctime(gettimeofday_s()), execname(), pid(), tid(), new_pid, new_tid)
}

probe kprocess.exec {
  printf("%-25s: %s (%d) is exec'ing %s\n",
         ctime(gettimeofday_s()), execname(), pid(), filename)
}

To verify it, I wrote a multi-thread program:

#include <stdio.h>
#include <pthread.h>
void *thread_func(void *p_arg)
{
        while (1)
        {
                printf("%s\n", (char*)p_arg);
                sleep(10);
        }
}
int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, thread_func, "Thread 1");
        pthread_create(&t2, NULL, thread_func, "Thread 2");

        sleep(1000);
        return;
}

The script output was like this:

......
Sat Oct 24 10:56:35 2015 : bash (889) is exec'ing "./test"
Sat Oct 24 10:56:35 2015 : test (889:889) created 889:890
Sat Oct 24 10:56:35 2015 : test (889:889) created 889:891
......

From the output, we can see: the main thread had the same tid() and pid() value: 889. All the threads had the same pid: 889, but every thread had unique tid values: 889, 890, 891.

Reference:
How to understand the pid() and new_pid are same value in executing forktracker.stp?.

 

Install SystemTap on Suse

The Suse is SLES(Suse Linux Enterprise Server) version.

(1) Install C/C++ Compiler and Tools:

Capture4-667x500

(2) Install SystemTap tools:

# zypper in systemtap*
......

(3) Install kernel debug info packages:

/mnt/suse/x86_64 # ls | grep kernel
kernel-default-base-debuginfo-3.12.49-3.1.x86_64.rpm
kernel-default-debuginfo-3.12.49-3.1.x86_64.rpm
kernel-default-debugsource-3.12.49-3.1.x86_64.rpm
kernel-xen-base-debuginfo-3.12.49-3.1.x86_64.rpm
kernel-xen-debuginfo-3.12.49-3.1.x86_64.rpm
kernel-xen-debugsource-3.12.49-3.1.x86_64.rpm
kernelshark-debuginfo-2.0.4-3.95.x86_64.rpm
nfs-kernel-server-debuginfo-1.3.0-13.1.x86_64.rpm
/mnt/suse/x86_64 # rpm -ivh kernel*
Preparing...                          ################################# [100%]
......

You can also use zypper in kernel-*-debug*.

(4) Test:

/mnt/suse/x86_64 # stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}'
Pass 1: parsed user script and 102 library script(s) using 78240virt/28440res/2708shr/26436data kb, in 160usr/20sys/184real ms.
Pass 2: analyzed script: 1 probe(s), 1 function(s), 3 embed(s), 0 global(s) using 175768virt/126996res/3688shr/123964data kb, in 1650usr/250sys/1902real ms.
Pass 3: using cached /root/.systemtap/cache/38/stap_38af4dc0b3509fcb42d451417e95bbab_1375.c
Pass 4: using cached /root/.systemtap/cache/38/stap_38af4dc0b3509fcb42d451417e95bbab_1375.ko
Pass 5: starting run.
read performed
Pass 5: run completed in 20usr/290sys/638real ms.

All is OK!