A brief introduction to zypper

After installing SUSE (my version is SLES 12), the CD/DVD will be added as a default repository:

linux-uibj:~ # zypper repos -d
# | Alias       | Name        | Enabled | Refresh | Priority | Type  | URI                                                         | Service
--+-------------+-------------+---------+---------+----------+-------+-------------------------------------------------------------+--------
1 | SLES12-12-0 | SLES12-12-0 | Yes     | No      |   99     | yast2 | cd:///?devices=/dev/disk/by-id/ata-VBOX_CD-ROM_VB2-01700376 |       

So after you insert the installation ISO into the CD-ROM drive, you can use the zypper in command to install software:

linux-uibj:~ # zypper se systemtap
Loading repository data...
Reading installed packages...

S | Name              | Summary                              | Type
--+-------------------+--------------------------------------+-----------
  | systemtap         | Instrumentation System               | package
  | systemtap         | Instrumentation System               | srcpackage
  | systemtap-docs    | Documents and examples for systemtap | package
  | systemtap-docs    | Documents and examples for systemtap | srcpackage
  | systemtap-runtime | Runtime environment for systemtap    | package
  | systemtap-server  | Systemtap server                     | package
linux-uibj:~ # zypper in systemtap
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following 3 NEW packages are going to be installed:
  libebl1 systemtap systemtap-runtime

3 new packages to install.
Overall download size: 1.4 MiB. Already cached: 0 B  After the operation, additional 5.7 MiB will be used.
Continue? [y/n/? shows all options] (y):

After removing the repository, the zypper in command no longer works, even though the ISO is still in the CD-ROM drive:

linux-uibj:~ # zypper rr 1
Removing repository 'SLES12-12-0' ...................................................................................................[done]
Repository 'SLES12-12-0' has been removed.
linux-uibj:~ # zypper in systemtap
Warning: No repositories defined. Operating only with the installed resolvables. Nothing can be installed.

You can also add a repository by its URL:

linux-uibj:~ # zypper ar http://xxx.net/mrepo/SLE-12-Server-x86_64/disc1/ SLES12-1
Adding repository 'SLES12-1' ........................................................................................................[done]
Repository 'SLES12-1' successfully added
Enabled: Yes
Autorefresh: No
GPG check: Yes
URI: http://xxx.net/mrepo/SLE-12-Server-x86_64/disc1/

linux-uibj:~ # zypper in systemtap
Building repository 'SLES12-1' cache ................................................................................................[done]
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following 3 NEW packages are going to be installed:
  libebl1 systemtap systemtap-runtime

3 new packages to install.
Overall download size: 1.4 MiB. Already cached: 0 B  After the operation, additional 5.7 MiB will be used.
Continue? [y/n/? shows all options] (y):

Many *-devel packages are on the SDK ISO, so you should also add the SDK ISO as a repository.

 

Use SystemTap to track forking process

The SystemTap website provides a forktracker.stp script for tracking process creation. The original script looked like this (it has since been modified):

probe kprocess.create
{
  printf("%-25s: %s (%d) created %d\n",
         ctime(gettimeofday_s()), execname(), pid(), new_pid)
}

probe kprocess.exec
{
  printf("%-25s: %s (%d) is exec'ing %s\n",
         ctime(gettimeofday_s()), execname(), pid(), filename)
}

After executing it, the output confused me:

......
Thu Oct 22 05:09:42 2015 : virt-manager (8713) created 8713
Thu Oct 22 05:09:42 2015 : virt-manager (8713) created 8713
Thu Oct 22 05:09:42 2015 : virt-manager (8713) created 8713
Thu Oct 22 05:09:43 2015 : virt-manager (8713) created 8713
......

Why did the parent and child processes have the same process ID, 8713? At first, I thought it was because of a peculiarity of fork: it is called once but returns twice. So I wrote a simple program to test whether fork was the cause:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;

    pid = fork();
    if (pid < 0) {
        exit(1);
    } else if (pid > 0) {
        printf("Parent exits!\n");
        exit(0);
    }

    printf("hello world\n");
    return 0;
}   

This time, the script printed the following:

......
Thu Oct 22 05:27:10 2015 : bash (3855) created 8955
Thu Oct 22 05:27:10 2015 : bash (8955) is exec'ing "./test"
Thu Oct 22 05:27:10 2015 : test (8955) created 8956
......

The parent and child had different process IDs, so the fork system call was not to blame.
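
The same check can also be made from inside a program, without SystemTap. Below is a minimal sketch, assuming a POSIX system. The function name fork_and_compare and its parameters are made up for illustration. The child sends its own getpid() value back through a pipe so the parent can compare it with its own PID and with the value fork returned.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork once and collect three PIDs:
 *   parent_pid - getpid() as seen by the parent
 *   from_fork  - the value fork() returned to the parent
 *   from_child - getpid() as seen by the child, sent back over a pipe
 * Returns 0 on success, -1 on failure.
 */
int fork_and_compare(pid_t *parent_pid, pid_t *from_fork, pid_t *from_child)
{
    int fds[2];
    pid_t pid, reported = -1;
    ssize_t n;
    int status;

    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: report our own PID to the parent, then exit. */
        pid_t self = getpid();
        close(fds[0]);
        n = write(fds[1], &self, sizeof(self));
        _exit(n == (ssize_t)sizeof(self) ? 0 : 1);
    }

    /* Parent: read the PID the child saw, then reap the child. */
    close(fds[1]);
    n = read(fds[0], &reported, sizeof(reported));
    close(fds[0]);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (n != (ssize_t)sizeof(reported))
        return -1;

    *parent_pid = getpid();
    *from_fork = pid;
    *from_child = reported;
    return 0;
}
```

On success, from_fork and from_child should be equal and both should differ from parent_pid, which matches what the script printed for the test program.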

After I asked on the SystemTap mailing list, Josh Stone gave me the answer, which is related to how Linux implements threads: in Linux, a thread is actually also a process, so you can think of a multi-threaded program as a thread group. The whole thread group has one thread-group ID (in SystemTap, the values of pid() and new_pid), and every thread has its own unique ID (in SystemTap, the values of tid() and new_tid). Josh Stone also modified the script like this:

probe kprocess.create {
  printf("%-25s: %s (%d:%d) created %d:%d\n",
         ctime(gettimeofday_s()), execname(), pid(), tid(), new_pid, new_tid)
}

probe kprocess.exec {
  printf("%-25s: %s (%d) is exec'ing %s\n",
         ctime(gettimeofday_s()), execname(), pid(), filename)
}

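To verify it, I wrote a multi-threaded program:

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>     /* sleep() */

/* Each thread prints its name every 10 seconds, forever. */
void *thread_func(void *p_arg)
{
        while (1)
        {
                printf("%s\n", (char*)p_arg);
                sleep(10);
        }
        return NULL;    /* not reached */
}

int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, thread_func, "Thread 1");
        pthread_create(&t2, NULL, thread_func, "Thread 2");

        /* Keep the main thread alive so the worker threads keep running. */
        sleep(1000);
        return 0;
}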

The script output was like this:

......
Sat Oct 24 10:56:35 2015 : bash (889) is exec'ing "./test"
Sat Oct 24 10:56:35 2015 : test (889:889) created 889:890
Sat Oct 24 10:56:35 2015 : test (889:889) created 889:891
......

From the output, we can see that the main thread had the same tid() and pid() value: 889. All the threads had the same pid, 889, but each thread had a unique tid: 889, 890, and 891.
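
The same relationship can be observed from user space. Below is a minimal sketch, assuming Linux with glibc. The names struct ids, record_ids and collect_thread_ids are made up for illustration. It reads the thread ID through syscall(SYS_gettid) and the thread-group ID through getpid(), once in the calling thread and once in a newly created thread.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the two IDs seen by one thread. */
struct ids {
    pid_t pid;  /* thread-group ID, what SystemTap calls pid() */
    pid_t tid;  /* per-thread ID, what SystemTap calls tid()   */
};

/* Record the IDs of the calling thread into *arg. */
static void *record_ids(void *arg)
{
    struct ids *out = arg;

    out->pid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/*
 * Hypothetical helper: record the IDs of the calling thread, then start
 * one worker thread and record its IDs too.
 * Returns 0 on success, -1 if the thread could not be created or joined.
 */
int collect_thread_ids(struct ids *caller_ids, struct ids *worker_ids)
{
    pthread_t worker;

    record_ids(caller_ids);
    if (pthread_create(&worker, NULL, record_ids, worker_ids) != 0)
        return -1;
    if (pthread_join(worker, NULL) != 0)
        return -1;
    return 0;
}
```

When called from the main thread, caller_ids should hold equal pid and tid values, while worker_ids should hold the same pid but a different tid. Depending on the glibc version, the program may need to be built with -pthread.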

Reference:
How to understand the pid() and new_pid are same value in executing forktracker.stp?.

 

Install SystemTap on SUSE

The SUSE version used here is SLES (SUSE Linux Enterprise Server).

(1) Install C/C++ Compiler and Tools:

[Screenshot: installing C/C++ Compiler and Tools]

(2) Install SystemTap tools:

# zypper in systemtap*
......

(3) Install kernel debug info packages:

/mnt/suse/x86_64 # ls | grep kernel
kernel-default-base-debuginfo-3.12.49-3.1.x86_64.rpm
kernel-default-debuginfo-3.12.49-3.1.x86_64.rpm
kernel-default-debugsource-3.12.49-3.1.x86_64.rpm
kernel-xen-base-debuginfo-3.12.49-3.1.x86_64.rpm
kernel-xen-debuginfo-3.12.49-3.1.x86_64.rpm
kernel-xen-debugsource-3.12.49-3.1.x86_64.rpm
kernelshark-debuginfo-2.0.4-3.95.x86_64.rpm
nfs-kernel-server-debuginfo-1.3.0-13.1.x86_64.rpm
/mnt/suse/x86_64 # rpm -ivh kernel*
Preparing...                          ################################# [100%]
......

You can also use zypper in kernel-*-debug*.

(4) Test:

/mnt/suse/x86_64 # stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}'
Pass 1: parsed user script and 102 library script(s) using 78240virt/28440res/2708shr/26436data kb, in 160usr/20sys/184real ms.
Pass 2: analyzed script: 1 probe(s), 1 function(s), 3 embed(s), 0 global(s) using 175768virt/126996res/3688shr/123964data kb, in 1650usr/250sys/1902real ms.
Pass 3: using cached /root/.systemtap/cache/38/stap_38af4dc0b3509fcb42d451417e95bbab_1375.c
Pass 4: using cached /root/.systemtap/cache/38/stap_38af4dc0b3509fcb42d451417e95bbab_1375.ko
Pass 5: starting run.
read performed
Pass 5: run completed in 20usr/290sys/638real ms.

All is OK!

elfutils-libelf isn’t libelf!

These days, I am struggling to build ktap on SUSE. Since SUSE provides neither elfutils-libelf-devel nor libelf-devel, I decided to build libelf from source.

First, I installed the newest version (0.8.13) from the libelf site. Building ktap generated the following errors:

cc  -Wall -O2  -o userspace/kp_main.o -c userspace/kp_main.c
In file included from /usr/local/include/libelf/libelf.h:25:0,
                 from /usr/local/include/libelf/gelf.h:28,
                 from /usr/local/include/gelf.h:1,
                 from userspace/kp_symbol.h:28,
                 from userspace/kp_main.c:40:
/usr/lib64/gcc/x86_64-suse-linux/4.8/include/stddef.h:147:26: error: conflicting types for ‘ptrdiff_t’
 typedef __PTRDIFF_TYPE__ ptrdiff_t;
                          ^
In file included from userspace/kp_main.c:37:0:
userspace/../include/ktap_types.h:20:13: note: previous declaration of ‘ptrdiff_t’ was here
 typedef int ptrdiff_t;
             ^
Makefile:119: recipe for target 'userspace/kp_main.o' failed
make: *** [userspace/kp_main.o] Error 1

The root cause is that in this version, libelf.h includes stddef.h, which defines ptrdiff_t, while ktap_types.h also defines it. So I decided to use an older version of libelf.

Second, I installed version 0.8.5 of libelf. This time, the compilation generated the following errors:
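
To see why the two definitions clash, it helps to look at what the compiler's own ptrdiff_t is. Below is a minimal sketch, not taken from ktap or libelf. The function names are made up for illustration. On x86_64 Linux with GCC, __PTRDIFF_TYPE__ is long, so a hand-written "typedef int ptrdiff_t" names a different type and the compiler reports conflicting types.

```c
#include <stddef.h>

/* Width in bytes of the ptrdiff_t that <stddef.h> provides. */
size_t ptrdiff_width(void)
{
    return sizeof(ptrdiff_t);
}

/*
 * Returns 1 when int and the standard ptrdiff_t have different widths,
 * which means a "typedef int ptrdiff_t" cannot name the same type.
 */
int int_differs_from_ptrdiff(void)
{
    return sizeof(int) != sizeof(ptrdiff_t);
}

/* Distance between two elements of the same array, in elements. */
ptrdiff_t element_distance(const int *first, const int *last)
{
    return last - first;
}
```

Using the ptrdiff_t from stddef.h everywhere, as element_distance does, avoids the clash because there is then only one definition of the type.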

userspace/kp_symbol.c: In function ‘find_load_address’:
userspace/kp_symbol.c:69:2: warning: implicit declaration of function ‘elf_getphdrnum’ [-Wimplicit-function-declaration]
  if (elf_getphdrnum(elf, &phdrnum))
  ^
userspace/kp_symbol.c: At top level:
userspace/kp_symbol.c:189:44: error: unknown type name ‘GElf_Nhdr’
 static const char *sdt_note_name(Elf *elf, GElf_Nhdr *nhdr, const char *data)
                                            ^
userspace/kp_symbol.c: In function ‘dso_sdt_notes’:
userspace/kp_symbol.c:215:2: error: unknown type name ‘GElf_Nhdr’
  GElf_Nhdr nhdr;
  ^
userspace/kp_symbol.c:222:2: warning: implicit declaration of function ‘elf_getshdrstrndx’ [-Wimplicit-function-declaration]
  if (elf_getshdrstrndx(elf, &shstrndx) != 0)
  ^
userspace/kp_symbol.c:238:3: warning: implicit declaration of function ‘gelf_getnote’ [-Wimplicit-function-declaration]
   (next = gelf_getnote(data, offset, &nhdr, &name_off, &desc_off)) > 0;
   ^
userspace/kp_symbol.c:243:11: error: request for member ‘n_namesz’ in something not a structure or union
   if (nhdr.n_namesz != sizeof(SDT_NOTE_NAME) ||
           ^
userspace/kp_symbol.c:248:3: warning: implicit declaration of function ‘sdt_note_name’ [-Wimplicit-function-declaration]
   name = sdt_note_name(elf, &nhdr, sdt_note_data(data, desc_off));
   ^
userspace/kp_symbol.c:248:8: warning: assignment makes pointer from integer without a cast [enabled by default]
   name = sdt_note_name(elf, &nhdr, sdt_note_data(data, desc_off));
        ^
userspace/kp_symbol.c:253:10: error: request for member ‘n_descsz’ in something not a structure or union
      nhdr.n_descsz, nhdr.n_type);
          ^
userspace/kp_symbol.c:253:25: error: request for member ‘n_type’ in something not a structure or union
      nhdr.n_descsz, nhdr.n_type);
                         ^
Makefile:134: recipe for target 'userspace/kp_symbol.o' failed
make: *** [userspace/kp_symbol.o] Error 1

After googling, I found this issue: elfutils-libelf isn’t libelf! So I installed elfutils-libelf from source (please refer to this post), and then ktap built successfully!