Use SystemTap to track forking process

The SystemTap website provides a forktracker.stp script to track the forking process flow, and the original script is like this (P.S.: now, the script has been modified):

probe kprocess.create
{
  printf("%-25s: %s (%d) created %d\n",
         ctime(gettimeofday_s()), execname(), pid(), new_pid)
}

probe kprocess.exec
{
  printf("%-25s: %s (%d) is exec'ing %s\n",
         ctime(gettimeofday_s()), execname(), pid(), filename)
}

After executing it, the output confused me:

......
Thu Oct 22 05:09:42 2015 : virt-manager (8713) created 8713
Thu Oct 22 05:09:42 2015 : virt-manager (8713) created 8713
Thu Oct 22 05:09:42 2015 : virt-manager (8713) created 8713
Thu Oct 22 05:09:43 2015 : virt-manager (8713) created 8713
......

Why the father and children processes had the same process ID: 8713. At first, I thought it was because the speciality of fork: call once, return twice. So I wrote a simple program to test whether it was due to fork:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;

    pid = fork();
    if (pid < 0) {
        exit(1);
    } else if (pid > 0) {
        printf("Parent exits!\n");
        exit(0);
    }

    printf("hello world\n");
    return 0;
}   

This time, the script outputed the following:

......
Thu Oct 22 05:27:10 2015 : bash (3855) created 8955
Thu Oct 22 05:27:10 2015 : bash (8955) is exec'ing "./test"
Thu Oct 22 05:27:10 2015 : test (8955) created 8956
......

The father and child had different process IDs, so it is not fork system call’s fault.

After resorting to SystemTap mailing list, Josh Stone gave me the answer, and it is related to Linux thread implementation: In Linux, the thread is actually also a process, so for a multi-thread program in Linux, you can think it as a thread-group. The whole thread-group has a thread-group-id(In SystemTap, pid() and new_pid‘ value), and every thread has a unique ID (In SystemTap, tid() and new_tid‘ value). Josh Stone also modified the script like this:

probe kprocess.create {
  printf("%-25s: %s (%d:%d) created %d:%d\n",
         ctime(gettimeofday_s()), execname(), pid(), tid(), new_pid, new_tid)
}

probe kprocess.exec {
  printf("%-25s: %s (%d) is exec'ing %s\n",
         ctime(gettimeofday_s()), execname(), pid(), filename)
}

To verify it, I wrote a multi-thread program:

#include <stdio.h>
#include <pthread.h>
void *thread_func(void *p_arg)
{
        while (1)
        {
                printf("%s\n", (char*)p_arg);
                sleep(10);
        }
}
int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, thread_func, "Thread 1");
        pthread_create(&t2, NULL, thread_func, "Thread 2");

        sleep(1000);
        return;
}

The script output was like this:

......
Sat Oct 24 10:56:35 2015 : bash (889) is exec'ing "./test"
Sat Oct 24 10:56:35 2015 : test (889:889) created 889:890
Sat Oct 24 10:56:35 2015 : test (889:889) created 889:891
......

From the output, we can see: the main thread had the same tid() and pid() value: 889. All the threads had the same pid: 889, but every thread had unique tid values: 889, 890, 891.

Reference:
How to understand the pid() and new_pid are same value in executing forktracker.stp?.

 

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.