Linux | Nan Xiao's Blog

The anatomy of “Hello World” python program in bcc

The bcc gives a simple Hello World program analysis, but I want to give a more detailed anatomy of it:

from bcc import BPF

BPF(text='int kprobe__sys_clone(void *ctx) { bpf_trace_printk("Hello, World!\\n"); return 0; }').trace_print()

(1)

from bcc import BPF

bcc is a package, and it will be installed in Python‘s site-packages directory. Take my server as an example:

>>> import bcc
>>> bcc.__path__
['/usr/lib/python3.6/site-packages/bcc']

BPF is a class defined in __init__.py file of bcc:

......
class BPF(object):
# From bpf_prog_type in uapi/linux/bpf.h
    SOCKET_FILTER = 1
    KPROBE = 2
......

(2) Divide

BPF(text='int kprobe__sys_clone(void *ctx) { bpf_trace_printk("Hello, World!\\n"); return 0; }').trace_print()

into 2 parts:

BPF(text='int kprobe__sys_clone(void *ctx) { bpf_trace_printk("Hello, World!\\n"); return 0; }')

and

trace_print()

BPF(text='int kprobe__sys_clone(void *ctx) { bpf_trace_printk("Hello, World!\\n"); return 0; }')

This statement will create a BPF object, and the code is simplified as following:

......
from .libbcc import lib, _CB_TYPE, bcc_symbol, _SYM_CB_TYPE 
......
class BPF(object):
......
    def __init__(self, src_file="", hdr_file="", text=None, cb=None, debug=0,
        cflags=[], usdt_contexts=[]):
        ......

        if text:
            self.module = lib.bpf_module_create_c_from_string(text.encode("ascii"),
            self.debug, cflags_array, len(cflags_array))
        else:
            ......

        self._trace_autoload()

Check libbcc.py:

lib = ct.CDLL("libbcc.so.0", use_errno=True)

We can see lib module is indeed libbcc dynamic library, and bpf_module_create_c_from_string is implemented as this:

void * bpf_module_create_c_from_string(const char *text, unsigned flags, const char *cflags[], int ncflags) {
    auto mod = new ebpf::BPFModule(flags);
    if (mod->load_string(text, cflags, ncflags) != 0) {
        delete mod;
        return nullptr;
    }
    return mod;
}

We won’t delve it too much, and just know it returns a ebpf module is enough.

For self._trace_autoload() part, we can check _trace_autoload method definition:

def _trace_autoload(self):
    for i in range(0, lib.bpf_num_functions(self.module)):
        func_name = str(lib.bpf_function_name(self.module, i).decode())
        if func_name.startswith("kprobe__"):
            fn = self.load_func(func_name, BPF.KPROBE)
            self.attach_kprobe(event=fn.name[8:], fn_name=fn.name)
        elif func_name.startswith("kretprobe__"):
            fn = self.load_func(func_name, BPF.KPROBE)
            self.attach_kretprobe(event=fn.name[11:], fn_name=fn.name)
        elif func_name.startswith("tracepoint__"):
            fn = self.load_func(func_name, BPF.TRACEPOINT)
            tp = fn.name[len("tracepoint__"):].replace("__", ":")
            self.attach_tracepoint(tp=tp, fn_name=fn.name)

Based on above code, we can see bcc now supports kprobe, kretprobe and tracepoint. For Hello World example, it actually execute attach_kprobemethod:

def attach_kprobe(self, event="", fn_name="", event_re="",
        pid=-1, cpu=0, group_fd=-1):

    ......

    event = str(event)
    self._check_probe_quota(1)
    fn = self.load_func(fn_name, BPF.KPROBE)
    ev_name = "p_" + event.replace("+", "_").replace(".", "_")
    res = lib.bpf_attach_kprobe(fn.fd, 0, ev_name.encode("ascii"),
            event.encode("ascii"), pid, cpu, group_fd,
            self._reader_cb_impl, ct.cast(id(self), ct.py_object))
    res = ct.cast(res, ct.c_void_p)
    ......
    self._add_kprobe(ev_name, res)
    return self

Finally, it will call libbcc‘s bpf_attach_kprobe function. Until now, the kprobe is attached successfully, and attach_kprobe will return a BPF object.

(b)

BPF(...).trace_print()

The trace_print method is not too hard:

def trace_print(self, fmt=None):
    ......
    try:
        while True:
            if fmt:
                fields = self.trace_fields(nonblocking=False)
                if not fields: continue
                line = fmt.format(*fields)
            else:
                line = self.trace_readline(nonblocking=False)
            print(line)
            sys.stdout.flush()
    except KeyboardInterrupt:
        exit()

For fmt=None case, it just calls trace_readline(), which read from trace pipe, and prints it on the stdout:

print(line)
sys.stdout.flush()

That’s all, and hope you enjoy bcc and ebpf!

Install bcc on ArchLinux

To install bcc on ArchLinux, firstly you need to setup yaourt from AUR:

$ git clone https://aur.archlinux.org/yaourt.git
$ cd yaourt
$ makepkg -si

Then execute yaourt bcc command:

# yaourt bcc
1 aur/bcc 0.3.0-1 [installed] (17) (2.51)
    BPF Compiler Collection - C library and examples
2 aur/bcc-git v0.1.8.r330.52cd371-1 (2) (0.06)
    BPF Compiler Collection - C library and examples
3 aur/bcc-tools 0.3.0-1 [installed] (17) (2.51)
    BPF Compiler Collection - Tools
4 aur/bcc-tools-git v0.1.8.r330.52cd371-1 (2) (0.06)
    BPF Compiler Collection - Tools
......

Select the order number of bcc, bcc-tools, python-bcc and python2-bcc, and install them.

Once finished, the bcc would be installed in the directory of /usr/share/bcc:

# ls
examples  man  tools

To facilitate your daily work, you can add man pages and tools in your .bashrc file:

MANPATH=/usr/share/bcc/man:$MANPATH
PATH=/usr/share/bcc/tools:$PATH

You can also install from source code:

git clone https://github.com/iovisor/bcc.git
mkdir bcc/build; cd bcc/build
cmake .. -DCMAKE_INSTALL_PREFIX=/usr
make
sudo make install

P.S. To run tools shipped in bcc, you need to install kernel header files:

sudo pacman -S linux-headers

Update on 28/10/2020：
Currently, bcc is the official package and split to 3 pieces: bcc, bcc-tools and python-bcc. Please install all 3 packages. If you forgot to install python-bcc, when running commands from bcc-tools, you will come across following errors:

# /usr/share/bcc/tools/execsnoop
Traceback (most recent call last):
  File "/usr/share/bcc/tools/execsnoop", line 21, in <module>
    from bcc import BPF
ModuleNotFoundError: No module named 'bcc'

Use perf and FlameGraph to profile program on Linux

In most Linux environments, the perf tools should be set up by default. Otherwise, you can install it manually. E.g., in ArchLinux:

# pacman -S perf

Use following program as an example (It is a rifacimento from here, and you should only focus on the framework of the code):

# cat test.cpp
#include <NTL/ZZX.h>

using namespace std;
using namespace NTL;

void inner(int i, ZZX& t, Vec<ZZX>& phi)
{
        for (long j = 1; j <= i-1; j++)
         if (i % j == 0)
            t *= phi(j);
}

void outer(int i, Vec<ZZX>& phi)
{
        ZZX t;
        t = 1;
        inner(i, t, phi);
        phi(i) = (ZZX(INIT_MONO, i) - 1)/t;
        cout << phi(i) << "\n";
}

int main()
{
   Vec<ZZX> phi(INIT_SIZE, 100);

   for (long i = 1; i <= phi.length(); i++) {
      outer(i, phi);
   }
}

Compile it:

# g++ -g -O2 -pthread test.cpp -lntl -lgmp

It is suggested that using -g -O2 options since -g can provide debug information which perf needs and -O2 can generate lots of optimizations.

Use perf record to sample the program:

# perf record --call-graph dwarf ./a.out
......
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.318 MB perf.data (38 samples) ]

To profile an already running program, use -p pid flag. A perf.data file will be generated in current directory, and you can use perf report command to parse it:

# perf report

The detailed information of every function will be showed:

Another awesome tool is FlameGraph which is used to analyze stack call traces:

# git clone --depth 1 https://github.com/brendangregg/FlameGraph
# cd FlameGraph

Copy perf.data into current directory:

# cp ../perf.data ./

Execute following command:

# perf script | ./stackcollapse-perf.pl |./flamegraph.pl > perf.svg

The perf.svg is like this:

You can see the whole stack frameworks and functions’ consume time ratio.

P.S., the full code is here.

Use Source Insight as the editor to develop Unix softwares

Source Insight is my favorite editor, and I have used it for more than 10 years. But when employing it to develop Unix software, you will run into annoying line break issue, which is on windows, the newline is \r\n while in Unix it is \n only. Therefore you will see the file edited in Source Insight will display an extra ^M in Unix environment:

#include <stdio.h>^M
^M
int main(void)^M
{^M
        printf("\r\n");^M
}^M

To resolve this problem, you can refer this topic in stackoverflow:

To save a file with a specific end-of-line type in Source Insight, select File -> Save As…, then where it says “Save as type”, select the desired end-of-line type.

To set the end-of-line type for new files you create in Source Insight, select Options -> Preferences and click the Files tab. Where it says “Default file format” select the desired end-of-line type.

So you can set Unix file format as you wanted:

Another caveat you should pay attention is if you use git Windows client, by default, it will convert the newline of project from \n to \r\n directly. My solution is just disabling this auto conversion feature:

git config --global core.autocrlf false

The core dump file in Arch Linux

Arch Linux uses systemd, and the core dump files will be stored in /var/lib/systemd/coredump directory by default.

Let’s look at the following simple program:

int main(void) {
        int *p = 0;
        *p = 1;
}

Compile and run it:

$ gcc -g -o test test.c
$ ./test
Segmentation fault (core dumped)

Definitely, the program crashes! Check the core dump files:

$ coredumpctl list
TIME                            PID   UID   GID SIG COREFILE EXE
......
Mon 2017-01-09 16:13:44 SGT    7307  1014  1014  11 present  /home/xiaonan/test
$ ls -alt /var/lib/systemd/coredump/core.test*
-rw-r-----+ 1 root root     22821 Jan  9 16:13 /var/lib/systemd/coredump/core.test.1014.183b4c57ccad464abd2eba2c104a47a8.7307.1483949624000000000000.lz4

According to the file name and time, we can find the program’s corresponding core dump binary. To debug it, we can use “coredumpctl gdb PID” command:

$ coredumpctl gdb 7307
       PID: 7307 (test)
       UID: 1014 (xiaonan)
       GID: 1014 (xiaonan)
    Signal: 11 (SEGV)
 Timestamp: Mon 2017-01-09 16:13:44 SGT (6min ago)
Command Line: ./test
Executable: /home/xiaonan/test
 Control Group: /user.slice/user-1014.slice/session-c4.scope
          Unit: session-c4.scope
         Slice: user-1014.slice
       Session: c4
     Owner UID: 1014 (xiaonan)
       Boot ID: 183b4c57ccad464abd2eba2c104a47a8
    Machine ID: 25671e5feadb4ae4bbe2c9ee6de97d66
      Hostname: supermicro-sys-1028gq-trt
       Storage: /var/lib/systemd/coredump/core.test.1014.183b4c57ccad464abd2eba2c104a47a8.7307.1483949624000000000000.lz4
       Message: Process 7307 (test) of user 1014 dumped core.

            Stack trace of thread 7307:
            #0  0x00000000004004b6 main (test)
            #1  0x00007f67ba722291 __libc_start_main (libc.so.6)
            #2  0x00000000004003da _start (test)

GNU gdb (GDB) 7.12
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/xiaonan/test...done.
[New LWP 7307]
Core was generated by `./test'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000004004b6 in main () at test.c:3
3               *p = 1;
(gdb) bt
#0  0x00000000004004b6 in main () at test.c:3

BTW, if you omit PID, “coredumpctl gdb” will launch the newest core dump file.

Attention should be paid. The core dump size is limited to 2GiB, so if it is too large, it will be truncated, and generate the warning like this (Please refer this bug):

......
BFD: Warning: /var/tmp/coredump-ZrhAM4 is truncated: expected core file size >= 2591764480, found: 2147483648.
......

References:
GDB and trouble with core dumps;
Core dump.