debug | Nan Xiao's Blog

The gotcha of logging gdb output

By default, gdb appends to its log file rather than overwriting it. For example, debug the same program twice:

$ gdb foo
......
(gdb) set logging on
Copying output to gdb.txt.
Copying debug output to gdb.txt.
(gdb) r
......
$ ll gdb.txt
-rw-rw-r-- 1 nanxiao nanxiao 1067 Jul  9 18:06 gdb.txt
$ gdb foo
......
(gdb) set logging on
Copying output to gdb.txt.
Copying debug output to gdb.txt.
(gdb) r
......
$ ll gdb.txt
-rw-rw-r-- 1 nanxiao nanxiao 2134 Jul  9 18:08 gdb.txt

After the second session, the size of gdb.txt has doubled (from 1067 to 2134 bytes). To overwrite the output file instead, execute set logging overwrite on before set logging on:

$ gdb foo
......
(gdb) set logging overwrite on
(gdb) set logging on
Copying output to gdb.txt.
Copying debug output to gdb.txt.
(gdb) r
......
$ ll gdb.txt
-rw-rw-r-- 1 nanxiao nanxiao 1067 Jul  9 18:10 gdb.txt

A trick for setting a breakpoint in pdb

When using pdb to debug a Python program:

python -m pdb foo.py

I wanted to set a breakpoint in another file, but got the following error:

(Pdb) b bar.py:46
*** 'bar.py' not found from sys.path

A small trick is to set a breakpoint at main first and let the program run (c, for continue, works as well as r here):

(Pdb) b main
Breakpoint 1 at ......
(Pdb) r
......

Once the breakpoint at main is hit, set the breakpoint at bar.py:46 again. This time it should work:

(Pdb) b bar.py:46
Breakpoint 2 at ......

Use gdb’s convenience functions

Today I tried to set a conditional breakpoint that triggers when a string variable holds one specific value:

b foo.c:488 if (int)strcmp(foo, "foo") == 0

Unfortunately, gdb exited with the following error:

Unable to restore previously selected frame:
Selected thread is running.
terminate called after throwing an instance of 'gdb_exception_error'
Aborted

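After searching Stack Overflow, I found gdb’s convenience functions. Unlike strcmp, $_streq is evaluated by gdb itself and does not call a function in the debugged process. So I used $_streq instead of strcmp: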

b foo.c:488 if $_streq(foo, "foo")

Now gdb works like a charm!

An issue related to uninitialised memory

Today I hit an interesting bug: a C program behaved differently in debug (gcc -O0) and release (gcc -O3) builds.

First, I compared the logs of the two builds and pinned down the function in which they began to diverge.

Second, I debugged both programs side by side in gdb and compared the variables’ values. One variable held different values in the two builds, which sent the programs down different branches of an if-else statement. This was the immediate cause of the divergence.

My gut feeling was that the release build might be reading a stale value, but after reviewing the code carefully I found the real reason: the heap-allocated block of memory that the variable belonged to was never initialised, which leads to the notorious “undefined behaviour”.

As far as I know, variables are left uninitialised for two reasons:
(1) The programmer forgets;
(2) The programmer reckons the variable will be assigned a correct value before use, and initialising a block of memory may carry a performance penalty.

Anyway, the lesson I learnt today: unless you are 100% sure that leaving a variable uninitialised is safe, initialise it. This can save you several hours in the future.
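
As an illustration of that lesson, here is a minimal sketch of two common ways to make sure a heap block starts from a known state. The struct record type, its fields, and the function names are hypothetical and not taken from the program described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record; the field names are illustrative only. */
struct record {
    int ready;  /* flag that is later tested in an if-else */
    int count;
};

/* calloc() returns zero-filled memory, so every field starts at 0.
 * A plain malloc() would leave the contents indeterminate instead. */
struct record *record_new_zeroed(void)
{
    return calloc(1, sizeof(struct record));
}

/* Equivalent for a block obtained from malloc(): clear it explicitly. */
struct record *record_new_memset(void)
{
    struct record *r = malloc(sizeof *r);
    if (r != NULL)
        memset(r, 0, sizeof *r);
    return r;
}

/* The branch now depends on a known value in every build. */
int record_is_ready(const struct record *r)
{
    assert(r != NULL);
    return r->ready != 0;
}
```

With either constructor, record_is_ready() returns 0 for a fresh record at both -O0 and -O3, because the flag no longer depends on whatever happened to be in the heap block.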

Use libunwind to debug memory leak issue

In our project, there is a shared object with a reference counter, which is incremented when someone acquires the object and decremented when it is released. Once the reference counter reaches 0, the shared object can be reaped. Then we ran into the classic memory leak issue, i.e., the memory used by shared objects kept growing. To debug this issue, I used libunwind.

The principle is simple: print the stack trace of every increment/decrement operation. I borrowed code from Programmatic access to the call stack in C++ and made some tweaks, mostly formatting the stack traces and writing them to a file. The output looks like this:

$ cat /tmp/backtrace.log
0x55ad59ec2556: (foo+0x9)
0x55ad59ec2562: (bar+0x9)
0x55ad59ec2579: (main+0x14)
0x7f941161ee0a: (__libc_start_main+0xea)
0x55ad59ec214a: (_start+0x2a)
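
The principle of logging a stack trace on every counter change can be sketched as follows. This is not the libunwind code mentioned above: to avoid an external dependency it substitutes glibc's backtrace() and backtrace_symbols() from <execinfo.h>, and struct shared_obj, obj_acquire() and obj_release() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical shared object; only the counter matters here. */
struct shared_obj {
    int refcnt;
};

/* Write a tagged stack trace of the current call chain to the log. */
void log_backtrace(FILE *log, const char *tag, int refcnt)
{
    void *frames[32];
    int n = backtrace(frames, 32);
    char **symbols = backtrace_symbols(frames, n);

    fprintf(log, "%s refcnt=%d\n", tag, refcnt);
    if (symbols != NULL) {
        for (int i = 0; i < n; i++)
            fprintf(log, "  %s\n", symbols[i]);
        free(symbols);  /* one free() releases the whole array */
    }
}

/* Increment the counter and record who acquired the object. */
void obj_acquire(struct shared_obj *obj, FILE *log)
{
    obj->refcnt++;
    log_backtrace(log, "acquire", obj->refcnt);
}

/* Decrement the counter and record who released the object.
 * Returns 1 when the object can be reaped, 0 otherwise. */
int obj_release(struct shared_obj *obj, FILE *log)
{
    assert(obj->refcnt > 0);
    obj->refcnt--;
    log_backtrace(log, "release", obj->refcnt);
    return obj->refcnt == 0;
}
```

Pairing up the "acquire" and "release" traces in the log shows which call path takes a reference without giving it back. With glibc, function names only appear in the symbols if the program is linked with -rdynamic; otherwise the raw addresses are still printed.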

A quick way to map an address back to a specific position in the source code is gdb: attach to the program, then use the “info line” command:

$ gdb program -p pid
......
(gdb) info line *0x55ad59ec2556
......

P.S., the code can be downloaded here.