The byproducts of reading OpenBSD netcat code

When I took part in a training last year, I heard about netcat for the first time. During that class, the tutor showed some hacks and tricks of using netcat which appealed to me and motivated me to learn the guts of it. Fortunately, in the past 2 months, I was not so busy that I can spend my spare time to dive into OpenBSD‘s netcat source code, and got abundant byproducts during this process.

(1) Brush up socket programming. I wrote my first network application more than 10 years ago, and always think the socket APIs are marvelous. Just ~10 functions (socket, bind, listen, accept…) with some IO multiplexing buddies (select, poll, epoll…) connect the whole world, wonderful! From that time, I developed a habit that is when touching a new programming language, network programming is an essential exercise. Even though I don’t write socket related code now, reading netcat socket code indeed refresh my knowledge and teach me new stuff.

(2) Write a tutorial about netcat. I am mediocre programmer and will forget things when I don’t use it for a long time. So I just take notes of what I think is useful. IMHO, this “tutorial” doesn’t really mean teach others something, but just a journal which I can refer when I need in the future.

(3) Submit patches to netcat. During reading code, I also found bugs and some enhancements. Though trivial contributions toOpenBSD, I am still happy and enjoy it.

(4) Implement a C++ encapsulation of libtls. OpenBSD‘s netcat supports tls/ssl connection, but it needs you take full care of resource management (memory, socket, etc), otherwise a small mistake can lead to resource leak which is fatal for long-live applications (In fact, the two bugs I reported to OpenBSD are all related resource leak). Therefore I develop a simple C++ library which wraps the libtls and hope it can free developer from this troublesome problem and put more energy in application logic part.

Long story to short, reading classical source code is a rewarding process, and you can consider to try it yourself.

Forgetting “-pthread” option may give you a big surprise!

Today, I wrote a small pthread program to do some testing:

#include <pthread.h>

int main(void)
{
        pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
        pthread_cond_t cv = PTHREAD_COND_INITIALIZER;

        pthread_mutex_lock(&mutex);
        pthread_cond_wait(&cv, &mutex);
        return 0;
} 

Build and test it on OpenBSD-current (version is 6.4):

# cc cv_test.c -o cv_test
# ./cv_test

The program will block there and it is my expected result. Switch to Arch Linux (kernel version is 4.18.9):

# cc cv_test.c -o cv_test
# ./cv_test
#

The program will exit immediately. I doubt it is “spurious awake” firstly, but can’t get a convincing explanation. Using ldd to check program. On OpenBSD:

# ldd cv_test
cv_test:
        Start            End              Type  Open Ref GrpRef Name
        000000d4c3a00000 000000d4c3c02000 exe   1    0   0      cv_test
        000000d6e6007000 000000d6e62f6000 rlib  0    1   0      /usr/lib/libc.so.92.5
        000000d6db100000 000000d6db100000 ld.so 0    1   0      /usr/libexec/ld.so

On Arch Linux:

# ldd cv_test
        linux-vdso.so.1 (0x00007ffde91c6000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f3e3169b000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f3e3187a000)

Nothing special. After seeking help on stackoverflow, the answer is I need adding -pthread option:

# cc -pthread cv_test.c -o cv_test
# ./cv_test

This time it worked perfectly. Checking linked library:

# ldd cv_test
        linux-vdso.so.1 (0x00007fff48be8000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fa46f84c000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007fa46f688000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fa46f888000)

Why doesn’t Linux give me a link error which prompts I need link libpthread? It seems not make sense.

A brief introduction of OpenBSD nohup command

When you execute command in terminal (not background mode), if the connection disconnects unexpectedly, the running process will be terminated by SIGHUP signal. nohup command can let process still keep running when this situation occurs.

OpenBSD‘s nohup implementation is neat. It actually only does 4 things:

(1) If stdout is terminal, redirect it to nohup.out file (created in current directory or specified by HOME environment variable):

......
if (isatty(STDOUT_FILENO))
    dofile();
......

In dofile option:

......
if (dup2(fd, STDOUT_FILENO) == -1)
    err(EXIT_MISC, NULL);
......

(2) If stderr is terminal, redirect it to stdout. In this case, stderr and stdout will point to same file:

if (isatty(STDERR_FILENO) && dup2(STDOUT_FILENO, STDERR_FILENO) == -1) {
    ......
}

(3) Ignore SIGHUP signal:

......
(void)signal(SIGHUP, SIG_IGN);
......

(4) Execute the intended command:

execvp(argv[1], &argv[1]);

That’s all!

OpenBSD saves me again! — Debug a memory corruption issue

Yesterday, I came across a third-part library issue, which crashes at allocating memory:

......
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f594a5a9b6b in _int_malloc () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007f594a5a9b6b in _int_malloc () from /usr/lib/libc.so.6
#1  0x00007f594a5ab503 in malloc () from /usr/lib/libc.so.6
#2  0x00007f594b13f159 in operator new (sz=5767168) at /build/gcc/src/gcc/libstdc++-v3/libsupc++/new_op.cc:50
......

It is obvious that the memory tags are corrupted, but who is the murder? Since the library involves a lot of maths computation, it is not an easy task to grasp the code quickly. So I need to find another way:

(1) Open all warnings during compilation: -Wall. Nothing found.

(2) Use valgrind, but unfortunately, valgrind crashes itself:

......
valgrind: the 'impossible' happened:
   Killed by fatal signal

host stacktrace:
==43326==    at 0x58053139: get_bszB_as_is (m_mallocfree.c:303)
==43326==    by 0x58053139: get_bszB (m_mallocfree.c:315)
==43326==    by 0x58053139: vgPlain_arena_malloc (m_mallocfree.c:1799)
==43326==    by 0x5800BA84: vgMemCheck_new_block (mc_malloc_wrappers.c:372)
==43326==    by 0x5800BD39: vgMemCheck___builtin_vec_new (mc_malloc_wrappers.c:427)
==43326==    by 0x5809F785: do_client_request (scheduler.c:1866)
==43326==    by 0x5809F785: vgPlain_scheduler (scheduler.c:1433)
==43326==    by 0x580AED50: thread_wrapper (syswrap-linux.c:103)
==43326==    by 0x580AED50: run_a_thread_NORETURN (syswrap-linux.c:156)

sched status:
  running_tid=1
......

(3) Change compiler, use clang instead of gcc, and hope it can give me some clues. Still no effect.

(4) Switch Operating System from Linux to OpenBSD, the program crashes again. But this time, it tells me where the memory corruption occurs:

......
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000014b07f01e52d in addMod (r=<error reading variable>, a=4693443247995522, b=28622907746665631,
......

I figure out the issue quickly, and not bother to understand the whole code. OpenBSD saves me again, thanks!