Forgetting “-pthread” option may give you a big surprise!

Today, I wrote a small pthread program to do some testing:

#include <pthread.h>

int main(void)
{
        pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
        pthread_cond_t cv = PTHREAD_COND_INITIALIZER;

        pthread_mutex_lock(&mutex);
        pthread_cond_wait(&cv, &mutex);
        return 0;
} 

Build and test it on OpenBSD-current (version is 6.4):

# cc cv_test.c -o cv_test
# ./cv_test

The program will block there and it is my expected result. Switch to Arch Linux (kernel version is 4.18.9):

# cc cv_test.c -o cv_test
# ./cv_test
#

The program will exit immediately. I doubt it is “spurious awake” firstly, but can’t get a convincing explanation. Using ldd to check program. On OpenBSD:

# ldd cv_test
cv_test:
        Start            End              Type  Open Ref GrpRef Name
        000000d4c3a00000 000000d4c3c02000 exe   1    0   0      cv_test
        000000d6e6007000 000000d6e62f6000 rlib  0    1   0      /usr/lib/libc.so.92.5
        000000d6db100000 000000d6db100000 ld.so 0    1   0      /usr/libexec/ld.so

On Arch Linux:

# ldd cv_test
        linux-vdso.so.1 (0x00007ffde91c6000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007f3e3169b000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f3e3187a000)

Nothing special. After seeking help on stackoverflow, the answer is I need adding -pthread option:

# cc -pthread cv_test.c -o cv_test
# ./cv_test

This time it worked perfectly. Checking linked library:

# ldd cv_test
        linux-vdso.so.1 (0x00007fff48be8000)
        libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fa46f84c000)
        libc.so.6 => /usr/lib/libc.so.6 (0x00007fa46f688000)
        /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fa46f888000)

Why doesn’t Linux give me a link error which prompts I need link libpthread? It seems not make sense.

configure script may not check pthread correctly on OpenBSD

I have come into at least 2 times that one project was built well on Linux, while can’t find pthread related definitions on OpenBSD, like this:

......
../../runtime/cilk-internal.h:39:6: error: unknown type name 'pthread_mutex_t'
     pthread_mutex_t posix;
     ^
../../runtime/cilk-internal.h:211:6: error: unknown type name 'pthread_t'
     pthread_t *tid;
     ^
../../runtime/cilk-internal.h:216:6: error: unknown type name 'pthread_cond_t'
     pthread_cond_t  waiting_workers_cond;
     ^
../../runtime/cilk-internal.h:217:6: error: unknown type name 'pthread_cond_t'
     pthread_cond_t  wakeup_first_worker_cond;
     ^
../../runtime/cilk-internal.h:218:6: error: unknown type name 'pthread_cond_t'
     pthread_cond_t  wakeup_other_workers_cond;
     ^
../../runtime/cilk-internal.h:219:6: error: unknown type name 'pthread_mutex_t'
     pthread_mutex_t workers_mutex;
     ^
../../runtime/cilk-internal.h:220:6: error: unknown type name 'pthread_cond_t'
     pthread_cond_t  workers_done_cond;
......

The source code is as following:

......
#if HAVE_PTHREAD
#include <pthread.h>
#endif
......

While the generated config.h doesn’t define HAVE_PTHREAD macro:

/* Define if you have POSIX threads libraries and header files. */
/* #undef HAVE_PTHREAD */

But in fact, the OpenBSD has provided all support of pthread. So please be aware of this issue.

First taster of Standard C++ Thread Library on OpenBSD

Today I tried Standard C++ Thread Library on OpenBSD, since it requires the compiler to support C++11 standard, and the default c++ only support C++98 (please refer here), so I need to switch to clang++. The program is just a classic “Hello World”:

#include <thread>
#include <iostream>

void hello()
{
    std::cout << "Hello World!\n";
}

int main(void)
{
    std::thread t(hello);
    t.join();
    return 0;
}

Built and run it:

# clang++ -std=c++11 hello.cpp
root:/root/Project# ./a.out
terminate called after throwing an instance of 'std::system_error'
  what():  Enable multithreading to use std::thread: Operation not permitted
Abort trap (core dumped)

Whoops! The program crashed. After reading this post, adding -pthread during compilation fixed this issue:

# clang++ -pthread -std=c++11 hello.cpp
# ./a.out
Hello World!

The pitfalls of using OpenMP parallel for-loops

According to this discussion:

#pragma omp parallel for
for (...)
{
}

is a shortcut of

#pragma omp parallel
{ 
#pragma omp for
    for (...)
    {
    }
}

and it seems more convenient of using “#pragma omp parallel for“. But there are some pitfalls which you should pay attention to:

(1) You can’t assume the number of threads will be equal to for-loops iteration counts even it is very small. For example (The machine has only cores.):

#include <omp.h>
#include <stdio.h>

int main(void) {
#pragma omp parallel for
    for (int i = 0; i < 5; i++) {
        printf("thread is %d\n", omp_get_thread_num());
    }
    return 0;
}

Build and run this program:

# gcc -fopenmp parallel.c
# ./a.out
thread is 0
thread is 0
thread is 0
thread is 1
thread is 1

We can see only 2 threads are generated. Run it in another 32-core machine:

# ./a.out
thread is 1
thread is 0
thread is 2
thread is 4
thread is 3

We can see 5 threads are launched.

(2) Use num_threads clause to modify the program as following:

#include <omp.h>
#include <stdio.h>

int main(void) {
#pragma omp parallel for num_threads(5)
    for (int i = 0; i < 5; i++) {
        printf("thread is %d\n", omp_get_thread_num());
    }
    return 0;
}

Rebuild and run it on original 2-core machine:

# gcc -fopenmp parallel.c
# ./a.out
thread is 2
thread is 3
thread is 4
thread is 1
thread is 0

We can see this time 5 threads are created. But you should notice the actual thread count depends the system resource. E.g., change the code like this:

#pragma omp parallel for num_threads(30000)
    for (int i = 0; i < 30000; i++) {
        printf("thread is %d\n", omp_get_thread_num());
    }

Execute it:

# ./a.out

libgomp: Thread creation failed: Resource temporarily unavailable

So we should notice the the created thread number.

P.S., if the iteration number is smaller than core number, the number of threads will be equal to core number (still in 32-core machine):

#include <omp.h>
#include <stdio.h>

int main(void) {
#pragma omp parallel for
    for (int i = 0; i < 4; i++) {
        if (0 == omp_get_thread_num()) {
            printf("thread number is %d\n", omp_get_num_threads());
        }
    }
    return 0;
}

The output is:

thread number is 32

(3) If you use C++ thread_local variable, you should take care:

#include <omp.h>
#include <stdio.h>

int main(void) {
    thread_local int array[5] = {0};
#pragma omp parallel for num_threads(5)
    for (int i = 0; i < 5; i++) {
        array[i] = i + 1;
    }

    for (int i = 0; i < 5; i++) {
        printf("array[%d] is %d\n", i, array[i]);
    }
    return 0;
}

Compile and run:

# g++ -fopenmp parallel.cpp
# ./a.out
array[0] is 1
array[1] is 0
array[2] is 0
array[3] is 0
array[4] is 0

We can see only the first element is changed, so it must be thread 0‘s work. Remove the thread_local qualifier, and rebuild. This time you get the wanted result:

# ./a.out
array[0] is 1
array[1] is 2
array[2] is 3
array[3] is 4
array[4] is 5

Build FFTW supporting multi-thread

According to FFTW document:

Starting from FFTW-3.3.5, FFTW supports a new API to make the planner thread-safe:

void fftwmakeplannerthreadsafe(void);

This call installs a hook that wraps a lock around all planner calls.

To use fftw_make_planner_thread_safe() API, you need to add --enable-threads in configuration stage:

$ ./configure --enable-threads
$ make  
$ sudo make install

There will be two additional libraries (libfftw3_threads.a and libfftw3_threads.la) generated compared with no --enable-threads option:

$ cd /usr/local/lib/
$ ls -alt libfftw*
-rw-r--r-- 1 root root   44528 Jan 16 09:18 libfftw3_threads.a
-rwxr-xr-x 1 root root     946 Jan 16 09:18 libfftw3_threads.la
-rw-r--r-- 1 root root 2222828 Jan 16 09:18 libfftw3.a
-rwxr-xr-x 1 root root     886 Jan 16 09:18 libfftw3.la

BTW, although fftw_make_planner_thread_safe() API was introduced in FFTW-3.3.5, it didn’t really work until FFTW-3.3.6 who contains an important patch. Please refer the discussion and release note.