Tag Archives: multi-thread

First taster of Standard C++ Thread Library on OpenBSD

Published / by nanxiao / Leave a Comment

Today I tried Standard C++ Thread Library on OpenBSD, since it requires the compiler to support C++11 standard, and the default c++ only support C++98 (please refer here), so I need to switch to clang++. The program is just a classic “Hello World”:

#include <thread>
#include <iostream>

void hello()
{
    std::cout << "Hello World!\n";
}

int main(void)
{
    std::thread t(hello);
    t.join();
    return 0;
}

Built and run it:

# clang++ -std=c++11 hello.cpp
root:/root/Project# ./a.out
terminate called after throwing an instance of 'std::system_error'
  what():  Enable multithreading to use std::thread: Operation not permitted
Abort trap (core dumped)

Whoops! The program crashed. After reading this post, adding -pthread during compilation fixed this issue:

# clang++ -pthread -std=c++11 hello.cpp
# ./a.out
Hello World!

The pitfalls of using OpenMP parallel for-loops

Published / by nanxiao / Leave a Comment

According to this discussion:

#pragma omp parallel for
for (...)
{
}

is a shortcut of

#pragma omp parallel
{ 
#pragma omp for
    for (...)
    {
    }
}

and it seems more convenient of using “#pragma omp parallel for“. But there are some pitfalls which you should pay attention to:

(1) You can’t assume the number of threads will be equal to for-loops iteration counts even it is very small. For example (The machine has only cores.):

#include <omp.h>
#include <stdio.h>

int main(void) {
#pragma omp parallel for
    for (int i = 0; i < 5; i++) {
        printf("thread is %d\n", omp_get_thread_num());
    }
    return 0;
}

Build and run this program:

# gcc -fopenmp parallel.c
# ./a.out
thread is 0
thread is 0
thread is 0
thread is 1
thread is 1

We can see only 2 threads are generated. Run it in another 32-core machine:

# ./a.out
thread is 1
thread is 0
thread is 2
thread is 4
thread is 3

We can see 5 threads are launched.

(2) Use num_threads clause to modify the program as following:

#include <omp.h>
#include <stdio.h>

int main(void) {
#pragma omp parallel for num_threads(5)
    for (int i = 0; i < 5; i++) {
        printf("thread is %d\n", omp_get_thread_num());
    }
    return 0;
}

Rebuild and run it on original 2-core machine:

# gcc -fopenmp parallel.c
# ./a.out
thread is 2
thread is 3
thread is 4
thread is 1
thread is 0

We can see this time 5 threads are created. But you should notice the actual thread count depends the system resource. E.g., change the code like this:

#pragma omp parallel for num_threads(30000)
    for (int i = 0; i < 30000; i++) {
        printf("thread is %d\n", omp_get_thread_num());
    }

Execute it:

# ./a.out

libgomp: Thread creation failed: Resource temporarily unavailable

So we should notice the the created thread number.

(3) If you use C++ thread_local variable, you should take care:

#include <omp.h>
#include <stdio.h>

int main(void) {
    thread_local int array[5] = {0};
#pragma omp parallel for num_threads(5)
    for (int i = 0; i < 5; i++) {
        array[i] = i + 1;
    }

    for (int i = 0; i < 5; i++) {
        printf("array[%d] is %d\n", i, array[i]);
    }
    return 0;
}

Compile and run:

# g++ -fopenmp parallel.cpp
# ./a.out
array[0] is 1
array[1] is 0
array[2] is 0
array[3] is 0
array[4] is 0

We can see only the first element is changed, so it must be thread 0‘s work. Remove the thread_local qualifier, and rebuild. This time you get the wanted result:

# ./a.out
array[0] is 1
array[1] is 2
array[2] is 3
array[3] is 4
array[4] is 5

Build FFTW supporting multi-thread

Published / by nanxiao / Leave a Comment

According to FFTW document:

Starting from FFTW-3.3.5, FFTW supports a new API to make the planner thread-safe:

void fftwmakeplannerthreadsafe(void);

This call installs a hook that wraps a lock around all planner calls.

To use fftw_make_planner_thread_safe() API, you need to add --enable-threads in configuration stage:

$ ./configure --enable-threads
$ make  
$ sudo make install

There will be two additional libraries (libfftw3_threads.a and libfftw3_threads.la) generated compared with no --enable-threads option:

$ cd /usr/local/lib/
$ ls -alt libfftw*
-rw-r--r-- 1 root root   44528 Jan 16 09:18 libfftw3_threads.a
-rwxr-xr-x 1 root root     946 Jan 16 09:18 libfftw3_threads.la
-rw-r--r-- 1 root root 2222828 Jan 16 09:18 libfftw3.a
-rwxr-xr-x 1 root root     886 Jan 16 09:18 libfftw3.la

BTW, although fftw_make_planner_thread_safe() API was introduced in FFTW-3.3.5, it didn’t really work until FFTW-3.3.6 who contains an important patch. Please refer the discussion and release note.

A trick of building multithreaded application on Solaris

Published / by nanxiao / Leave a Comment

Firstly, Let’s see a simple multithreaded application:

#include <stdio.h>
#include <pthread.h>
#include <errno.h>

void *thread1_func(void *p_arg)
{
           errno = 0;
           sleep(3);
           errno = 1;
           printf("%s exit, errno is %d\n", (char*)p_arg, errno);
}

void *thread2_func(void *p_arg)
{
           errno = 0;
           sleep(5);
           printf("%s exit, errno is %d\n", (char*)p_arg, errno);
}

int main(void)
{
        pthread_t t1, t2;

        pthread_create(&t1, NULL, thread1_func, "Thread 1");
        pthread_create(&t2, NULL, thread2_func, "Thread 2");

        sleep(10);
        return;
}

What output do you expect from this program? Per my understanding, the errnoshould be a thread-safe variable. Though The thread1_func function changes theerrno, it should not affect errno in thread2_func function.

Let’s check it on Solaris 10:

bash-3.2# gcc -g -o a a.c -lpthread
bash-3.2# ./a
Thread 1 exit, errno is 1
Thread 2 exit, errno is 1

Oh! The errno in thread2_func function is also changed to 1. Why does it happen? Let’s find the root cause from the errno.h file:

/*
 * Error codes
 */

#include <sys/errno.h>

#ifdef  __cplusplus
extern "C" {
#endif

#if defined(_LP64)
/*
 * The symbols _sys_errlist and _sys_nerr are not visible in the
 * LP64 libc.  Use strerror(3C) instead.
 */
#endif /* _LP64 */

#if defined(_REENTRANT) || defined(_TS_ERRNO) || _POSIX_C_SOURCE - 0 >= 199506L
extern int *___errno();
#define errno (*(___errno()))
#else
extern int errno;
/* ANSI C++ requires that errno be a macro */
#if __cplusplus >= 199711L
#define errno errno
#endif
#endif  /* defined(_REENTRANT) || defined(_TS_ERRNO) */

#ifdef  __cplusplus
}
#endif

#endif  /* _ERRNO_H */

We can find the errno can be a thread-safe variable(#define errno (*(___errno()))) only when the following macros defined:

defined(_REENTRANT) || defined(_TS_ERRNO) || _POSIX_C_SOURCE - 0 >= 199506L

Let’s try it:

bash-3.2# gcc -D_POSIX_C_SOURCE=199506L -g -o a a.c -lpthread
bash-3.2# ./a
Thread 1 exit, errno is 1
Thread 2 exit, errno is 0

Yes, the output is right!

From Compiling a Multithreaded Application, we can see:

For POSIX behavior, compile applications with the -D_POSIX_C_SOURCE flag set >= 199506L. For Solaris behavior, compile multithreaded programs with the -D_REENTRANT flag.

So we should pay more attentions when building multithreaded application on Solaris.

Reference:
(1) Compiling a Multithreaded Application;
(2) What is the correct way to build a thread-safe, multiplatform C library?