Use the “-g -O2” options when compiling your project with gcc

Honestly, I seldom used the “-g -O2” options when compiling projects with gcc, because optimization makes the generated instructions diverge from the source code, which is annoying during debugging. But over the past half year my work has involved a lot of encryption and decryption, which are compute-intensive tasks, and I found that the -O2 option really gives a big performance improvement.

For example, when the project is compiled with only the -g option, the whole computation takes more than 30 minutes; with the “-g -O2” options it takes less than 3 minutes, roughly a 10x speedup.

So when you care about your program’s performance, try the “-g -O2” options: -g enlarges the executable file but won’t make it run slower, and if the program crashes it still gives you enough debug information; -O2 is the “best safe optimization” option. Keep in mind, though, that you may run into tricky bugs that occur only in optimized code.
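If you want to measure the difference on your own machine, a small timing harness is enough. The sketch below is only an illustration: mix_rounds is a made-up compute-heavy loop (not my real encryption code), and time_mix_ms is a hypothetical helper name. Build it once with g++ -g and once with g++ -g -O2 and compare the reported times.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical compute-heavy kernel: a xorshift-style mixing loop.
// It stands in for real work such as encryption; it is not a cipher.
std::uint64_t mix_rounds(std::uint64_t seed, std::uint64_t rounds) {
    std::uint64_t x = seed;
    for (std::uint64_t i = 0; i < rounds; ++i) {
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        x += i;
    }
    return x;
}

// Runs the kernel once and returns the elapsed wall-clock time in
// milliseconds. The result is written to *result so that the optimizer
// cannot discard the loop as dead code.
double time_mix_ms(std::uint64_t rounds, std::uint64_t* result) {
    auto start = std::chrono::steady_clock::now();
    *result = mix_rounds(0x9E3779B97F4A7C15ULL, rounds);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Call time_mix_ms with a large round count (hundreds of millions, say) from main() and print the result. The actual speedup depends on the code and the machine, so don’t expect the same 10x I saw.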

Hope you can try and enjoy it!

References:
Using -g and -O2 options in gcc;
Is a program compiled with -g gcc flag slower than the same program compiled without -g?.

A “std::bad_alloc” issue caused by a typo

Last week, I fixed a bug that was caused by a typo. The simplified code looks like this:

#include <vector>
using namespace std;

class A {
    int i;
public:
    A(int i): i(i) {}
};

class B {
    vector<A> v;
public:
    B(vector<A> v1): v(v) {}
};

int main() {
    vector<A> a(1, 0);
    B b(a);
    return 0;
}

Please note the constructor of B:

B(vector<A> v1): v(v) {}

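It was supposed to use v1 to initialize v, but I mistyped it as v(v). Since there is no parameter named v, the v inside the parentheses refers to the member itself, so the vector is copy-constructed from its own uninitialized storage. That is undefined behavior; here it presumably read a garbage size and tried to allocate a huge buffer. My compiler is gcc 6.3.1. Compile and run it: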

# g++ -g test.cpp
# ./a.out
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

Change B(vector<A> v1): v(v) to B(vector<A> v1): v(v1), and all is OK.
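One way to make this kind of typo harder to write is to give members a visibly different name from parameters. The following is a minimal sketch of the corrected classes under that assumption; the trailing-underscore name items_ and the size() accessor are my own additions for illustration, not part of the original code.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

class A {
    int i;
public:
    A(int i) : i(i) {}
};

class B {
    // Hypothetical convention: members carry a trailing underscore,
    // so a self-initialization like items_(items_) stands out in review.
    std::vector<A> items_;
public:
    // The member is initialized from the parameter, not from itself.
    B(std::vector<A> items) : items_(std::move(items)) {}

    // Accessor added only so the result can be checked.
    std::size_t size() const { return items_.size(); }
};
```

With this version, B b(a) for a one-element vector a holds exactly one element. It is also worth compiling with warnings enabled (for example -Wall); whether a particular gcc version reports this exact self-initialization is an assumption on my side, not something I verified with gcc 6.3.1.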

A caveat when building OpenMP programs

When building an OpenMP program, you must make sure to pass the -fopenmp option at both the compile and the link stage (refer to stackoverflow), otherwise you may get bitten.

Take the following example:

#include <unistd.h>
#include <omp.h>

int main(void){

        #pragma omp parallel num_threads(4)
        for(;;)
        {
            sleep(1);
        }

        return 0;
}

Use gcc to build it (compiling and linking in one step):

gcc -fopenmp parallel.c

Execute the program and check the number of threads:

$ ./a.out &
[1] 5684
$ ps -T 5684
  PID  SPID TTY      STAT   TIME COMMAND
 5684  5684 pts/16   Sl     0:00 ./a.out
 5684  5685 pts/16   Sl     0:00 ./a.out
 5684  5686 pts/16   Sl     0:00 ./a.out
 5684  5687 pts/16   Sl     0:00 ./a.out

There are 4 threads, as expected.

Then we create a neat Makefile that separates the compile and link stages:

all:
        gcc -fopenmp -c parallel.c -o parallel.o
        gcc parallel.o
clean:
        rm *.o a.out

Run the Makefile:

$ make
gcc -fopenmp -c parallel.c -o parallel.o
gcc parallel.o
parallel.o: In function `main':
parallel.c:(.text+0x19): undefined reference to `GOMP_parallel'
collect2: error: ld returned 1 exit status
make: *** [Makefile:3: all] Error 1

During the link phase, gcc complained that it couldn’t find GOMP_parallel. So we need to add -fopenmp to the link command too:

all:
        gcc -fopenmp -c parallel.c -o parallel.o
        gcc -fopenmp parallel.o
clean:
        rm *.o a.out

This time all is OK:

$ make
gcc -fopenmp -c parallel.c -o parallel.o
gcc -fopenmp parallel.o
$ ./a.out &
[2] 6502
$ ps -T 6502
  PID  SPID TTY      STAT   TIME COMMAND
 6502  6502 pts/16   Sl     0:00 ./a.out
 6502  6503 pts/16   Sl     0:00 ./a.out
 6502  6504 pts/16   Sl     0:00 ./a.out
 6502  6505 pts/16   Sl     0:00 ./a.out

You can also use the ldd tool to check a.out’s dynamic libraries:

$ ldd a.out
    linux-vdso.so.1 (0x00007ffd9c0dd000)
    libgomp.so.1 => /usr/lib/libgomp.so.1 (0x00007fe5554ee000)
    libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007fe5552d0000)
    libc.so.6 => /usr/lib/libc.so.6 (0x00007fe554f2c000)
    libdl.so.2 => /usr/lib/libdl.so.2 (0x00007fe554d28000)
    /lib64/ld-linux-x86-64.so.2 (0x00007fe55571c000)

libgomp provides the definition of GOMP_parallel.
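The opposite mistake, forgetting -fopenmp at the compile stage, is quieter: the compiler ignores the OpenMP pragmas and the program simply runs serially. The sketch below shows one way to detect that; built_with_openmp and count_parallel_threads are hypothetical helper names. It relies on the _OPENMP macro, which the compiler defines when OpenMP support is enabled, and it deliberately avoids omp.h functions so that it still links without libgomp.

```cpp
// Reports whether this translation unit was compiled with OpenMP
// enabled (for gcc/g++, that means -fopenmp was passed).
bool built_with_openmp() {
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}

// Counts how many threads enter the parallel region.
// Without -fopenmp the pragmas are ignored, the block runs once on
// the calling thread, and the function returns 1.
int count_parallel_threads() {
    int count = 0;
    #pragma omp parallel num_threads(4)
    {
        #pragma omp atomic
        ++count;
    }
    return count;
}
```

If built_with_openmp() returns false, or count_parallel_threads() returns 1 where you asked for 4, the compile flags are the first thing to check. Even with OpenMP enabled the runtime may provide fewer threads than requested.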

Why doesn’t “~/.profile” take effect in Arch Linux?

I followed the Rust tutorial to install Rust on my Arch Linux machine, and found that the Rust directory had indeed been added to PATH in the ~/.profile file:

$ cat ~/.profile

export PATH="$HOME/.cargo/bin:$PATH"

But after logging in again, the Rust folder was not in the $PATH variable:

$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/bin:/opt/cuda/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl

It didn’t take effect!

After reading the Arch Linux Bash documentation, I noticed the following statement describing ~/.bash_profile:

If this file does not exist, ~/.bash_login and ~/.profile are checked in that order.

So this means that if ~/.bash_profile exists, ~/.bash_login and ~/.profile won’t be checked, right? Let’s check:

(1) Use the strace command to record which files under the current user’s home directory are opened at login.

$ strace -o out -e open bash -l
$ grep "/home/xiaonan" out
open("/home/xiaonan/.bash_profile", O_RDONLY) = 3
open("/home/xiaonan/.bashrc", O_RDONLY) = 3
open("/home/xiaonan/.bash_history", O_RDONLY) = 3
open("/home/xiaonan/.bash_history", O_RDONLY) = 3
$ exit
logout

As expected, ~/.profile, i.e., /home/xiaonan/.profile, wasn’t opened.

(2) Rename ~/.bash_profile so that it appears not to exist, and check whether ~/.profile is read:

$ mv .bash_profile .bash_profile.bak
$ strace -o out -e open bash -l
$ grep "/home/xiaonan" out
open("/home/xiaonan/.bash_profile", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/xiaonan/.bash_login", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/home/xiaonan/.profile", O_RDONLY) = 3
open("/home/xiaonan/.bash_history", O_RDONLY) = 3
open("/home/xiaonan/.bash_history", O_RDONLY) = 3
$ echo $PATH
/home/xiaonan/.cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/opt/cuda/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
$ exit
logout

As it turned out, when ~/.bash_profile didn’t exist:

open("/home/xiaonan/.bash_profile", O_RDONLY) = -1 ENOENT (No such file or directory)

bash tried ~/.bash_login and then ~/.profile in sequence, and $HOME/.cargo/bin was finally added to $PATH.

The solution is to add the following statement to ~/.bash_profile:

[[ -f ~/.profile ]] && . ~/.profile

Now it works!
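The lookup rule observed in the strace output can be modeled in a few lines. This is only a sketch of the rule, not bash’s actual implementation; pick_login_startup_file is a hypothetical name, and the exists callback is an assumption that lets the rule be tried out without touching real files.

```cpp
#include <functional>
#include <string>

// Models the rule quoted above: a login shell reads the FIRST file that
// exists among ~/.bash_profile, ~/.bash_login and ~/.profile, and does
// not look at the remaining ones. Returns an empty string if none exist.
std::string pick_login_startup_file(
        const std::string& home,
        const std::function<bool(const std::string&)>& exists) {
    const char* candidates[] = {".bash_profile", ".bash_login", ".profile"};
    for (const char* name : candidates) {
        std::string path = home + "/" + name;
        if (exists(path)) {
            return path;  // later candidates are never checked
        }
    }
    return "";
}
```

When both ~/.bash_profile and ~/.profile exist, the function returns the former, which is exactly why ~/.profile has to be sourced explicitly from ~/.bash_profile.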

What is the effect of extern “C”?

I often see the following code in C header files:

#ifdef __cplusplus
extern "C" {
#endif

......

#ifdef __cplusplus
}
#endif

What does it mean? Since the __cplusplus macro is involved, it must be related to C++ compilation. Let’s look at a simple program (print.c):

$ cat print.c
#include <stdio.h>

void print(void)
{
        printf("Hello world!\n");
}

Use gcc to generate the object file:

$ gcc -c print.c
$ 

Then create a main.cpp file which calls print() in its main() function:

$ cat main.cpp
extern void print(void);

int main(void)
{
        print();
        return 0;
}

Compile main.cpp and link with print.o:

$ g++ main.cpp print.o
/tmp/cc60fu19.o: In function `main':
main.cpp:(.text+0x5): undefined reference to `print()'
collect2: error: ld returned 1 exit status

Weird, right? The print() function is definitely defined in print.o, so why can’t g++ find it? Let’s do a little magic and add "C" to extern void print(void);:

$ cat main.cpp
extern "C" void print(void);

int main(void)
{
        print();
        return 0;
}

Try compiling main.cpp again:

$ g++ main.cpp print.o
$ ./a.out
Hello world!

It works now! The root cause is name mangling. Put simply, when C++ code is compiled, the names of functions, global variables, etc. are modified and no longer match their original form, whereas this doesn’t happen when compiling C code. extern "C" tells the C++ compiler to look for the original name rather than the mangled one. To get a sense of name mangling, you can check the name of print() in the object file:

$ readelf -s print.o | grep print
 1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS print.c
 9: 0000000000000000    17 FUNC    GLOBAL DEFAULT    1 print

Then use g++ to compile print.c, and check function name again:

$ g++ -c print.c
$ readelf -s print.o | grep print
 1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS print.c
 9: 0000000000000000    17 FUNC    GLOBAL DEFAULT    1 _Z5printv

You can see that the print() function’s name is actually _Z5printv when g++ generates the object file.
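Mangling exists largely because C++ allows overloading: two functions with the same name need different symbols, so the parameter types are encoded into the name. The sketch below illustrates this with made-up functions (c_add and add are hypothetical names). The mangled names in the comments are what I would expect from g++ on Linux, which follows the Itanium C++ ABI; other compilers may differ.

```cpp
// C linkage: the symbol is emitted under its plain name, so C code
// (or a linker looking for a C symbol) can find it as "c_add".
extern "C" int c_add(int a, int b) {
    return a + b;
}

// C++ linkage: each overload gets its own mangled symbol.
// Expected with g++ on Linux: _Z3addii
int add(int a, int b) {
    return a + b;
}

// Expected with g++ on Linux: _Z3adddd
double add(double a, double b) {
    return a + b;
}
```

Compile it with g++ -c and run readelf -s on the object file to see the three symbols. One consequence of C linkage is that c_add cannot be overloaded, since there is only one unmangled name to go around.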

References:
Why do we use extern “C”?;
In C++ source, what is the effect of extern “C”?.