Different RVO behaviors between gcc and clang

For background on RVO (Return Value Optimization), I think this video gives a really good explanation. Let’s cut to the chase and look at the following code:

# cat rvo.cpp
#include <iostream>

class Foo
{
public:
        Foo(){std::cout << "Foo default constructor!\n";}
        ~Foo(){std::cout << "Foo destructor!\n";}
        Foo(const Foo&){std::cout << "Foo copy constructor!\n";}
        Foo& operator=(const Foo&){std::cout << "Foo assignment!\n"; return *this;}
        Foo(Foo&&){std::cout << "Foo move constructor!\n";}
        Foo& operator=(Foo&&){std::cout << "Foo move assignment!\n"; return *this;}
};

Foo func(bool flag)
{
        Foo temp;
        if (flag)
        {
                std::cout << "if\n";
        }
        else
        {
                std::cout << "else\n";
        }
        return temp;
}

int main()
{
        auto f = func(true);
        return 0;
}

On my Arch Linux machine, the gcc version is 8.2.1 and the clang version is 8.0.0. I compiled with -std=c++11, -std=c++14, -std=c++17 and -std=c++2a on both compilers, and every build produced the same output:

Foo default constructor!
if
Foo destructor!

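So both compilers are smart enough to construct “Foo temp” directly in the storage of “f” instead of creating a separate object and then copying or moving it (please refer to Small examples show copy elision in C++). This is named return value optimization (NRVO), which the standard allows but does not require. Now modify func a little: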

Foo func(bool flag)
{
        if (flag)
        {
                Foo temp;
                std::cout << "if\n";
                return temp;
        }
        else
        {
                Foo temp;
                std::cout << "else\n";
                return temp;
        }
}

This time, with clang (all four options: -std=c++11, -std=c++14, -std=c++17 and -std=c++2a), the program produced the same output as above:

Foo default constructor!
if
Foo destructor!

But with gcc (all four options: -std=c++11, -std=c++14, -std=c++17 and -std=c++2a), the program produced different output:

Foo default constructor!
if
Foo move constructor!
Foo destructor!
Foo destructor!

This means gcc created two objects, “Foo temp” and “auto f”, and move-constructed the second from the first. So clang optimizes better than gcc in this scenario. More importantly, if you do something in the move constructor and expect it to be called, your program logic will depend on the compiler: it works with gcc but not with clang. Beware of this trap, otherwise it may bite you one day, like it bit me today!
