Invoke profile function in Nsight

When developing a CUDA program in Nsight, you can use the built-in profile function to profile the program.

You can also toggle between the C/C++ view and the Profile view in the right corner.

BTW, if you only want to profile part of the program rather than the whole, you can surround that code with cudaProfilerStart() and cudaProfilerStop(), then untick “Start execution with profiling enabled” in “Profile Configuration”.

Process large data in external memory

This week, I implemented a small program which handles a data set. The data set is so large that it can’t be stored in main memory.

I first tried stxxl, an awesome library which mimics the STL and processes data in external memory. But it has many limitations; for example, the element type should be a plain old data (POD) type. Since my type doesn’t provide a default constructor, stxxl can’t satisfy my needs (please refer to this discussion). I also attempted other workarounds, but all of them failed.

Finally, I used a simple method: open a file, serialize the data set into it, and treat the file like main memory. Although it is not the most efficient approach, the program is very clear and not prone to bugs. So I decided to use it as a demo and improve it gradually.
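The post doesn’t show the actual program, so the following is only a minimal sketch of the "treat the file like main memory" idea. The Record type, the FileArray class and all of their members are hypothetical names chosen for illustration, and the sketch assumes fixed-size, trivially copyable records. A type with non-trivial members would need real serialization instead of a raw byte copy.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical fixed-size record; the real data type is not shown in the post.
struct Record
{
    std::uint64_t id;
    double value;
};

// A file that is used as if it were one big array of Record.
class FileArray
{
public:
    // Creates (or truncates) the backing file and opens it for reading and writing.
    explicit FileArray(const std::string& path)
        : file_(path, std::ios::in | std::ios::out | std::ios::binary | std::ios::trunc)
    {
    }

    bool ok() const { return file_.is_open(); }

    std::size_t size() const { return size_; }

    // Serialize one record to the end of the file.
    void append(const Record& r)
    {
        file_.clear();
        file_.seekp(0, std::ios::end);
        file_.write(reinterpret_cast<const char*>(&r), sizeof(Record));
        ++size_;
    }

    // Read the record at position `index` back from the file.
    // Returns false if the index is out of range or the read fails.
    bool read(std::size_t index, Record& out)
    {
        if (index >= size_)
            return false;
        file_.clear();
        file_.seekg(static_cast<std::streamoff>(index * sizeof(Record)), std::ios::beg);
        file_.read(reinterpret_cast<char*>(&out), sizeof(Record));
        return static_cast<bool>(file_);
    }

private:
    std::fstream file_;
    std::size_t size_ = 0;
};
```

With something like this, the rest of the program can call append() while producing data and read(i, r) wherever it would otherwise index an in-memory array. Every access is a seek plus a small read, which is simple but slow, and that matches the trade-off described above.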

Update: Splitting the large file into smaller ones and using multiple threads to handle them is a good idea.

 

Be careful about the order of object files when linking

Consider the following A.h:

# cat A.h
#pragma once

#include <iostream>
#include <vector>

class A
{
public:
        std::vector<int> v;
        A()
        {
                v.push_back(1);
                std::cout << "Enter A's constructor...\n";
        }
        int getFirstElem()
        {
                v.push_back(2);
                std::cout << "Enter A's getFirstElem...\n";
                return v[0];
        }
        ~A()
        {
                std::cout << "Enter A's destructor...\n";
        }
};

int func();

And A.cpp:

# cat A.cpp
#include "A.h"

static A a;

int func()
{
        return a.getFirstElem();
}

A.cpp just defines a static instance of A named a, and a function func() which returns the first element of a’s internal vector.

Now look at another file which uses A.h and A.cpp:

# cat hello.cpp
#include <iostream>
#include "A.h"

static int gP = func();

int main()
{
    std::cout << gP << std::endl;
    return 0;
}

Compile them:

# clang++ -c hello.cpp
# clang++ -c A.cpp

Link hello.o first and execute the program:

# clang++ hello.o A.o
# ./a.out
Enter A's getFirstElem...
Enter A's constructor...
2
Enter A's destructor...

Then link A.o first and execute the program:

# clang++ A.o hello.o
# ./a.out
Enter A's constructor...
Enter A's getFirstElem...
1
Enter A's destructor...

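The results are different. In the first case, a’s getFirstElem() function is called before its constructor has even run. C++ does not specify the order in which globals defined in different translation units are dynamically initialized; with this toolchain the order followed the object files on the link line, so gP in hello.o was initialized before a in A.o. Using an object before it has been constructed is undefined behavior, so please pay attention to this!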
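One common way to avoid depending on link order is the "construct on first use" idiom, which is not part of the original example. The sketch below puts everything in one file for brevity. instance() and globalValue() are hypothetical helper names, and the std::cout tracing from the original class is left out.

```cpp
#include <vector>

class A
{
public:
    std::vector<int> v;
    A()
    {
        v.push_back(1);
    }
    int getFirstElem()
    {
        v.push_back(2);
        return v[0];
    }
};

// Construct on first use: a function-local static is initialized the first
// time control passes through its declaration, so the object always exists
// before anyone can call a member function on it (thread-safe since C++11).
static A& instance()
{
    static A a;
    return a;
}

int func()
{
    return instance().getFirstElem();
}

// In the original layout this global would live in hello.cpp.
static int gP = func();

// Hypothetical accessor so the value can be inspected from other code.
int globalValue()
{
    return gP;
}
```

With this change func() should return 1 regardless of whether A.o or hello.o comes first on the link line, because the A object is created inside the first call to instance() rather than at some unspecified point during startup.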

Clang seems to generate more user-friendly error messages than gcc

Consider the following simple C++ program:

#include <string>
#include <utility>

class A 
{
public:
    A (int a) {};
};

int main()
{

    std::pair<std::string, A> p;
    return 0;
}

Compiling it with the newest gcc (7.3.0) generates the following errors:

$ g++ test.cpp
test.cpp: In function ‘int main()’:
test.cpp:13:31: error: no matching function for call to ‘std::pair<std::__cxx11::basic_string<char>, A>::pair()’
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:431:9: note: candidate: template<class ... _Args1, long unsigned int ..._Indexes1, class ... _Args2, long unsigned int ..._Indexes2> std::pair<_T1, _T2>::pair(std::tuple<_Args1 ...>&, std::tuple<_Args2 ...>&, std::_Index_tuple<_Indexes1 ...>, std::_Index_tuple<_Indexes2 ...>)
         pair(tuple<_Args1...>&, tuple<_Args2...>&,
         ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:431:9: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 4 arguments, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:364:9: note: candidate: template<class ... _Args1, class ... _Args2> std::pair<_T1, _T2>::pair(std::piecewise_construct_t, std::tuple<_Args1 ...>, std::tuple<_Args2 ...>)
         pair(piecewise_construct_t, tuple<_Args1...>, tuple<_Args2...>);
         ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:364:9: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 3 arguments, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:359:21: note: candidate: template<class _U1, class _U2, typename std::enable_if<(std::_PCC<((! std::is_same<std::__cxx11::basic_string<char>, _U1>::value) || (! std::is_same<A, _U2>::value)), std::__cxx11::basic_string<char>, A>::_MoveConstructiblePair<_U1, _U2>() && (! std::_PCC<((! std::is_same<std::__cxx11::basic_string<char>, _U1>::value) || (! std::is_same<A, _U2>::value)), std::__cxx11::basic_string<char>, A>::_ImplicitlyMoveConvertiblePair<_U1, _U2>())), bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair(std::pair<_U1, _U2>&&)
  explicit constexpr pair(pair<_U1, _U2>&& __p)
                     ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:359:21: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 1 argument, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:349:12: note: candidate: template<class _U1, class _U2, typename std::enable_if<(std::_PCC<((! std::is_same<std::__cxx11::basic_string<char>, _U1>::value) || (! std::is_same<A, _U2>::value)), std::__cxx11::basic_string<char>, A>::_MoveConstructiblePair<_U1, _U2>() && std::_PCC<((! std::is_same<std::__cxx11::basic_string<char>, _U1>::value) || (! std::is_same<A, _U2>::value)), std::__cxx11::basic_string<char>, A>::_ImplicitlyMoveConvertiblePair<_U1, _U2>()), bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair(std::pair<_U1, _U2>&&)
  constexpr pair(pair<_U1, _U2>&& __p)
            ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:349:12: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 1 argument, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:339:21: note: candidate: template<class _U1, class _U2, typename std::enable_if<(_MoveConstructiblePair<_U1, _U2>() && (! _ImplicitlyMoveConvertiblePair<_U1, _U2>())), bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair(_U1&&, _U2&&)
  explicit constexpr pair(_U1&& __x, _U2&& __y)
                     ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:339:21: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 2 arguments, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:330:12: note: candidate: template<class _U1, class _U2, typename std::enable_if<(_MoveConstructiblePair<_U1, _U2>() && _ImplicitlyMoveConvertiblePair<_U1, _U2>()), bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair(_U1&&, _U2&&)
  constexpr pair(_U1&& __x, _U2&& __y)
            ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:330:12: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 2 arguments, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:321:17: note: candidate: template<class _U2, typename std::enable_if<_CopyMovePair<false, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, _U2>(), bool>::type <anonymous> > std::pair<_T1, _T2>::pair(const _T1&, _U2&&)
        explicit pair(const _T1& __x, _U2&& __y)
                 ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:321:17: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 2 arguments, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:314:18: note: candidate: template<class _U2, typename std::enable_if<_CopyMovePair<true, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, _U2>(), bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair(const _T1&, _U2&&)
        constexpr pair(const _T1& __x, _U2&& __y)
                  ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:314:18: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 2 arguments, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:307:27: note: candidate: template<class _U1, typename std::enable_if<_MoveCopyPair<false, _U1, A>(), bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair(_U1&&, const _T2&)
        explicit constexpr pair(_U1&& __x, const _T2& __y)
                           ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:307:27: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 2 arguments, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:300:18: note: candidate: template<class _U1, typename std::enable_if<_MoveCopyPair<true, _U1, A>(), bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair(_U1&&, const _T2&)
        constexpr pair(_U1&& __x, const _T2& __y)
                  ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:300:18: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 2 arguments, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:293:17: note: candidate: std::pair<_T1, _T2>::pair(std::pair<_T1, _T2>&&) [with _T1 = std::__cxx11::basic_string<char>; _T2 = A]
       constexpr pair(pair&&) = default;
                 ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:293:17: note:   candidate expects 1 argument, 0 provided
/usr/include/c++/7.3.0/bits/stl_pair.h:292:17: note: candidate: std::pair<_T1, _T2>::pair(const std::pair<_T1, _T2>&) [with _T1 = std::__cxx11::basic_string<char>; _T2 = A]
       constexpr pair(const pair&) = default;
                 ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:292:17: note:   candidate expects 1 argument, 0 provided
/usr/include/c++/7.3.0/bits/stl_pair.h:289:21: note: candidate: template<class _U1, class _U2, typename std::enable_if<(std::_PCC<((! std::is_same<std::__cxx11::basic_string<char>, _U1>::value) || (! std::is_same<A, _U2>::value)), std::__cxx11::basic_string<char>, A>::_ConstructiblePair<_U1, _U2>() && (! std::_PCC<((! std::is_same<std::__cxx11::basic_string<char>, _U1>::value) || (! std::is_same<A, _U2>::value)), std::__cxx11::basic_string<char>, A>::_ImplicitlyConvertiblePair<_U1, _U2>())), bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair(const std::pair<_U1, _U2>&)
  explicit constexpr pair(const pair<_U1, _U2>& __p)
                     ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:289:21: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 1 argument, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:280:19: note: candidate: template<class _U1, class _U2, typename std::enable_if<(std::_PCC<((! std::is_same<std::__cxx11::basic_string<char>, _U1>::value) || (! std::is_same<A, _U2>::value)), std::__cxx11::basic_string<char>, A>::_ConstructiblePair<_U1, _U2>() && std::_PCC<((! std::is_same<std::__cxx11::basic_string<char>, _U1>::value) || (! std::is_same<A, _U2>::value)), std::__cxx11::basic_string<char>, A>::_ImplicitlyConvertiblePair<_U1, _U2>()), bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair(const std::pair<_U1, _U2>&)
         constexpr pair(const pair<_U1, _U2>& __p)
                   ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:280:19: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 1 argument, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:258:26: note: candidate: template<class _U1, class _U2, typename std::enable_if<(_ConstructiblePair<_U1, _U2>() && (! _ImplicitlyConvertiblePair<_U1, _U2>())), bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair(const _T1&, const _T2&)
       explicit constexpr pair(const _T1& __a, const _T2& __b)
                          ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:258:26: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 2 arguments, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:249:17: note: candidate: template<class _U1, class _U2, typename std::enable_if<(_ConstructiblePair<_U1, _U2>() && _ImplicitlyConvertiblePair<_U1, _U2>()), bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair(const _T1&, const _T2&)
       constexpr pair(const _T1& __a, const _T2& __b)
                 ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:249:17: note:   template argument deduction/substitution failed:
test.cpp:13:31: note:   candidate expects 2 arguments, 0 provided
     std::pair<std::string, A> p;
                               ^
In file included from /usr/include/c++/7.3.0/bits/stl_algobase.h:64:0,
                 from /usr/include/c++/7.3.0/bits/char_traits.h:39,
                 from /usr/include/c++/7.3.0/string:40,
                 from test.cpp:1:
/usr/include/c++/7.3.0/bits/stl_pair.h:231:26: note: candidate: template<class _U1, class _U2, typename std::enable_if<std::__and_<std::is_default_constructible<_Tp>, std::is_default_constructible<_U2>, std::__not_<std::__and_<std::__is_implicitly_default_constructible<_U1>, std::__is_implicitly_default_constructible<_U2> > > >::value, bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair()
       explicit constexpr pair()
                          ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:231:26: note:   template argument deduction/substitution failed:
/usr/include/c++/7.3.0/bits/stl_pair.h:230:59: error: no type named ‘type’ in ‘struct std::enable_if<false, bool>’
                                    ::value, bool>::type = false>
                                                           ^~~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:230:59: note: invalid template non-type parameter
/usr/include/c++/7.3.0/bits/stl_pair.h:218:26: note: candidate: template<class _U1, class _U2, typename std::enable_if<std::__and_<std::__is_implicitly_default_constructible<_U1>, std::__is_implicitly_default_constructible<_U2> >::value, bool>::type <anonymous> > constexpr std::pair<_T1, _T2>::pair()
       _GLIBCXX_CONSTEXPR pair()
                          ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:218:26: note:   template argument deduction/substitution failed:
/usr/include/c++/7.3.0/bits/stl_pair.h:216:59: error: no type named ‘type’ in ‘struct std::enable_if<false, bool>’
                                    ::value, bool>::type = true>
                                                           ^~~~
/usr/include/c++/7.3.0/bits/stl_pair.h:216:59: note: invalid template non-type parameter

Honestly, I’m totally lost in these error messages and can’t easily find the root cause. Building it with clang 5.0.1 instead gives:

$ clang++ test.cpp
In file included from test.cpp:1:
In file included from /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/7.3.0/../../../../include/c++/7.3.0/string:40:
In file included from /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/7.3.0/../../../../include/c++/7.3.0/bits/char_traits.h:39:
In file included from /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/7.3.0/../../../../include/c++/7.3.0/bits/stl_algobase.h:64:
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/7.3.0/../../../../include/c++/7.3.0/bits/stl_pair.h:219:18: error: no matching
      constructor for initialization of 'A'
      : first(), second() { }
                 ^
test.cpp:13:31: note: in instantiation of member function 'std::pair<std::__cxx11::basic_string<char>, A>::pair'
      requested here
    std::pair<std::string, A> p;
                              ^
test.cpp:7:5: note: candidate constructor not viable: requires single argument 'a', but no arguments were provided
    A (int a) {};
    ^
test.cpp:4:7: note: candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 0 were
      provided
class A
      ^
1 error generated.

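This is very clear, and I can quickly figure out the problem: A has no default constructor, so the default constructor of std::pair<std::string, A> cannot initialize its second member. In this simple test, clang seems to generate more user-friendly error messages than gcc.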
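For completeness, here is a small sketch of one way to make the program compile: give the pair its values up front instead of default-constructing it. The value member and the makeEntry() helper are additions for illustration only and are not in the original test.cpp.

```cpp
#include <string>
#include <type_traits>
#include <utility>

class A
{
public:
    A(int a) : value(a) {}
    int value; // added for illustration so the stored argument can be checked
};

// A has a user-declared constructor taking an int, so it has no default constructor.
static_assert(!std::is_default_constructible<A>::value,
              "A is expected to have no default constructor");

// Hypothetical helper: build the pair from explicit values for both members.
std::pair<std::string, A> makeEntry(const std::string& key, int n)
{
    return std::pair<std::string, A>(key, A(n));
}
```

Whether to construct the pair with arguments like this or to give A a default constructor depends on what a "default" A should mean in the real program. The sketch only shows the first option.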

Import existing CUDA project into Nsight

The steps to import an existing CUDA project (one which uses CMake) into Nsight are as follows:

(1) Select File -> New -> CUDA C/C++ Project:

Untick “Use default location”, and select the root directory of your project.

(2) Change the Build location in Properties to point to the directory which contains the Makefile.

(3) After building successfully, right-click the project and choose Run As -> Local C/C++ Application, then select the binary you want to execute.

References:
Setting Nsight to run with existing Makefile project;
How to create Eclipse project from CMake project;
How to change make location in Eclipse.