Switch to https when git protocol doesn’t work

For many reasons (such as a firewall), you may not be able to clone from the remote server's git port (9418 by default). For example:

# git clone git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux -b perf/core
Cloning into 'linux'...

From the captured packet:

[Packet capture]

You can see that the TCP connection is established, but then there is no response at all! Switching to the https or http protocol may save your life:

# git clone https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux -b perf/core
Cloning into 'linux'...
POST git-upload-pack (gzip 25015 to 12570 bytes)
remote: Counting objects: 5287534, done.
......

Please refer to the discussion here.
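If the git protocol is blocked permanently in your environment, you can also make the switch once and for all by telling git to rewrite git:// URLs to https:// automatically (drop --global if you only want it for a single repository):

# git config --global url."https://".insteadOf git://

After that, every clone or fetch of a git:// URL will transparently go over https.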

Use perf and FlameGraph to profile program on Linux

In most Linux environments, the perf tool should already be set up by default. Otherwise, you can install it manually. E.g., in Arch Linux:

# pacman -S perf
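On Debian or Ubuntu the package name is different; something along the following lines should work (the exact package depends on your distribution and kernel):

# apt install linux-perf                # Debian
# apt install linux-tools-$(uname -r)   # Ubuntu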

Use the following program as an example (it is a rifacimento of the code from here, and you should only focus on the framework of the code):

# cat test.cpp
#include <NTL/ZZX.h>

using namespace std;
using namespace NTL;

void inner(int i, ZZX& t, Vec<ZZX>& phi)
{
        for (long j = 1; j <= i-1; j++)
                if (i % j == 0)
                        t *= phi(j);
}

void outer(int i, Vec<ZZX>& phi)
{
        ZZX t;
        t = 1;
        inner(i, t, phi);
        phi(i) = (ZZX(INIT_MONO, i) - 1)/t;
        cout << phi(i) << "\n";
}

int main()
{
   Vec<ZZX> phi(INIT_SIZE, 100);

   for (long i = 1; i <= phi.length(); i++) {
      outer(i, phi);
   }
}

Compile it:

# g++ -g -O2 -pthread test.cpp -lntl -lgmp

It is suggested to use the -g -O2 options, since -g provides the debug information which perf needs, while -O2 enables a lot of optimizations.

Use perf record to sample the program:

# perf record --call-graph dwarf ./a.out
......
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.318 MB perf.data (38 samples) ]
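If the program you care about is already running, perf record can attach to it by PID instead of launching it; the PID and the 10-second sampling window below are only placeholders:

# perf record --call-graph dwarf -p 12345 -- sleep 10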

In both cases, a perf.data file will be generated in the current directory, and you can use the perf report command to parse it:

# perf report

Detailed information about every function will be shown:

[perf report output]

Another awesome tool is FlameGraph, which is used to analyze call stack traces:

# git clone --depth 1 https://github.com/brendangregg/FlameGraph
# cd FlameGraph

Copy perf.data into the current directory:

# cp ../perf.data ./

Execute the following command:

# perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > perf.svg

The generated perf.svg looks like this:

[Flame graph]

You can see the whole call stack hierarchy and the proportion of time consumed by each function.
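Since stackcollapse-perf.pl emits one line per folded stack, you can also grep the folded output to zoom in on a single function before rendering; inner below is just the function from the example program above:

# perf script | ./stackcollapse-perf.pl | grep inner | ./flamegraph.pl > inner.svg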

P.S., the full code is here.

Use “.cu” as the file extension when playing with Thrust

Today, I tried a simple Thrust program:

$ cat a.c
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <iostream>

int main(void) {
        // H has storage for 4 integers
        thrust::host_vector<int> H(4);

        // initialize individual elements
        H[0] = 14;
        H[1] = 20;
        H[2] = 38;
        H[3] = 46;

        // H.size() returns the size of vector H
        std::cout << "H has size " << H.size() << std::endl;

        // print contents of H
        for(int i = 0; i < H.size(); i++)
                std::cout << "H[" << i << "] = " << H[i] << std::endl;

        // resize H
        H.resize(2);
        std::cout << "H now has size " << H.size() << std::endl;

        // Copy host_vector H to device_vector D
        thrust::device_vector<int> D = H;

        // elements of D can be modified
        D[0] = 99;
        D[1] = 88;

        // print contents of D
        for(int i = 0; i < D.size(); i++)
                std::cout << "D[" << i << "] = " << D[i] << std::endl;

        // H and D are automatically deleted when the function returns
        return 0;
}

Built it:

$ nvcc -arch=sm_37 a.c
In file included from a.c:1:0:
/opt/cuda/bin/..//include/thrust/host_vector.h:25:18: fatal error: memory: No such file or directory
compilation terminated.

It seemed very weird! After scanning Thrust’s FAQ, I came across the following tip:

Make sure that files that #include Thrust have a .cu extension. Other extensions (e.g., .cpp) will cause nvcc to treat the file incorrectly and produce an error message.

Renamed the source file and rebuilt it:

$ mv a.c a.cu
$ nvcc -arch=sm_37 a.cu
$ ./a.out
H has size 4
H[0] = 14
H[1] = 20
H[2] = 38
H[3] = 46
H now has size 2
D[0] = 99
D[1] = 88

Worked like a charm!
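By the way, if renaming the file is not an option, nvcc also has a -x flag to force the input language; the following should be equivalent, although I still prefer the rename recommended by the FAQ:

$ nvcc -arch=sm_37 -x cu a.c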

Use Source Insight as the editor to develop Unix software

Source Insight is my favorite editor, and I have used it for more than 10 years. But when employing it to develop Unix software, you will run into an annoying line break issue: on Windows the newline is \r\n, while on Unix it is \n only. Therefore, a file edited in Source Insight will display an extra ^M in a Unix environment:

#include <stdio.h>^M
^M
int main(void)^M
{^M
        printf("\r\n");^M
}^M

To resolve this problem, you can refer to this topic on Stack Overflow:

To save a file with a specific end-of-line type in Source Insight, select File -> Save As…, then where it says “Save as type”, select the desired end-of-line type.

To set the end-of-line type for new files you create in Source Insight, select Options -> Preferences and click the Files tab. Where it says “Default file format” select the desired end-of-line type.

So you can set the default file format to Unix as you want:

[Source Insight default file format settings]
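For files that have already been saved with \r\n endings, a one-time conversion on the Unix side is enough; dos2unix, or an equivalent sed one-liner, will do (test.c is just a placeholder for the affected file, and either command works):

dos2unix test.c
sed -i 's/\r$//' test.c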

Another caveat you should pay attention to: if you use the git Windows client, by default it will convert the project’s newlines from \n to \r\n. My solution is simply to disable this auto-conversion feature:

git config --global core.autocrlf false
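Alternatively, you can pin the line endings inside the repository itself with a .gitattributes file, which takes effect regardless of each developer’s local autocrlf setting. A minimal sketch (adjust the patterns to your own project):

# .gitattributes at the top level of the repository
*.c    text eol=lf
*.h    text eol=lf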

Use the “-g -O2” options when employing gcc to compile your project

Honestly, I seldom used the “-g -O2” options before when employing gcc to compile projects, since optimization makes the source code and the generated instructions inconsistent, which is annoying during debugging. But in the past half year my work has involved a lot of encryption & decryption jobs, which are all compute-intensive tasks, and I have found that the -O2 option can really give a big improvement in performance.

For example, when the project is compiled with only the -g option, the whole computation flow lasts more than 30 minutes, but after switching to the “-g -O2” options the time is reduced to less than 3 minutes, a more than 10x performance improvement.

So when you care about your program’s performance, you should try the “-g -O2” options: -g enlarges the executable file but does not make it run slower, and once the program crashes it provides enough debug information; -O2 is the “best safe optimization” level. Be aware, though, that you may run into some tricky bugs which only occur in optimized code.
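If you want a quick feel for the difference, a toy program like the one below (just a hot integer loop timed with std::chrono; it has nothing to do with my encryption project, and the numbers you get will vary) can be compiled once with -g and once with “-g -O2”:

// bench.cpp: a tiny compute-bound loop for comparing -g against -g -O2
#include <chrono>
#include <cstdint>
#include <iostream>

int main()
{
        const std::uint64_t n = 200000000ULL;
        std::uint64_t acc = 1;

        auto start = std::chrono::steady_clock::now();
        for (std::uint64_t i = 1; i <= n; i++) {
                // cheap integer work; printing acc later keeps the
                // compiler from optimizing the loop away
                acc = acc * 6364136223846793005ULL + i;
        }
        auto stop = std::chrono::steady_clock::now();

        auto ms = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
        std::cout << "checksum = " << acc << ", elapsed = " << ms << " ms\n";
        return 0;
}

Build and run both variants, then compare the elapsed times:

# g++ -g bench.cpp -o bench_debug && ./bench_debug
# g++ -g -O2 bench.cpp -o bench_opt && ./bench_opt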

Hope you can try and enjoy it!

References:
Using -g and -O2 options in gcc;
Is a program compiled with -g gcc flag slower than the same program compiled without -g?.