Function application in Haskell

As a Haskell newbie, I find that life becomes easier once I understand function application:

(1) Function application is simply a “function call”. For example, define a simple add function that returns the sum of two numbers:

# cat add.hs
add :: Num a => a -> a -> a
add a b = a + b

Load it in ghci, and call this function:

# ghci
GHCi, version 8.2.2: http://www.haskell.org/ghc/  :? for help
Prelude> :l add
[1 of 1] Compiling Main             ( add.hs, interpreted )
Ok, one module loaded.
*Main> add 2 4
6
*Main> add 3 6
9

Note that the tokens in a function application are separated by spaces. So once you see the following form:

a b ..

you know it is a function application, that is, a “function call”.

(2) Function application has higher precedence than any infix operator. Check the following example:

*Main> add 1 2 ^ add 1 2
27

It is parsed as “(add 1 2) ^ (add 1 2)”, that is, 3 ^ 3.
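
To see the same rule outside ghci, here is a minimal sketch. It reuses add from add.hs; the names implicit, explicit and timesThree are made up for this illustration and are not part of any library.

```haskell
-- Same definition as in add.hs.
add :: Num a => a -> a -> a
add a b = a + b

-- Function application binds tighter than any infix operator,
-- so both operands of (^) are complete calls to add.
implicit :: Integer
implicit = add 1 2 ^ add 1 2

-- The same expression with the grouping written out.
explicit :: Integer
explicit = (add 1 2) ^ (add 1 2)

-- Application also wins over (*): this is (add 1 2) * 3, not add 1 (2 * 3).
timesThree :: Integer
timesThree = add 1 2 * 3
```

Loading this file in ghci and evaluating implicit and explicit should give the same value, and timesThree should be 9 rather than 7.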

(3) The $ operator is the “application operator”. It is right associative and has the lowest precedence (the Prelude declares it as infixr 0). Check the following instance:

*Main> add 1 $ add 2 $ add 3 4
10

The $ operator divides the expression into 3 parts: “add 1”, “add 2” and “add 3 4”. Because $ is right associative, the result of add 3 4 is fed into the add 2 function first; then the result of add 2 $ add 3 4 is passed into add 1. The expression is therefore equal to “add 1 (add 2 (add 3 4))”, so $ can be used to remove parentheses.
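
As a small sketch of this, the following file compares the $ form with the parenthesized form, and also shows what “lowest precedence” means next to an ordinary operator such as *. The names withDollar, withParens, dollarThenTimes and timesWithoutDollar are hypothetical, chosen only for this example.

```haskell
-- Same definition as in add.hs.
add :: Num a => a -> a -> a
add a b = a + b

-- Right associativity: add 1 $ (add 2 $ (add 3 4)).
withDollar :: Integer
withDollar = add 1 $ add 2 $ add 3 4

-- The equivalent expression written with parentheses.
withParens :: Integer
withParens = add 1 (add 2 (add 3 4))

-- ($) binds more loosely than (*), so everything to its right
-- is evaluated first: add 1 (2 * 3).
dollarThenTimes :: Integer
dollarThenTimes = add 1 $ 2 * 3

-- Without ($), application binds first: (add 1 2) * 3.
timesWithoutDollar :: Integer
timesWithoutDollar = add 1 2 * 3
```

Here withDollar and withParens both evaluate to 10, while dollarThenTimes is 7 and timesWithoutDollar is 9.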

Porting software is fun and rewarding

When it comes to porting software, I think there are several kinds:

a) In the simplest case, a tool was created for Linux, and you want to use it on FreeBSD. Because there is no out-of-the-box package for that operating system, you grab the code and compile it yourself, with no complaints from the compiler. You run it and it seems to work. Bingo! This is the perfect experience!

b) Life would be pleasant if everything went like the case above, but in reality it definitely does not. Sometimes things don’t go so smoothly. Take socket programming as an example: Solaris has some specific requirements that may surprise you if you are only familiar with the Linux environment (please check this post). In this scenario you may need to tweak the compiler options and even customize your code to fit your requirements.

c) In the third case, you need to read the whole source code and modify it, and this is what I am currently doing. This Monday I received a task to verify a concept. I remembered an open source framework that implements a similar function, so I downloaded it and went through the code carefully. Fortunately, this project does satisfy our requirement, but since our computation environment is Nvidia GPUs, I need to replace the related code with CUDA APIs, in addition to integrating the framework into our code repository. If nothing unexpected happens, I think I can finish the whole work next week.

From my personal experience, porting software is really rewarding! Take this week’s work as an example: I learnt a new C++ library and refreshed my knowledge of graph data structures. Furthermore, porting software can also be fun: after several hours or even days of hard work, you have a bespoke tool that meets your requirement, and that feels very fulfilling!

Finally, I must state that I am not encouraging you to be lazy and stop thinking through problems yourself; rather, you should make sensible use of existing resources. Moreover, please comply with the software license, and don’t violate it.

Enjoy porting!

nvidia-smi doesn’t display GPU devices in order

Today I found that the nvidia-smi program doesn’t display GPU devices in order:

$ nvidia-smi -L
GPU 0: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 1: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 2: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 3: Tesla V100-PCIE-16GB (UUID: GPU-...)

The above output shows the V100’s device ID as 3, while CUDA-Z shows the V100’s device ID as 0:

[Screenshot: CUDA-Z showing the Tesla V100 as device 0]

My own lscuda also displays it as device 0:

$ ./lscuda
CUDA Runtime Version:             9.1
CUDA Driver Version:              9.1
GPU(s):                           4
GPU Device ID:                    0
  Name:                           Tesla V100-PCIE-16GB
......

A likely explanation: nvidia-smi enumerates devices in PCI bus order, while the CUDA runtime by default orders them with the fastest device first (CUDA_DEVICE_ORDER=FASTEST_FIRST). Setting the environment variable CUDA_DEVICE_ORDER=PCI_BUS_ID before running a CUDA program should make its numbering match nvidia-smi.

 

Ability must be ageless

Yesterday I came across an impressive video, Ability is ageless, and I want to share some thoughts here.

I don’t know about other countries, but in China ageism does exist, to some degree, in software companies. I have read stories and news about older engineers who were laid off without convincing reasons. Employers think older employees have families and want more work-life balance, so they won’t work overtime without complaint like fresh graduates. Furthermore, older engineers are considered harder to manage than new graduates. I even read a job description that said something like this: we don’t welcome applicants who are older than 30, since they lack innovation.

As for myself, I am 35 years old. 10 years ago, in January 2008, I left school and got my first full-time job. In fact, I don’t work fewer hours per week now than I did 10 years ago. The experience accumulated over the past 10 years is truly precious, and I can utilize it to help younger colleagues. Besides this, I haven’t come to a standstill: I keep getting my hands dirty in areas of computer science that I hadn’t touched before, write blogs and tutorials, and actively take part in technology meetups and conferences. At least in my opinion, I have become more mature and valuable with age.

Ability must be ageless: that is what I want to say.

The “invalid argument” error of cudaMemcpyAsync

Check the following CUDA code:

#include <iostream>

#define cudaSafeCall(call)  \
        do {\
            cudaError_t err = call;\
            if (cudaSuccess != err) \
            {\
                std::cerr << "CUDA error in " << __FILE__ << "(" << __LINE__ << "): " \
                    << cudaGetErrorString(err) << std::endl;\
                exit(EXIT_FAILURE);\
            }\
        } while(0)

int main(void)
{
    char *a, *d_a;
    cudaStream_t st;
    cudaSafeCall(cudaStreamCreate(&st));
    cudaSafeCall(cudaMallocHost(&a, 10));
    cudaSafeCall(cudaMalloc(&d_a, 4));
    cudaSafeCall(cudaMemcpyAsync(d_a, a, 10, cudaMemcpyHostToDevice, st));
    return 0;
}

d_a is allocated only 4 bytes, but we try to copy 10 bytes. In this case, cudaMemcpyAsync fails with an “invalid argument” error:

$ nvcc test.cu
$ ./a.out
CUDA error in test.cu(21): invalid argument