nvidia-smi doesn’t display GPU devices in order

Today I found that nvidia-smi doesn’t list GPU devices in the same order as CUDA programs do:

$ nvidia-smi -L
GPU 0: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 1: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 2: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 3: Tesla V100-PCIE-16GB (UUID: GPU-...)

According to the output above, the V100’s device ID is 3, while CUDA-Z shows the V100’s device ID as 0:

[Screenshot: CUDA-Z showing the Tesla V100 as device 0]

My own lscuda also displays it as device 0:

$ ./lscuda
CUDA Runtime Version:             9.1
CUDA Driver Version:              9.1
GPU(s):                           4
GPU Device ID:                    0
  Name:                           Tesla V100-PCIE-16GB
......
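
A likely explanation, which the outputs above do not prove by themselves, is the enumeration order: nvidia-smi lists devices in PCI bus order, while the CUDA runtime by default uses a fastest-first order controlled by the CUDA_DEVICE_ORDER environment variable. The following is a minimal sketch, not the lscuda source, that prints each CUDA device index next to its PCI bus ID so the two numberings can be matched. The helper names formatDeviceLine and check are hypothetical.

```cuda
#include <cstdio>
#include <cstdlib>
#include <string>
#include <cuda_runtime.h>

// Hypothetical helper: builds one report line for a device.
static std::string formatDeviceLine(int dev, const char *name, const char *busId)
{
    return "CUDA device " + std::to_string(dev) + ": " + name + " (PCI " + busId + ")";
}

// Hypothetical helper: aborts with a message if a CUDA call failed.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

int main(void)
{
    int count = 0;
    check(cudaGetDeviceCount(&count), "cudaGetDeviceCount");

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        check(cudaGetDeviceProperties(&prop, dev), "cudaGetDeviceProperties");

        // The PCI bus ID identifies the physical card regardless of index order.
        char busId[32];
        check(cudaDeviceGetPCIBusId(busId, static_cast<int>(sizeof(busId)), dev),
              "cudaDeviceGetPCIBusId");

        std::puts(formatDeviceLine(dev, prop.name, busId).c_str());
    }
    return EXIT_SUCCESS;
}
```

Running the program once as-is and once with CUDA_DEVICE_ORDER=PCI_BUS_ID set in the environment should show whether the V100 moves from index 0 to the index that nvidia-smi reports.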

 

Ability must be ageless

Yesterday I came across an impressive video, “Ability is ageless”, and I want to share some thoughts here.

I don’t know about other countries, but in China ageism does exist to some degree in software companies. I have read stories and news reports about older engineers who were laid off without convincing reasons. Employers think older employees have families and want more work-life balance, so they won’t work overtime without complaint the way fresh graduates do. Furthermore, older engineers are considered harder to manage than new graduates. I even read a job description that said, in effect: we don’t welcome applicants older than 30, since they will lack innovation.

As for myself, I am 35 years old. Ten years ago, in January 2008, I left school and got my first full-time job. In fact, I work no fewer hours per week now than I did 10 years ago. The experience accumulated over the past 10 years is very precious, and I can utilize it to help younger colleagues. Besides this, I haven’t come to a standstill: I keep getting my hands dirty in areas of computer science I hadn’t touched before, write blogs and tutorials, and actively take part in technology meetups and conferences. At least in my opinion, I have become more mature and valuable with age.

Ability must be ageless; that is what I want to say.

The “invalid argument” error of cudaMemcpyAsync

Consider the following CUDA code:

#include <iostream>

#define cudaSafeCall(call)  \
        do {\
            cudaError_t err = call;\
            if (cudaSuccess != err) \
            {\
                std::cerr << "CUDA error in " << __FILE__ << "(" << __LINE__ << "): " \
                    << cudaGetErrorString(err) << std::endl;\
                exit(EXIT_FAILURE);\
            }\
        } while(0)

int main(void)
{
    char *a, *d_a;
    cudaStream_t st;
    cudaSafeCall(cudaStreamCreate(&st));
    cudaSafeCall(cudaMallocHost(&a, 10));   // 10 bytes of pinned host memory
    cudaSafeCall(cudaMalloc(&d_a, 4));      // only 4 bytes of device memory
    cudaSafeCall(cudaMemcpyAsync(d_a, a, 10, cudaMemcpyHostToDevice, st));  // 10 > 4
    return 0;
}

d_a has only 4 bytes allocated, but the call asks to copy 10 bytes. In this case, cudaMemcpyAsync fails with an “invalid argument” error:

$ nvcc test.cu
$ ./a.out
CUDA error in test.cu(21): invalid argument
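
One way to avoid this class of error, sketched here under the assumption that the caller tracks the size of both allocations, is to clamp the byte count to the smaller buffer before issuing the copy. The helper names safeCopySize and check are hypothetical and not part of the CUDA API.

```cuda
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper: the number of bytes that fit in both buffers.
static size_t safeCopySize(size_t srcBytes, size_t dstBytes)
{
    return std::min(srcBytes, dstBytes);
}

// Hypothetical helper: aborts with a message if a CUDA call failed.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

int main(void)
{
    const size_t hostBytes = 10;  // same sizes as in the failing example
    const size_t devBytes = 4;
    char *a = nullptr, *d_a = nullptr;
    cudaStream_t st;

    check(cudaStreamCreate(&st), "cudaStreamCreate");
    check(cudaMallocHost(&a, hostBytes), "cudaMallocHost");
    check(cudaMalloc(&d_a, devBytes), "cudaMalloc");
    std::memset(a, 0, hostBytes);  // give the source buffer defined contents

    // Copy no more than the destination allocation can hold.
    const size_t n = safeCopySize(hostBytes, devBytes);
    check(cudaMemcpyAsync(d_a, a, n, cudaMemcpyHostToDevice, st), "cudaMemcpyAsync");
    check(cudaStreamSynchronize(st), "cudaStreamSynchronize");

    check(cudaFree(d_a), "cudaFree");
    check(cudaFreeHost(a), "cudaFreeHost");
    check(cudaStreamDestroy(st), "cudaStreamDestroy");
    return EXIT_SUCCESS;
}
```

Clamping silently truncates the transfer, so in real code it may be preferable to treat a size mismatch as an error rather than copy fewer bytes than requested.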