nvidia-smi doesn’t display GPU devices in CUDA order

Today I found that nvidia-smi doesn’t list GPU devices in the same order as CUDA programs do:

$ nvidia-smi -L
GPU 0: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 1: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 2: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 3: Tesla V100-PCIE-16GB (UUID: GPU-...)

The output above shows the V100 as device 3, while CUDA-Z shows the V100 as device 0:

[Screenshot: CUDA-Z listing the Tesla V100 as device 0]

My own lscuda also displays it as device 0:

$ ./lscuda
CUDA Runtime Version:             9.1
CUDA Driver Version:              9.1
GPU(s):                           4
GPU Device ID:                    0
  Name:                           Tesla V100-PCIE-16GB
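......

A likely explanation is that the two tools enumerate devices differently: nvidia-smi lists GPUs in PCI bus order, while the CUDA runtime defaults to a fastest-first order, which would put the V100 at device 0. Setting the environment variable CUDA_DEVICE_ORDER=PCI_BUS_ID makes the CUDA runtime use PCI bus order as well.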

 

The “invalid argument” error of cudaMemcpyAsync

Consider the following CUDA code:

#include <iostream>

#define cudaSafeCall(call)  \
        do {\
            cudaError_t err = call;\
            if (cudaSuccess != err) \
            {\
                std::cerr << "CUDA error in " << __FILE__ << "(" << __LINE__ << "): " \
                    << cudaGetErrorString(err) << std::endl;\
                exit(EXIT_FAILURE);\
            }\
        } while(0)

int main(void)
{
    char *a, *d_a;
    cudaStream_t st;
    cudaSafeCall(cudaStreamCreate(&st));
    cudaSafeCall(cudaMallocHost(&a, 10));
    cudaSafeCall(cudaMalloc(&d_a, 4));
    cudaSafeCall(cudaMemcpyAsync(a, d_a, 10, cudaMemcpyHostToDevice, st));
    return 0;
}

d_a is allocated with only 4 bytes, but the call asks to copy 10 bytes, so the requested range runs past the end of the device allocation. In this case, cudaMemcpyAsync fails with an “invalid argument” error:

$ nvcc test.cu
$ ./a.out
CUDA error in test.cu(21): invalid argument
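
Since “invalid argument” does not say which argument is wrong, it can help to check the copy size against both allocation sizes before issuing the copy. Below is a minimal host-only C++ sketch of such a guard. SizedBuffer, copy_fits and require_copy_fits are hypothetical names introduced for illustration (they are not part of the CUDA API), and the caller is assumed to record the size passed to each allocation call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper type: a raw pointer paired with the size that was
// requested when it was allocated (e.g. via cudaMalloc or cudaMallocHost).
struct SizedBuffer {
    void*       ptr;
    std::size_t bytes;
};

// Returns true when a copy of `count` bytes stays inside both allocations.
bool copy_fits(const SizedBuffer& dst, const SizedBuffer& src, std::size_t count) {
    return count <= std::min(dst.bytes, src.bytes);
}

// Debug-build guard intended to be called right before the real copy.
// Compiled out when NDEBUG is defined.
void require_copy_fits(const SizedBuffer& dst, const SizedBuffer& src, std::size_t count) {
    assert(copy_fits(dst, src, count) && "copy size exceeds an allocation");
}
```

With the buffers from the example above (10 bytes on the host, 4 bytes on the device), copy_fits returns false for a 10-byte copy and true for a 4-byte copy, so the mistake is caught with a more specific message before the CUDA call is made.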

Pay attention to upper/lower case letters in Haskell functions

As a Haskell beginner, I find that paying attention to upper and lower case letters helps me a lot in understanding functions:

(1) A function name doesn’t begin with an upper case letter: it can begin with a lower case letter (e.g., sqrt) or consist of special characters (e.g., +).

(2) A function has a type. Let’s define a new function, incInt:

incInt :: Integer -> Integer
incInt a = a + 1

The first line declares that incInt’s type is “Integer -> Integer”: Integer is a type in Haskell, and types begin with an upper case letter. Check another built-in function, sqrt:

Prelude> :t sqrt
sqrt :: Floating a => a -> a

The “Floating a” to the left of => is called a type constraint. Floating is a typeclass, and its instances are the types that satisfy it (the Floating typeclass includes both Float and Double). a is a type variable that can stand for any type belonging to the Floating typeclass.

Takeaway:
a) A function name can’t begin with an upper case letter.
b) Types, typeclasses, and type variables occur in a function’s type signature. Types and typeclasses begin with an upper case letter, and type variables must begin with a lower case letter.

Set up Haskell development environment on Arch Linux

Following this post, install stack first:

# pacman -S stack
resolving dependencies...
looking for conflicting packages...
......

At the end of the installation, it prints the following message:

You need to either 1) install latest stable ghc package from [extra] or 2) install ncurses5compat-libs from AUR for the prebuilt binaries installed by stack to work.

So install ghc as well:

# pacman -S ghc
resolving dependencies...
looking for conflicting packages...

Now that the environment is ready, edit your ~/.stack/config.yaml file:

# This file contains default non-project-specific settings for 'stack', used
# in all projects.  For more information about stack's configuration, see
# http://docs.haskellstack.org/en/stable/yaml_configuration/

# The following parameters are used by "stack new" to automatically fill fields
# in the cabal config. We recommend uncommenting them and filling them out if
# you intend to use 'stack new'.
# See https://docs.haskellstack.org/en/stable/yaml_configuration/#templates
templates:
  params:
#    author-name:
#    author-email:
#    copyright:
#    github-username:

For example, uncomment and fill in author-name, author-email, and so on.

Then follow this link to create a simple program:

# stack new hello
# cd hello
# stack setup
# stack build
# stack exec hello-exe

You will see “someFunc” printed.

BTW, you can also use the ghc compiler directly. For example, write a “Hello World” program (the reference is here):

# cat hello.hs
main :: IO ()
main = putStrLn "Hello World!"
# ghc -dynamic hello.hs
# ./hello
Hello World!

That’s all!

Be careful of thread stack size

Today, my colleague came across a core dump caused by a thread stack overflow:

[Screenshot: the core dump]

From the screenshot above, we can see that this function’s stack frame alone occupies about 7 MiB. Check the stack size limit configured on the system:

$ ulimit -S -s
8192

The soft limit is just 8 MiB (ulimit reports the value in KiB). Double the stack size:

$ ulimit -S -s 16384

With the larger limit, the program no longer crashes.
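
An alternative to raising the limit for the whole shell is to request a larger stack only for the thread that needs it. The following is a minimal C++ sketch using POSIX threads, assuming a Linux system. big_frame_worker is a hypothetical stand-in for the real function with the ~7 MiB frame, and soft_stack_limit_bytes and run_with_stack are illustrative helper names, not existing APIs.

```cpp
#include <pthread.h>
#include <sys/resource.h>
#include <cassert>
#include <cstddef>

// Soft stack limit in bytes (what `ulimit -S -s` reports in KiB).
// Returns 0 if the limit cannot be read; RLIM_INFINITY means "unlimited".
rlim_t soft_stack_limit_bytes() {
    struct rlimit rl{};
    if (getrlimit(RLIMIT_STACK, &rl) != 0) return 0;
    return rl.rlim_cur;
}

// Hypothetical stand-in for the function whose frame needs about 7 MiB.
// `arg` must point to a bool that receives the result.
void* big_frame_worker(void* arg) {
    constexpr std::size_t kFrameBytes = 7u * 1024 * 1024;
    volatile char buf[kFrameBytes];
    // Touch one byte per 4 KiB page so the stack memory is really used.
    for (std::size_t i = 0; i < kFrameBytes; i += 4096) buf[i] = 1;
    buf[kFrameBytes - 1] = 1;
    *static_cast<bool*>(arg) = (buf[0] == 1 && buf[kFrameBytes - 1] == 1);
    return nullptr;
}

// Runs `fn(arg)` on a new thread with an explicit stack size and waits
// for it to finish. Returns true if the thread was created and joined.
bool run_with_stack(std::size_t stack_bytes, void* (*fn)(void*), void* arg) {
    assert(fn != nullptr);
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) return false;
    bool ok = pthread_attr_setstacksize(&attr, stack_bytes) == 0;
    pthread_t tid;
    ok = ok && pthread_create(&tid, &attr, fn, arg) == 0;
    pthread_attr_destroy(&attr);
    return ok && pthread_join(tid, nullptr) == 0;
}
```

For example, run_with_stack(16u * 1024 * 1024, big_frame_worker, &flag) gives the worker a 16 MiB stack regardless of the shell’s ulimit setting. The trade-off is that the size is fixed in the code, while ulimit can be adjusted without rebuilding.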

Reference:
General: How do I change my default limits for stack size, core file size, etc.?