nvidia-smi doesn’t display GPU devices in order

Today I found that the nvidia-smi program doesn’t display GPU devices in order:

$ nvidia-smi -L
GPU 0: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 1: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 2: Tesla P100-PCIE-16GB (UUID: GPU-...)
GPU 3: Tesla V100-PCIE-16GB (UUID: GPU-...)

The output above shows the V100’s device ID as 3, while CUDA-Z shows the V100’s device ID as 0:

[Screenshot: CUDA-Z listing the Tesla V100 as device 0]

My own lscuda also displays it as device 0:

$ ./lscuda
CUDA Runtime Version:             9.1
CUDA Driver Version:              9.1
GPU(s):                           4
GPU Device ID:                    0
  Name:                           Tesla V100-PCIE-16GB
......

 

2 thoughts on “nvidia-smi doesn’t display GPU devices in order”

  1. That is deliberate. nvidia-smi uses the order in which GPUs are registered with the driver at boot time. CUDA, on the other hand, uses an order (for historical reasons) in which ID 0 is the best compute GPU in the system, so the two orderings can differ. If you want a more consistent order in CUDA too, you can, for example, run “export CUDA_DEVICE_ORDER=PCI_BUS_ID” or use the device UUIDs to match devices between the nvidia-smi order and the CUDA order. See here for more discussion: https://stackoverflow.com/questions/13781738/how-does-cuda-assign-device-ids-to-gpus
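The UUID-matching idea from the comment above could be sketched roughly as follows. This is only an illustration, not part of nvidia-smi or CUDA: the helper names parse_nvidia_smi_list and match_cuda_to_smi are hypothetical, the UUIDs in the demo are made up, and collecting the UUIDs in CUDA device order is left to the caller. The parser assumes the "GPU N: name (UUID: ...)" line format shown in the nvidia-smi -L output earlier in this post.

```python
import re

# Matches lines such as "GPU 3: Tesla V100-PCIE-16GB (UUID: GPU-...)".
# Assumption: this is the format printed by `nvidia-smi -L`, as shown in the post.
_LINE_RE = re.compile(r"^GPU (\d+): (.+?) \(UUID: (.+?)\)\s*$")


def parse_nvidia_smi_list(text):
    """Return {uuid: (nvidia_smi_index, name)} parsed from `nvidia-smi -L` output."""
    devices = {}
    for line in text.splitlines():
        match = _LINE_RE.match(line.strip())
        if match:  # lines in any other format are skipped
            index, name, uuid = match.groups()
            devices[uuid] = (int(index), name)
    return devices


def match_cuda_to_smi(cuda_uuids, smi_devices):
    """Map each CUDA device ID to its nvidia-smi index via the shared UUID.

    cuda_uuids: list of UUID strings where the list position is the CUDA
    device ID (how the caller obtains these is outside this sketch).
    """
    return {
        cuda_id: smi_devices[uuid][0]
        for cuda_id, uuid in enumerate(cuda_uuids)
        if uuid in smi_devices
    }


if __name__ == "__main__":
    # Made-up UUIDs for illustration; real ones come from `nvidia-smi -L`.
    sample = (
        "GPU 0: Tesla P100-PCIE-16GB (UUID: GPU-aaaa)\n"
        "GPU 3: Tesla V100-PCIE-16GB (UUID: GPU-dddd)\n"
    )
    smi = parse_nvidia_smi_list(sample)
    # The V100 is CUDA device 0 but nvidia-smi index 3, as in the post.
    print(match_cuda_to_smi(["GPU-dddd"], smi))  # prints {0: 3}
```

If setting CUDA_DEVICE_ORDER=PCI_BUS_ID is an option, that is the simpler route. Matching by UUID is mainly useful when the environment of the CUDA process cannot be changed.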
