这篇笔记摘自Professional CUDA C Programming:
CUDA provides two API levels for managing the GPU device and organizing threads:
➤ CUDA Driver API
➤ CUDA Runtime API
The driver API is a low-level API and is relatively hard to program, but it provides more control over how the GPU device is used. The runtime API is a higher-level API implemented on top of the driver API. Each function of the runtime API is broken down into more basic operations issued to the driver API.There is no noticeable performance difference between the runtime and driver APIs. How your kernels use memory and how you organize your threads on the device have a much more pronounced effect.
These two APIs are mutually exclusive. You must use one or the other, but it is not possible to mix function calls from both.