imate
C++/CUDA Reference
CudaInterface< ArrayType > Class Template Reference

An interface to the CUDA library that facilitates working with CUDA, such as allocating memory and copying data to and from the device. All member functions of this class are public and static, so the class effectively serves as a namespace. More...

#include <cuda_interface.h>

Static Public Member Functions

static ArrayType * alloc (const LongIndexType array_size)
 	Allocates memory on the GPU device. This overload creates a new pointer and returns it.
 
static void alloc (ArrayType *&device_array, const LongIndexType array_size)
 	Allocates memory on the GPU device. This overload uses an existing, given pointer.
 
static void alloc_bytes (void *&device_array, const size_t num_bytes)
 	Allocates a raw byte buffer on the GPU device using an existing, given pointer.
 
static void copy_to_device (const ArrayType *host_array, const LongIndexType array_size, ArrayType *device_array)
 	Copies memory from the host to device memory.
 
static void del (void *device_array)
 	Deletes memory on the GPU device if its pointer is not NULL, then sets the pointer to NULL.
 
static void set_device (int device_id)
 	Sets the current device in multi-GPU applications.
 
static int get_device ()
 	Gets the current device in multi-GPU applications.
 

Detailed Description

template<typename ArrayType>
class CudaInterface< ArrayType >

An interface to the CUDA library that facilitates working with CUDA, such as allocating memory and copying data to and from the device. All member functions of this class are public and static, so the class effectively serves as a namespace.

Definition at line 36 of file cuda_interface.h.
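
Because every member function is static, the class is used without creating an instance, much like a namespace. The following is a minimal usage sketch (hypothetical caller code, not part of the library; it assumes the header above is on the include path and that LongIndexType, the project's index type, is visible through it):

#include <cuda_interface.h>

void example(const float* host_data, const LongIndexType n)
{
    // Select the first GPU (only relevant in multi-GPU applications).
    CudaInterface<float>::set_device(0);

    // Allocate a device array of n floats and copy the host data into it.
    float* device_data = CudaInterface<float>::alloc(n);
    CudaInterface<float>::copy_to_device(host_data, n, device_data);

    // ... launch kernels or cuBLAS/cuSPARSE calls on device_data ...

    // Free the device memory.
    CudaInterface<float>::del(device_data);
}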

Member Function Documentation

◆ alloc() [1/2]

template<typename ArrayType >
void CudaInterface< ArrayType >::alloc ( ArrayType *& device_array, const LongIndexType array_size )   [static]

Allocates memory on the GPU device. This overload uses an existing, given pointer.

Parameters
    [in,out]  device_array  A pointer to the device memory to be allocated.
    [in]      array_size    Size of the array to be allocated.

Definition at line 76 of file cuda_interface.cu.

{
    // Check if overflowing might make array_size negative if LongIndexType is
    // a signed type. For an unsigned type, we have no clue at this point.
    assert(array_size > 0);

    // Check that computing num_bytes will not overflow size_t
    size_t max_index = std::numeric_limits<size_t>::max();
    if (max_index / sizeof(ArrayType) < array_size)
    {
        std::cerr << "The size of array in bytes exceeds the maximum "
                  << "integer limit, which is: " << max_index << ". The "
                  << "array size is: " << array_size << ", and the size of "
                  << "data type is: " << sizeof(ArrayType) << "-bytes."
                  << std::endl;
        abort();
    }

    size_t num_bytes = static_cast<size_t>(array_size) * sizeof(ArrayType);
    cudaError_t error = cudaMalloc(&device_array, num_bytes);
    assert(error == cudaSuccess);
}

References cudaMalloc(), the project's definition of CUDA's cudaMalloc using the dynamically loaded cudart library.

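A short usage sketch of this overload (hypothetical caller code, not part of the library): the pointer is passed by reference and filled in by alloc rather than returned.

#include <cuda_interface.h>

void alloc_in_place_example(const LongIndexType n)
{
    double* device_vector = NULL;

    // alloc sets device_vector to the newly allocated device memory.
    CudaInterface<double>::alloc(device_vector, n);

    // ... use device_vector ...

    CudaInterface<double>::del(device_vector);
}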

◆ alloc() [2/2]

template<typename ArrayType >
ArrayType * CudaInterface< ArrayType >::alloc ( const LongIndexType array_size )   [static]

Allocates memory on the GPU device. This overload creates a new pointer and returns it.

Parameters
    [in]  array_size  Size of the array to be allocated.

Returns
    A pointer to the allocated 1D array on the device.

Definition at line 36 of file cuda_interface.cu.

{
    // Check if overflowing might make array_size negative if LongIndexType is
    // a signed type. For an unsigned type, we have no clue at this point.
    assert(array_size > 0);

    // Check that computing num_bytes will not overflow size_t
    size_t max_index = std::numeric_limits<size_t>::max();
    if (max_index / sizeof(ArrayType) < array_size)
    {
        std::cerr << "The size of array in bytes exceeds the maximum "
                  << "integer limit, which is: " << max_index << ". The "
                  << "array size is: " << array_size << ", and the size of "
                  << "data type is: " << sizeof(ArrayType) << "-bytes."
                  << std::endl;
        abort();
    }

    ArrayType* device_array;
    size_t num_bytes = static_cast<size_t>(array_size) * sizeof(ArrayType);
    cudaError_t error = cudaMalloc(&device_array, num_bytes);
    assert(error == cudaSuccess);

    return device_array;
}

References cudaMalloc().

Referenced by cuCSCMatrix< DataType >::copy_host_to_device(), cuCSRMatrix< DataType >::copy_host_to_device(), cuDenseMatrix< DataType >::copy_host_to_device(), cu_golub_kahn_bidiagonalization(), and cu_lanczos_tridiagonalization().

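A minimal sketch of the returning overload (hypothetical caller code): the allocated, uninitialized device pointer is obtained from the return value.

#include <cuda_interface.h>

void alloc_return_example(const LongIndexType n)
{
    // The returned pointer refers to an uninitialized array of n ints on the device.
    int* device_indices = CudaInterface<int>::alloc(n);

    // ... fill device_indices with a kernel or with copy_to_device ...

    CudaInterface<int>::del(device_indices);
}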

◆ alloc_bytes()

template<typename ArrayType >
void CudaInterface< ArrayType >::alloc_bytes ( void *& device_array, const size_t num_bytes )   [static]

Allocates a raw byte buffer on the GPU device using an existing, given pointer.

Parameters
    [in,out]  device_array  A pointer to the device memory to be allocated.
    [in]      num_bytes     Number of bytes of the array to be allocated.

Definition at line 115 of file cuda_interface.cu.

{
    // Ensure the requested number of bytes is positive; num_bytes is a size_t
    // (an unsigned type), so this assertion only catches a zero request.
    assert(num_bytes > 0);

    cudaError_t error = cudaMalloc(&device_array, num_bytes);
    assert(error == cudaSuccess);
}

References cudaMalloc().

Referenced by cuCSCMatrix< DataType >::allocate_buffer(), and cuCSRMatrix< DataType >::allocate_buffer().

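A sketch of allocating an untyped scratch buffer, such as a workspace whose size an external library reports in bytes (hypothetical caller code; the buffer size below is purely illustrative):

#include <cuda_interface.h>
#include <cstddef>

void alloc_bytes_example()
{
    void* device_buffer = NULL;
    size_t buffer_size = size_t(1) << 20;  // 1 MiB, for illustration only

    // The ArrayType template parameter does not affect the allocation here,
    // since the buffer is requested directly in bytes.
    CudaInterface<char>::alloc_bytes(device_buffer, buffer_size);

    // ... pass device_buffer to a routine that expects a scratch workspace ...

    CudaInterface<char>::del(device_buffer);
}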

◆ copy_to_device()

template<typename ArrayType >
void CudaInterface< ArrayType >::copy_to_device ( const ArrayType * host_array, const LongIndexType array_size, ArrayType * device_array )   [static]

Copies memory from the host to device memory.

Parameters
    [in]   host_array    Pointer to the 1D array memory on the host.
    [in]   array_size    The size of the array on the host.
    [out]  device_array  Pointer to the destination memory on the device.

Definition at line 142 of file cuda_interface.cu.

{
    size_t num_bytes = static_cast<size_t>(array_size) * sizeof(ArrayType);
    cudaError_t error = cudaMemcpy(device_array, host_array, num_bytes,
                                   cudaMemcpyHostToDevice);
    assert(error == cudaSuccess);
}

References cudaMemcpy(), the project's definition of CUDA's cudaMemcpy using the dynamically loaded cudart library.

Referenced by cuCSCMatrix< DataType >::copy_host_to_device(), cuCSRMatrix< DataType >::copy_host_to_device(), cuDenseMatrix< DataType >::copy_host_to_device(), cu_golub_kahn_bidiagonalization(), cu_lanczos_tridiagonalization(), and cuOrthogonalization< DataType >::orthogonalize_vectors().

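A sketch of a host-to-device transfer into an already allocated device array (hypothetical caller code; the host data lives in a std::vector):

#include <cuda_interface.h>
#include <vector>

// Copies the contents of a host vector into a device array that was
// previously allocated with at least host_array.size() elements.
void copy_example(const std::vector<float>& host_array, float* device_array)
{
    LongIndexType n = static_cast<LongIndexType>(host_array.size());
    CudaInterface<float>::copy_to_device(host_array.data(), n, device_array);
}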

◆ del()

template<typename ArrayType >
void CudaInterface< ArrayType >::del ( void * device_array )   [static]

Deletes memory on the GPU device if its pointer is not NULL, then sets the pointer to NULL.

Parameters
    [in,out]  device_array  A pointer to memory on the device to be deleted. This pointer will be set to NULL.

Definition at line 166 of file cuda_interface.cu.

{
    if (device_array != NULL)
    {
        cudaError_t error = cudaFree(device_array);
        assert(error == cudaSuccess);
        device_array = NULL;
    }
}

References cudaFree(), the project's definition of CUDA's cudaFree using the dynamically loaded cudart library.

Referenced by cuCSCMatrix< DataType >::allocate_buffer(), cuCSRMatrix< DataType >::allocate_buffer(), cu_golub_kahn_bidiagonalization(), cu_lanczos_tridiagonalization(), cuCSCMatrix< DataType >::~cuCSCMatrix(), cuCSRMatrix< DataType >::~cuCSRMatrix(), and cuDenseMatrix< DataType >::~cuDenseMatrix().

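A sketch showing that del can be called unconditionally, for instance in a destructor, since it is a no-op on a NULL pointer (hypothetical caller code):

#include <cuda_interface.h>

void del_example(const LongIndexType n)
{
    float* device_array = NULL;

    // Safe: del does nothing when the pointer is NULL.
    CudaInterface<float>::del(device_array);

    device_array = CudaInterface<float>::alloc(n);

    // Free the device memory once it is no longer needed.
    CudaInterface<float>::del(device_array);
}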

◆ get_device()

template<typename ArrayType >
int CudaInterface< ArrayType >::get_device ( )   [static]

Gets the current device in multi-GPU applications.

Returns
    device_id  The ID of the current device, a number from 0 to num_gpu_devices-1.

Definition at line 206 of file cuda_interface.cu.

{
    int device_id = -1;
    cudaError_t error = cudaGetDevice(&device_id);
    assert(error == cudaSuccess);

    return device_id;
}

References cudaGetDevice(), the project's definition of CUDA's cudaGetDevice using the dynamically loaded cudart library.

Referenced by cuAffineMatrixFunction< DataType >::_add_scaled_vector(), cuCSCMatrix< DataType >::dot(), cuCSRMatrix< DataType >::dot(), cuDenseMatrix< DataType >::dot(), cuCSCMatrix< DataType >::dot_plus(), cuCSRMatrix< DataType >::dot_plus(), cuDenseMatrix< DataType >::dot_plus(), cuLinearOperator< DataType >::get_cublas_handle(), cuCSCMatrix< DataType >::transpose_dot(), cuCSRMatrix< DataType >::transpose_dot(), cuDenseMatrix< DataType >::transpose_dot(), cuCSCMatrix< DataType >::transpose_dot_plus(), cuCSRMatrix< DataType >::transpose_dot_plus(), and cuDenseMatrix< DataType >::transpose_dot_plus().

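A sketch of saving the active device so it can be restored after temporarily switching to another one (hypothetical caller code):

#include <cuda_interface.h>

void get_device_example(const int other_device_id)
{
    // Remember which device is currently active.
    int original_device_id = CudaInterface<float>::get_device();

    // Temporarily switch to another device and do some work there.
    CudaInterface<float>::set_device(other_device_id);
    // ... allocate, copy, and compute on other_device_id ...

    // Restore the original device.
    CudaInterface<float>::set_device(original_device_id);
}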

◆ set_device()

template<typename ArrayType >
void CudaInterface< ArrayType >::set_device ( int device_id )   [static]

Sets the current device in multi-GPU applications.

Parameters
    [in]  device_id  The ID of the device to switch to, a number from 0 to num_gpu_devices-1.

Definition at line 188 of file cuda_interface.cu.

{
    cudaError_t error = cudaSetDevice(device_id);
    assert(error == cudaSuccess);
}

References cudaSetDevice(), the project's definition of CUDA's cudaSetDevice using the dynamically loaded cudart library.

Referenced by cuCSCMatrix< DataType >::copy_host_to_device(), cuCSRMatrix< DataType >::copy_host_to_device(), cuDenseMatrix< DataType >::copy_host_to_device(), cuTraceEstimator< DataType >::cu_trace_estimator(), cuLinearOperator< DataType >::initialize_cublas_handle(), cuLinearOperator< DataType >::initialize_cusparse_handle(), cuCSCMatrix< DataType >::~cuCSCMatrix(), cuCSRMatrix< DataType >::~cuCSRMatrix(), cuDenseMatrix< DataType >::~cuDenseMatrix(), and cuLinearOperator< DataType >::~cuLinearOperator().

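A sketch of a simple multi-GPU loop that selects each device in turn (hypothetical caller code; num_gpu_devices is assumed to be supplied by the caller, for example from CUDA's cudaGetDeviceCount):

#include <cuda_interface.h>

void set_device_example(const int num_gpu_devices)
{
    for (int device_id = 0; device_id < num_gpu_devices; ++device_id)
    {
        // Subsequent allocations, copies, and kernel launches on this thread
        // target the selected device.
        CudaInterface<double>::set_device(device_id);

        // ... allocate, copy, and launch work on device_id ...
    }
}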

The documentation for this class was generated from the following files:

    cuda_interface.h
    cuda_interface.cu