imate
C++/CUDA Reference
cuCSRMatrix< DataType > Class Template Reference

Container for CSR matrices.

#include <cu_csr_matrix.h>

Public Member Functions

 cuCSRMatrix ()
 Default constructor.
 
 cuCSRMatrix (const DataType *A_data_, const LongIndexType *A_indices_, const LongIndexType *A_index_pointer_, const LongIndexType num_rows_, const LongIndexType num_columns_, const FlagType A_is_symmetric_, const int num_gpu_devices_)
 Constructor.
 
virtual ~cuCSRMatrix ()
 Destructor.
 
virtual FlagType is_identity_matrix () const
 Checks whether the matrix is identity.
 
LongIndexType get_nnz () const
 Returns the number of non-zero elements of the sparse matrix.
 
virtual void dot (const DataType *device_vector, DataType *device_product)
 Matrix vector product.
 
virtual void dot_plus (const DataType *device_vector, const DataType alpha, DataType *device_product)
 Matrix vector product written in place.
 
virtual void transpose_dot (const DataType *device_vector, DataType *device_product)
 Transposed-matrix vector product.
 
virtual void transpose_dot_plus (const DataType *device_vector, const DataType alpha, DataType *device_product)
 Transposed-matrix vector product written in place.
 
- Public Member Functions inherited from cuMatrix< DataType >
 cuMatrix ()
 Default constructor.
 
 cuMatrix (const FlagType A_is_symmetric_)
 Constructor.
 
virtual ~cuMatrix ()
 Destructor.
 
DataType get_eigenvalue (const DataType *known_parameters, const DataType known_eigenvalue, const DataType *inquiry_parameters) const
 Implements the pure virtual function of the base class. In this class, this function has no use; it is implemented only so that the class can be instantiated.
 
virtual void set_symmetry (const FlagType symmetric)
 Specifies whether the matrix is symmetric or non-symmetric.
 
- Public Member Functions inherited from cuLinearOperator< DataType >
 cuLinearOperator ()
 Default constructor.
 
 cuLinearOperator (const int num_gpu_devices_)
 Constructor with setting num_gpu_devices.
 
virtual ~cuLinearOperator ()
 Destructor.
 
cublasHandle_t get_cublas_handle () const
 This function returns a reference to the cublasHandle_t object. The object will be created, if it is not created already.
 
void set_parameters (DataType *parameters_)
 Sets the scalar parameter this->parameters. Parameter is initialized to NULL. However, before calling dot or transpose_dot functions, the parameters must be set.
 
- Public Member Functions inherited from cLinearOperatorBase
 cLinearOperatorBase ()
 Default constructor.
 
 cLinearOperatorBase (const LongIndexType num_rows_, const LongIndexType num_columns_)
 Constructor with setting num_rows and num_columns.
 
virtual ~cLinearOperatorBase ()
 Destructor.
 
LongIndexType get_num_rows () const
 Returns the number of rows of the matrix.
 
LongIndexType get_num_columns () const
 Returns the number of columns of the matrix.
 
IndexType get_num_parameters () const
 Returns the number of parameters of the linear operator.
 
FlagType is_eigenvalue_relation_known () const
 Returns a flag that determines whether a relation between the parameters of the operator and its eigenvalue(s) is known.
 

Protected Member Functions

virtual void copy_host_to_device ()
 Copies the member data from the host memory to the device memory.
 
void allocate_buffer (const int device_id, cusparseOperation_t cusparse_operation, const DataType alpha, const DataType beta, cusparseDnVecDescr_t &cusparse_input_vector, cusparseDnVecDescr_t &cusparse_output_vector, cusparseSpMVAlg_t algorithm)
 Allocates an external buffer for matrix-vector multiplication using cusparseSpMV function.
 
- Protected Member Functions inherited from cuLinearOperator< DataType >
int query_gpu_devices () const
 Before any numerical computation, this method checks whether any GPU device is available on the machine, and notifies the user if none is found.
 
void initialize_cublas_handle ()
 Creates a cublasHandle_t object, if not created already.
 
void initialize_cusparse_handle ()
 Creates a cusparseHandle_t object, if not created already.
 

Protected Attributes

const DataType * A_data
 
const LongIndexType * A_indices
 
const LongIndexType * A_index_pointer
 
DataType ** device_A_data
 
LongIndexType ** device_A_indices
 
LongIndexType ** device_A_index_pointer
 
void ** device_buffer
 
size_t * device_buffer_num_bytes
 
cusparseSpMatDescr_t * cusparse_matrix_A
 
- Protected Attributes inherited from cuMatrix< DataType >
FlagType A_is_symmetric
 
- Protected Attributes inherited from cuLinearOperator< DataType >
int num_gpu_devices
 
bool copied_host_to_device
 
cublasHandle_t * cublas_handle
 
cusparseHandle_t * cusparse_handle
 
DataType * parameters
 
- Protected Attributes inherited from cLinearOperatorBase
const LongIndexType num_rows
 
const LongIndexType num_columns
 
FlagType eigenvalue_relation_known
 
IndexType num_parameters
 

Detailed Description

template<typename DataType>
class cuCSRMatrix< DataType >

Container for CSR matrices.

The cuCSRMatrix class holds a two-dimensional compressed sparse row (CSR) matrix and can perform the matrix-vector product and the transposed matrix-vector product.

See also
cuMatrix, cuDenseMatrix, cuCSCMatrix, cuCSRAffineMatrixFunction, cCSRMatrix

Definition at line 44 of file cu_csr_matrix.h.

Constructor & Destructor Documentation

◆ cuCSRMatrix() [1/2]

template<typename DataType >
cuCSRMatrix< DataType >::cuCSRMatrix ( )

Default constructor.

Definition at line 41 of file cu_csr_matrix.cu.

41 :
42 A_data(NULL),
43 A_indices(NULL),
44 A_index_pointer(NULL),
45 device_A_data(NULL),
46 device_A_indices(NULL),
47 device_A_index_pointer(NULL),
48 device_buffer(NULL),
49 device_buffer_num_bytes(NULL),
50 cusparse_matrix_A(NULL)
51{
52}

◆ cuCSRMatrix() [2/2]

template<typename DataType >
cuCSRMatrix< DataType >::cuCSRMatrix ( const DataType *  A_data_,
const LongIndexType *  A_indices_,
const LongIndexType *  A_index_pointer_,
const LongIndexType  num_rows_,
const LongIndexType  num_columns_,
const FlagType  A_is_symmetric_,
const int  num_gpu_devices_ 
)

Constructor.

Parameters
[in] A_data_: 1D array of the nonzero values of the sparse matrix. Its size is the nnz of the matrix.
[in] A_indices_: 1D array giving the column index of each element in A_data_. Its size is the nnz of the matrix.
[in] A_index_pointer_: 1D array pointing to the start of each row in A_indices_. Its size is num_rows+1. Its first element is 0 and its last element is the nnz of the matrix.
[in] num_rows_: Number of rows of A.
[in] num_columns_: Number of columns of A.
[in] A_is_symmetric_: Boolean. If A is symmetric, set this value to 1, otherwise 0.
[in] num_gpu_devices_: Number of GPU devices to be utilized for parallel processing.
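
To make the three-array layout concrete, the following host-only sketch encodes a small 3 x 4 matrix in CSR form and checks the size and ordering rules stated for the constructor arguments. The names CsrArrays, make_example_csr, and csr_is_consistent are hypothetical helpers for this illustration, LongIndexType is assumed to be int, and nothing here touches the GPU or the cuCSRMatrix class itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef int LongIndexType;  // assumption for this sketch

// Hypothetical holder for the three CSR arrays (not a library type).
struct CsrArrays
{
    std::vector<double> data;                  // nonzero values, size nnz
    std::vector<LongIndexType> indices;        // column of each value, size nnz
    std::vector<LongIndexType> index_pointer;  // row starts, size num_rows + 1
};

// CSR encoding of the 3 x 4 matrix
//   [ 1 0 2 0 ]
//   [ 0 0 3 0 ]
//   [ 4 5 0 6 ]
CsrArrays make_example_csr()
{
    CsrArrays A;
    A.data = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
    A.indices = {0, 2, 2, 0, 1, 3};
    A.index_pointer = {0, 2, 3, 6};
    return A;
}

// Checks the array sizes, the first and last index pointer, and index ranges.
bool csr_is_consistent(
        const CsrArrays& A,
        const LongIndexType num_rows,
        const LongIndexType num_columns)
{
    assert(num_rows >= 0 && num_columns >= 0);

    if (A.index_pointer.size() != static_cast<std::size_t>(num_rows) + 1)
    {
        return false;
    }
    if (A.index_pointer.front() != 0)
    {
        return false;
    }

    // The last index pointer is the number of stored entries (nnz).
    const LongIndexType nnz = A.index_pointer.back();
    if (nnz < 0 ||
        A.data.size() != static_cast<std::size_t>(nnz) ||
        A.indices.size() != static_cast<std::size_t>(nnz))
    {
        return false;
    }

    // Row starts must be non-decreasing.
    for (LongIndexType row = 0; row < num_rows; ++row)
    {
        if (A.index_pointer[row] > A.index_pointer[row + 1])
        {
            return false;
        }
    }

    // Every column index must address an existing column.
    for (const LongIndexType column : A.indices)
    {
        if (column < 0 || column >= num_columns)
        {
            return false;
        }
    }
    return true;
}
```

Row 2 of the example starts at index_pointer[2] = 3 and ends before index_pointer[3] = 6, so it owns the stored values 4, 5, 6 in columns 0, 1, 3.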

Definition at line 83 of file cu_csr_matrix.cu.

90 :
91
92 // Base class constructor
93 cLinearOperatorBase(num_rows_, num_columns_),
94 cuLinearOperator<DataType>(num_gpu_devices_),
95 cuMatrix<DataType>(A_is_symmetric_),
96
97 // Initializer list
98 A_data(A_data_),
99 A_indices(A_indices_),
100 A_index_pointer(A_index_pointer_),
101 device_A_data(NULL),
102 device_A_indices(NULL),
103 device_A_index_pointer(NULL),
104 device_buffer(NULL),
106{
107 this->initialize_cusparse_handle();
108 this->copy_host_to_device();
109
110 // Initialize device buffer
111 this->device_buffer = new void*[this->num_gpu_devices];
112 this->device_buffer_num_bytes = new size_t[this->num_gpu_devices];
113 for (int device_id=0; device_id < this->num_gpu_devices; ++device_id)
114 {
115 this->device_buffer[device_id] = NULL;
116 this->device_buffer_num_bytes[device_id] = 0;
117 }
118}

References cuCSRMatrix< DataType >::copy_host_to_device(), cuCSRMatrix< DataType >::device_buffer, cuCSRMatrix< DataType >::device_buffer_num_bytes, cuLinearOperator< DataType >::initialize_cusparse_handle(), and cuLinearOperator< DataType >::num_gpu_devices.

◆ ~cuCSRMatrix()

template<typename DataType >
cuCSRMatrix< DataType >::~cuCSRMatrix ( )
virtual

Destructor.

Definition at line 129 of file cu_csr_matrix.cu.

130{
131 // Member objects exist if the second constructor was called.
132 if (this->copied_host_to_device)
133 {
134 // Deallocate arrays of data on gpu
135 for (int device_id=0; device_id < this->num_gpu_devices; ++device_id)
136 {
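137 // Switch to a device
138 CudaAPI<DataType>::set_device(device_id);
139
140 // Deallocate
141 CudaAPI<DataType>::del(this->device_A_data[device_id]);
142 CudaAPI<LongIndexType>::del(
143 this->device_A_indices[device_id]);
144 CudaAPI<LongIndexType>::del(
145 this->device_A_index_pointer[device_id]);
147 cusparse_api::destroy_cusparse_matrix(
148 this->cusparse_matrix_A[device_id]);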
149 }
150 }
151
152 // Deallocate arrays of pointers on cpu
153 if (this->device_A_data != NULL)
154 {
155 delete[] this->device_A_data;
156 this->device_A_data = NULL;
157 }
158
159 if (this->device_A_indices != NULL)
160 {
161 delete[] this->device_A_indices;
162 this->device_A_indices = NULL;
163 }
164
165 if (this->device_A_index_pointer != NULL)
166 {
167 delete[] this->device_A_index_pointer;
168 this->device_A_index_pointer = NULL;
169 }
170
171 if (this->device_buffer != NULL)
172 {
173 delete[] this->device_buffer;
174 this->device_buffer = NULL;
175 }
176
177 if (this->device_buffer_num_bytes != NULL)
178 {
179 delete[] this->device_buffer_num_bytes;
180 this->device_buffer_num_bytes = NULL;
181 }
182
183 if (this->cusparse_matrix_A != NULL)
184 {
185 delete[] this->cusparse_matrix_A;
186 this->cusparse_matrix_A = NULL;
187 }
188}

References CudaAPI< ArrayType >::del(), cusparse_api::destroy_cusparse_matrix(), and CudaAPI< ArrayType >::set_device().

Member Function Documentation

◆ allocate_buffer()

template<typename DataType >
void cuCSRMatrix< DataType >::allocate_buffer ( const int  device_id,
cusparseOperation_t  cusparse_operation,
const DataType  alpha,
const DataType  beta,
cusparseDnVecDescr_t &  cusparse_input_vector,
cusparseDnVecDescr_t &  cusparse_output_vector,
cusparseSpMVAlg_t  algorithm 
)
protected

Allocates an external buffer for matrix-vector multiplication using cusparseSpMV function.

If the current buffer size differs from the required buffer size, memory is allocated (or reallocated). The allocation is always performed on the first call of this function, since the buffer size is initialized to zero in the constructor. On subsequent calls, the buffer is reused if the required size is unchanged.

Parameters
[in] device_id: The ID of the GPU device, from 0 to num_gpu_devices-1.
[in] cusparse_operation: The cuSPARSE operation, which can be CUSPARSE_OPERATION_NON_TRANSPOSE or CUSPARSE_OPERATION_TRANSPOSE.
[in] alpha: Scalar. The parameter \( \alpha \) in matrix-vector multiplication.
[in] beta: Scalar. The parameter \( \beta \) in matrix-vector multiplication.
[in] cusparse_input_vector: Input vector in the matrix-vector multiplication.
[in] cusparse_output_vector: Output vector in the matrix-vector multiplication.
[in] algorithm: cuSPARSE algorithm for the sparse matrix-vector product. Possible values include CUSPARSE_SPMV_ALG_DEFAULT, CUSPARSE_SPMV_CSR_ALG1, CUSPARSE_SPMV_CSR_ALG2, etc.
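
The reallocate-only-on-size-change policy described above can be sketched on the host as follows. This is an illustration only: WorkspaceBuffer, ensure_buffer, and release_buffer are hypothetical names, and std::malloc and std::free stand in for the device allocation and deallocation that the library performs on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical host-side stand-in for one per-device workspace buffer.
struct WorkspaceBuffer
{
    void* pointer = nullptr;
    std::size_t num_bytes = 0;  // zero until the first allocation
    int num_allocations = 0;    // bookkeeping for this sketch only
};

// Reallocates only when the required size differs from the cached size.
void ensure_buffer(WorkspaceBuffer& buffer, const std::size_t required_num_bytes)
{
    if (buffer.num_bytes != required_num_bytes)
    {
        // Release the previous buffer; freeing a null pointer is a no-op.
        std::free(buffer.pointer);

        buffer.pointer = std::malloc(required_num_bytes);
        assert(buffer.pointer != nullptr || required_num_bytes == 0);

        buffer.num_bytes = required_num_bytes;
        ++buffer.num_allocations;
    }
}

// Frees the buffer and resets the cached size.
void release_buffer(WorkspaceBuffer& buffer)
{
    std::free(buffer.pointer);
    buffer.pointer = nullptr;
    buffer.num_bytes = 0;
}
```

Because the comparison is an inequality test rather than a "too small" test, a request for a smaller buffer also triggers a reallocation, while repeated requests of the same size reuse the existing buffer.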

Definition at line 306 of file cu_csr_matrix.cu.

314{
315 // Find the buffer size needed for matrix-vector multiplication
316 size_t required_buffer_size;
317 cusparse_api::cusparse_matrix_buffer_size(
318 this->cusparse_handle[device_id], cusparse_operation, alpha,
319 this->cusparse_matrix_A[device_id], cusparse_input_vector, beta,
320 cusparse_output_vector, algorithm, &required_buffer_size);
321
322 if (this->device_buffer_num_bytes[device_id] != required_buffer_size)
323 {
324 // Update the buffer size
325 this->device_buffer_num_bytes[device_id] = required_buffer_size;
326
327 // Delete buffer if it was allocated previously
328 CudaAPI<DataType>::del(this->device_buffer[device_id]);
329
330 // Allocate (or reallocate) buffer on device.
331 CudaAPI<DataType>::alloc_bytes(
332 this->device_buffer[device_id],
333 this->device_buffer_num_bytes[device_id]);
334 }
335}

References CudaAPI< ArrayType >::alloc_bytes(), cusparse_api::cusparse_matrix_buffer_size(), and CudaAPI< ArrayType >::del().

◆ copy_host_to_device()

template<typename DataType >
void cuCSRMatrix< DataType >::copy_host_to_device ( )
protected virtual

Copies the member data from the host memory to the device memory.

Implements cuMatrix< DataType >.

Definition at line 199 of file cu_csr_matrix.cu.

200{
201 if (!this->copied_host_to_device)
202 {
203 // Set the number of threads
204 #if defined(USE_OPENMP) && (USE_OPENMP == 1)
206 #endif
207
208 // Array sizes
209 LongIndexType A_data_size = this->get_nnz();
210 LongIndexType A_indices_size = A_data_size;
211 LongIndexType A_index_pointer_size = this->num_rows + 1;
212 LongIndexType A_nnz = this->get_nnz();
213
214 // Create array of pointers for data on each gpu device
215 this->device_A_data = new DataType*[this->num_gpu_devices];
216 this->device_A_indices = new LongIndexType*[this->num_gpu_devices];
217 this->device_A_index_pointer = \
218 new LongIndexType*[this->num_gpu_devices];
219 this->cusparse_matrix_A = \
220 new cusparseSpMatDescr_t[this->num_gpu_devices];
221
222 #if defined(USE_OPENMP) && (USE_OPENMP == 1)
223 #pragma omp parallel
224 #endif
225 {
226 // Switch to a device with the same device id as the cpu thread id
227 unsigned int thread_id;
228 #if defined(USE_OPENMP) && (USE_OPENMP == 1)
229 thread_id = omp_get_thread_num();
230 #else
231 thread_id = 0;
232 #endif
233
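234 CudaAPI<DataType>::set_device(thread_id);
235
236 // A_data
237 CudaAPI<DataType>::alloc(this->device_A_data[thread_id],
238 A_data_size);
239 CudaAPI<DataType>::copy_to_device(
240 this->A_data, A_data_size, this->device_A_data[thread_id]);
241
242 // A_indices
243 CudaAPI<LongIndexType>::alloc(
244 this->device_A_indices[thread_id], A_indices_size);
245 CudaAPI<LongIndexType>::copy_to_device(
246 this->A_indices, A_indices_size,
247 this->device_A_indices[thread_id]);
248
249 // A_index_pointer
250 CudaAPI<LongIndexType>::alloc(
251 this->device_A_index_pointer[thread_id],
252 A_index_pointer_size);
253 CudaAPI<LongIndexType>::copy_to_device(
254 this->A_index_pointer, A_index_pointer_size,
255 this->device_A_index_pointer[thread_id]);
256
257 // Create cusparse matrix
258 cusparse_api::create_cusparse_csr_matrix(
259 this->cusparse_matrix_A[thread_id], this->num_rows,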
260 this->num_columns, A_nnz, this->device_A_data[thread_id],
261 this->device_A_indices[thread_id],
262 this->device_A_index_pointer[thread_id]);
263 }
264
265 // Flag to prevent reinitialization
266 this->copied_host_to_device = true;
267 }
268}

References CudaAPI< ArrayType >::alloc(), CudaAPI< ArrayType >::copy_to_device(), cusparse_api::create_cusparse_csr_matrix(), omp_get_thread_num(), omp_set_num_threads(), and CudaAPI< ArrayType >::set_device().

Referenced by cuCSRMatrix< DataType >::cuCSRMatrix().

◆ dot()

template<typename DataType >
void cuCSRMatrix< DataType >::dot ( const DataType *  device_vector,
DataType *  device_product 
)
virtual

Matrix vector product.

Performs the matrix vector product \( \boldsymbol{y} = \mathbf{A} \boldsymbol{x} \).

Parameters
[in] device_vector: A one-dimensional input vector \( \boldsymbol{x} \) with the size of the number of columns of the matrix \( \mathbf{A} \). This array should be on the GPU device.
[out] device_product: A one-dimensional output vector \( \boldsymbol{y} \) with the size of the number of rows of \( \mathbf{A} \). This vector will be overwritten. This array should be on the GPU device.
See also
cuCSRMatrix::dot_plus, cuCSRMatrix::transpose_dot, cuCSRMatrix::transpose_dot_plus
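
As a point of comparison for the GPU result, the product \( \boldsymbol{y} = \mathbf{A} \boldsymbol{x} \) can be computed on the host directly from the three CSR arrays. The sketch below is a plain CPU reference, not the library's implementation (which delegates to cuSPARSE); csr_dot is a hypothetical helper and LongIndexType is assumed to be int.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef int LongIndexType;  // assumption for this sketch

// Hypothetical host reference for y = A x, with A stored in CSR form.
std::vector<double> csr_dot(
        const std::vector<double>& A_data,
        const std::vector<LongIndexType>& A_indices,
        const std::vector<LongIndexType>& A_index_pointer,
        const std::vector<double>& x)
{
    assert(!A_index_pointer.empty());
    const std::size_t num_rows = A_index_pointer.size() - 1;

    std::vector<double> y(num_rows, 0.0);
    for (std::size_t row = 0; row < num_rows; ++row)
    {
        // Accumulate the stored entries of this row against x.
        double sum = 0.0;
        for (LongIndexType p = A_index_pointer[row];
             p < A_index_pointer[row + 1];
             ++p)
        {
            sum += A_data[p] * x[A_indices[p]];
        }
        y[row] = sum;
    }
    return y;
}
```

Copying device_product back to the host and comparing it against such a reference is one way to sanity-check a small test case.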

Implements cuLinearOperator< DataType >.

Definition at line 449 of file cu_csr_matrix.cu.

452{
453 assert(this->copied_host_to_device);
454
455 // Create cusparse vector for the input vector
456 cusparseDnVecDescr_t cusparse_input_vector;
457 cusparse_api::create_cusparse_vector(
458 cusparse_input_vector, this->num_columns,
459 const_cast<DataType*>(device_vector));
460
461 // Create cusparse vector for the output vector
462 cusparseDnVecDescr_t cusparse_output_vector;
463 cusparse_api::create_cusparse_vector(
464 cusparse_output_vector, this->num_rows, device_product);
465
466 // Matrix vector settings
467 DataType alpha = cu_arithmetics::cast<float, DataType>(1.0f);
468 DataType beta = cu_arithmetics::cast<float, DataType>(0.0f);
469 cusparseOperation_t cusparse_operation = CUSPARSE_OPERATION_NON_TRANSPOSE;
470 cusparseSpMVAlg_t algorithm = CUSPARSE_SPMV_ALG_DEFAULT;
471
472 // Get device id
473 int device_id = CudaAPI<DataType>::get_device();
474
475 // Allocate device buffer (or reallocation if needed)
476 this->allocate_buffer(device_id, cusparse_operation, alpha, beta,
477 cusparse_input_vector, cusparse_output_vector,
478 algorithm);
479
480 // Matrix vector multiplication
481 cusparse_api::cusparse_matvec(
482 this->cusparse_handle[device_id], cusparse_operation, alpha,
483 this->cusparse_matrix_A[device_id], cusparse_input_vector, beta,
484 cusparse_output_vector, algorithm, this->device_buffer[device_id]);
485
486 // Destroy cusparse vectors
487 cusparse_api::destroy_cusparse_vector(cusparse_input_vector);
488 cusparse_api::destroy_cusparse_vector(cusparse_output_vector);
489}

References cu_arithmetics::abs(), cusparse_api::create_cusparse_vector(), cusparse_api::cusparse_matvec(), CUSPARSE_SPMV_ALG_DEFAULT, cusparse_api::destroy_cusparse_vector(), and CudaAPI< ArrayType >::get_device().

◆ dot_plus()

template<typename DataType >
void cuCSRMatrix< DataType >::dot_plus ( const DataType *  device_vector,
const DataType  alpha,
DataType *  device_product 
)
virtual

Matrix vector product written in place.

Performs the matrix vector product \( \boldsymbol{y} = \boldsymbol{y} + \alpha \mathbf{A} \boldsymbol{x} \).

Parameters
[in] device_vector: A one-dimensional input vector \( \boldsymbol{x} \) with the size of the number of columns of the matrix \( \mathbf{A} \). This array should be on the GPU device.
[in] alpha: A scalar.
[in,out] device_product: A one-dimensional vector \( \boldsymbol{y} \) with the size of the number of rows of \( \mathbf{A} \). Its existing content is read and then updated in place. This array should be on the GPU device.
See also
cuCSRMatrix::dot, cuCSRMatrix::transpose_dot, cuCSRMatrix::transpose_dot_plus
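
The in-place update is the general sparse matrix-vector operation \( \boldsymbol{y} = \alpha \mathbf{A} \boldsymbol{x} + \beta \boldsymbol{y} \) with \( \beta = 1 \), which is how the listing below configures the call. The following host-only sketch shows that relationship; csr_spmv and csr_dot_plus are hypothetical helpers written for this illustration, and LongIndexType is assumed to be int.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef int LongIndexType;  // assumption for this sketch

// Hypothetical host reference for y = alpha * A * x + beta * y (A in CSR form).
void csr_spmv(
        const double alpha,
        const std::vector<double>& A_data,
        const std::vector<LongIndexType>& A_indices,
        const std::vector<LongIndexType>& A_index_pointer,
        const std::vector<double>& x,
        const double beta,
        std::vector<double>& y)
{
    assert(A_index_pointer.size() == y.size() + 1);

    for (std::size_t row = 0; row < y.size(); ++row)
    {
        double sum = 0.0;
        for (LongIndexType p = A_index_pointer[row];
             p < A_index_pointer[row + 1];
             ++p)
        {
            sum += A_data[p] * x[A_indices[p]];
        }

        // With beta == 0 the previous content of y is not read at all.
        y[row] = (beta == 0.0) ? alpha * sum : alpha * sum + beta * y[row];
    }
}

// In-place update y = y + alpha * A * x, that is, beta fixed to one.
void csr_dot_plus(
        const std::vector<double>& A_data,
        const std::vector<LongIndexType>& A_indices,
        const std::vector<LongIndexType>& A_index_pointer,
        const std::vector<double>& x,
        const double alpha,
        std::vector<double>& y)
{
    csr_spmv(alpha, A_data, A_indices, A_index_pointer, x, 1.0, y);
}
```

The same helper with alpha = 1 and beta = 0 reproduces the plain product, which matches the scalar settings used by dot.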

Implements cuMatrix< DataType >.

Definition at line 517 of file cu_csr_matrix.cu.

521{
522 assert(this->copied_host_to_device);
523
524 // Create cusparse vector for the input vector
525 cusparseDnVecDescr_t cusparse_input_vector;
526 cusparse_api::create_cusparse_vector(
527 cusparse_input_vector, this->num_columns,
528 const_cast<DataType*>(device_vector));
529
530 // Create cusparse vector for the output vector
531 cusparseDnVecDescr_t cusparse_output_vector;
532 cusparse_api::create_cusparse_vector(
533 cusparse_output_vector, this->num_rows, device_product);
534
535 // Matrix vector settings
536 DataType beta = cu_arithmetics::cast<float, DataType>(1.0f);
537 cusparseOperation_t cusparse_operation = CUSPARSE_OPERATION_NON_TRANSPOSE;
538 cusparseSpMVAlg_t algorithm = CUSPARSE_SPMV_ALG_DEFAULT;
539
540 // Get device id
541 int device_id = CudaAPI<DataType>::get_device();
542
543 // Allocate device buffer (or reallocation if needed)
544 this->allocate_buffer(device_id, cusparse_operation, alpha, beta,
545 cusparse_input_vector, cusparse_output_vector,
546 algorithm);
547
548 // Matrix vector multiplication
549 cusparse_api::cusparse_matvec(
550 this->cusparse_handle[device_id], cusparse_operation, alpha,
551 this->cusparse_matrix_A[device_id], cusparse_input_vector, beta,
552 cusparse_output_vector, algorithm, this->device_buffer[device_id]);
553
554 // Destroy cusparse vectors
555 cusparse_api::destroy_cusparse_vector(cusparse_input_vector);
556 cusparse_api::destroy_cusparse_vector(cusparse_output_vector);
557}

References cu_arithmetics::abs(), cusparse_api::create_cusparse_vector(), cusparse_api::cusparse_matvec(), CUSPARSE_SPMV_ALG_DEFAULT, cusparse_api::destroy_cusparse_vector(), and CudaAPI< ArrayType >::get_device().

◆ get_nnz()

template<typename DataType >
LongIndexType cuCSRMatrix< DataType >::get_nnz ( ) const

Returns the number of non-zero elements of the sparse matrix.

The nnz of a CSR matrix can be obtained from the last element of A_index_pointer. The size of array A_index_pointer is one plus the number of rows of the matrix.

Returns
The nnz of the matrix.
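
A minimal host-side sketch of this rule, together with the per-row count that follows from the same array, is given below. The helpers csr_nnz and csr_row_nnz are hypothetical names for this illustration and LongIndexType is assumed to be int.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef int LongIndexType;  // assumption for this sketch

// Total number of stored entries: the last element of the index pointer.
LongIndexType csr_nnz(const std::vector<LongIndexType>& A_index_pointer)
{
    assert(!A_index_pointer.empty());
    return A_index_pointer.back();
}

// Number of stored entries in one row: difference of consecutive pointers.
LongIndexType csr_row_nnz(
        const std::vector<LongIndexType>& A_index_pointer,
        const LongIndexType row)
{
    assert(row >= 0);
    assert(static_cast<std::size_t>(row) + 1 < A_index_pointer.size());
    return A_index_pointer[row + 1] - A_index_pointer[row];
}
```

For a matrix with three rows and index pointer {0, 2, 3, 6}, the last element gives six stored entries, of which three belong to the last row.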

Definition at line 420 of file cu_csr_matrix.cu.

421{
422 return this->A_index_pointer[this->num_rows];
423}

◆ is_identity_matrix()

template<typename DataType >
FlagType cuCSRMatrix< DataType >::is_identity_matrix ( ) const
virtual

Checks whether the matrix is identity.

The identity check is primarily performed in the cAffineMatrixFunction class.

Returns
Returns 1 if the input matrix is identity, and 0 otherwise.
See also
cAffineMatrixFunction
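
The element-wise test can be illustrated with a simplified host-only sketch that visits every stored entry and compares it with 1 on the diagonal and 0 elsewhere. It is not the library routine: csr_is_identity is a hypothetical helper, the symmetry shortcut and OpenMP parallelism of the listing below are left out, the tolerance value is an assumption, and LongIndexType is assumed to be int.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

typedef int LongIndexType;  // assumption for this sketch

// Hypothetical host check: every stored entry must equal 1 on the diagonal
// and 0 off the diagonal, within an assumed tolerance. Only stored entries
// are visited, so a diagonal entry that is absent from the CSR arrays is
// not detected by this check.
bool csr_is_identity(
        const std::vector<double>& A_data,
        const std::vector<LongIndexType>& A_indices,
        const std::vector<LongIndexType>& A_index_pointer,
        const double tolerance = 1e-12)
{
    assert(!A_index_pointer.empty());
    const LongIndexType num_rows =
        static_cast<LongIndexType>(A_index_pointer.size()) - 1;

    for (LongIndexType row = 0; row < num_rows; ++row)
    {
        for (LongIndexType p = A_index_pointer[row];
             p < A_index_pointer[row + 1];
             ++p)
        {
            const double expected = (A_indices[p] == row) ? 1.0 : 0.0;
            if (std::fabs(A_data[p] - expected) > tolerance)
            {
                return false;
            }
        }
    }
    return true;
}
```

Explicitly stored zeros off the diagonal are accepted, since they do not change the matrix that the arrays represent.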

Implements cuMatrix< DataType >.

Definition at line 352 of file cu_csr_matrix.cu.

353{
354 FlagType matrix_is_identity = 1;
355 LongIndexType index_pointer;
356 LongIndexType column;
357 DataType matrix_element;
358 const DataType diagonal = 1.0;
359 const DataType off_diagonal = 0.0;
360
361 // Check matrix element-wise
362 #if defined(USE_OPENMP) && (USE_OPENMP == 1)
363 #pragma omp parallel for \
364 schedule(static) \
365 if (!omp_in_parallel()) \
366 default(none) \
367 shared(matrix_is_identity, diagonal, off_diagonal) \
368 private(index_pointer, column, matrix_element)
369 #endif
370 for (LongIndexType row=0; row < this->num_rows; ++row)
371 {
372 if (matrix_is_identity)
373 {
374 for (index_pointer=this->A_index_pointer[row];
375 index_pointer < this->A_index_pointer[row+1];
376 ++index_pointer)
377 {
378 column = this->A_indices[index_pointer];
379
380 if (!((this->A_is_symmetric) && (column >= row)))
381 {
382 matrix_element = this->A_data[index_pointer];
383
384 if (((row == column) && \
385 (!cu_arithmetics::is_equal(matrix_element,
386 diagonal))) || \
387 ((row != column) && \
388 (!cu_arithmetics::is_equal(matrix_element,
389 off_diagonal))))
390 {
391 #if defined(USE_OPENMP) && (USE_OPENMP == 1)
392 #pragma omp atomic write
393 #endif
394 matrix_is_identity = 0;
395
396 break;
397 }
398 }
399 }
400 }
401 }
402
403 return matrix_is_identity;
404}

References cu_arithmetics::is_equal().

◆ transpose_dot()

template<typename DataType >
void cuCSRMatrix< DataType >::transpose_dot ( const DataType *  device_vector,
DataType *  device_product 
)
virtual

Transposed-matrix vector product.

Performs the matrix vector product \( \boldsymbol{y} = \mathbf{A}^{\intercal} \boldsymbol{x} \).

Parameters
[in] device_vector: A one-dimensional input vector \( \boldsymbol{x} \) with the size of the number of columns of the matrix \( \mathbf{A} \). This array should be on the GPU device.
[out] device_product: A one-dimensional output vector \( \boldsymbol{y} \) with the size of the number of rows of \( \mathbf{A} \). This vector will be overwritten. This array should be on the GPU device.
See also
cuCSRMatrix::dot_plus, cuCSRMatrix::dot, cuCSRMatrix::transpose_dot_plus
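
With CSR storage, the transposed product \( \boldsymbol{y} = \mathbf{A}^{\intercal} \boldsymbol{x} \) does not require forming the transpose: each stored entry in row i and column j contributes its value times x[i] to y[j]. The host-only sketch below illustrates this scatter form; it is a CPU reference rather than the cuSPARSE code path used by the library, csr_transpose_dot is a hypothetical helper, the output length is passed explicitly, and LongIndexType is assumed to be int.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef int LongIndexType;  // assumption for this sketch

// Hypothetical host reference for y = A^T x, with A stored in CSR form.
std::vector<double> csr_transpose_dot(
        const std::vector<double>& A_data,
        const std::vector<LongIndexType>& A_indices,
        const std::vector<LongIndexType>& A_index_pointer,
        const std::vector<double>& x,
        const LongIndexType output_size)
{
    assert(output_size >= 0);
    assert(A_index_pointer.size() == x.size() + 1);

    std::vector<double> y(static_cast<std::size_t>(output_size), 0.0);
    for (std::size_t row = 0; row < x.size(); ++row)
    {
        // Scatter: entry (row, column) of A is entry (column, row) of A^T.
        for (LongIndexType p = A_index_pointer[row];
             p < A_index_pointer[row + 1];
             ++p)
        {
            y[A_indices[p]] += A_data[p] * x[row];
        }
    }
    return y;
}
```

For a symmetric matrix this returns the same vector as the non-transposed product, which gives a simple consistency check between dot and transpose_dot.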

Implements cuLinearOperator< DataType >.

Definition at line 583 of file cu_csr_matrix.cu.

586{
587 assert(this->copied_host_to_device);
588
589 // Create cusparse vector for the input vector
590 cusparseDnVecDescr_t cusparse_input_vector;
591 cusparse_api::create_cusparse_vector(
592 cusparse_input_vector, this->num_columns,
593 const_cast<DataType*>(device_vector));
594
595 // Create cusparse vector for the output vector
596 cusparseDnVecDescr_t cusparse_output_vector;
597 cusparse_api::create_cusparse_vector(
598 cusparse_output_vector, this->num_rows, device_product);
599
600 // Matrix vector settings
601 DataType alpha = cu_arithmetics::cast<float, DataType>(1.0f);
602 DataType beta = cu_arithmetics::cast<float, DataType>(0.0f);
603 cusparseOperation_t cusparse_operation = CUSPARSE_OPERATION_TRANSPOSE;
604 cusparseSpMVAlg_t algorithm = CUSPARSE_SPMV_ALG_DEFAULT;
605
606 // Get device id
607 int device_id = CudaAPI<DataType>::get_device();
608
609 // Allocate device buffer (or reallocation if needed)
610 this->allocate_buffer(device_id, cusparse_operation, alpha, beta,
611 cusparse_input_vector, cusparse_output_vector,
612 algorithm);
613
614 // Matrix vector multiplication
615 cusparse_api::cusparse_matvec(
616 this->cusparse_handle[device_id], cusparse_operation, alpha,
617 this->cusparse_matrix_A[device_id], cusparse_input_vector, beta,
618 cusparse_output_vector, algorithm, this->device_buffer[device_id]);
619
620 // Destroy cusparse vectors
621 cusparse_api::destroy_cusparse_vector(cusparse_input_vector);
622 cusparse_api::destroy_cusparse_vector(cusparse_output_vector);
623}

References cu_arithmetics::abs(), cusparse_api::create_cusparse_vector(), cusparse_api::cusparse_matvec(), CUSPARSE_SPMV_ALG_DEFAULT, cusparse_api::destroy_cusparse_vector(), and CudaAPI< ArrayType >::get_device().

◆ transpose_dot_plus()

template<typename DataType >
void cuCSRMatrix< DataType >::transpose_dot_plus ( const DataType *  device_vector,
const DataType  alpha,
DataType *  device_product 
)
virtual

Transposed-matrix vector product written in place.

Performs the matrix vector product \( \boldsymbol{y} = \boldsymbol{y} + \alpha \mathbf{A}^{\intercal} \boldsymbol{x} \).

Parameters
[in] device_vector: A one-dimensional input vector \( \boldsymbol{x} \) with the size of the number of columns of the matrix \( \mathbf{A} \). This array should be on the GPU device.
[in] alpha: A scalar.
[in,out] device_product: A one-dimensional vector \( \boldsymbol{y} \) with the size of the number of rows of \( \mathbf{A} \). Its existing content is read and then updated in place. This array should be on the GPU device.
See also
cuCSRMatrix::dot_plus, cuCSRMatrix::transpose_dot, cuCSRMatrix::dot

Implements cuMatrix< DataType >.

Definition at line 652 of file cu_csr_matrix.cu.

656{
657 assert(this->copied_host_to_device);
658
659 // Create cusparse vector for the input vector
660 cusparseDnVecDescr_t cusparse_input_vector;
661 cusparse_api::create_cusparse_vector(
662 cusparse_input_vector, this->num_columns,
663 const_cast<DataType*>(device_vector));
664
665 // Create cusparse vector for the output vector
666 cusparseDnVecDescr_t cusparse_output_vector;
667 cusparse_api::create_cusparse_vector(
668 cusparse_output_vector, this->num_rows, device_product);
669
670 // Matrix vector settings
671 DataType beta = cu_arithmetics::cast<float, DataType>(1.0f);
672 cusparseOperation_t cusparse_operation = CUSPARSE_OPERATION_TRANSPOSE;
673 cusparseSpMVAlg_t algorithm = CUSPARSE_SPMV_ALG_DEFAULT;
674
675 // Get device id
676 int device_id = CudaAPI<DataType>::get_device();
677
678 // Allocate device buffer (or reallocation if needed)
679 this->allocate_buffer(device_id, cusparse_operation, alpha, beta,
680 cusparse_input_vector, cusparse_output_vector,
681 algorithm);
682
683 // Matrix vector multiplication
684 cusparse_api::cusparse_matvec(
685 this->cusparse_handle[device_id], cusparse_operation, alpha,
686 this->cusparse_matrix_A[device_id], cusparse_input_vector, beta,
687 cusparse_output_vector, algorithm, this->device_buffer[device_id]);
688
689 // Destroy cusparse vectors
690 cusparse_api::destroy_cusparse_vector(cusparse_input_vector);
691 cusparse_api::destroy_cusparse_vector(cusparse_output_vector);
692}

References cu_arithmetics::abs(), cusparse_api::create_cusparse_vector(), cusparse_api::cusparse_matvec(), CUSPARSE_SPMV_ALG_DEFAULT, cusparse_api::destroy_cusparse_vector(), and CudaAPI< ArrayType >::get_device().

Member Data Documentation

◆ A_data

template<typename DataType >
const DataType* cuCSRMatrix< DataType >::A_data
protected

Definition at line 99 of file cu_csr_matrix.h.

◆ A_index_pointer

template<typename DataType >
const LongIndexType* cuCSRMatrix< DataType >::A_index_pointer
protected

Definition at line 101 of file cu_csr_matrix.h.

◆ A_indices

template<typename DataType >
const LongIndexType* cuCSRMatrix< DataType >::A_indices
protected

Definition at line 100 of file cu_csr_matrix.h.

◆ cusparse_matrix_A

template<typename DataType >
cusparseSpMatDescr_t* cuCSRMatrix< DataType >::cusparse_matrix_A
protected

Definition at line 107 of file cu_csr_matrix.h.

◆ device_A_data

template<typename DataType >
DataType** cuCSRMatrix< DataType >::device_A_data
protected

Definition at line 102 of file cu_csr_matrix.h.

◆ device_A_index_pointer

template<typename DataType >
LongIndexType** cuCSRMatrix< DataType >::device_A_index_pointer
protected

Definition at line 104 of file cu_csr_matrix.h.

◆ device_A_indices

template<typename DataType >
LongIndexType** cuCSRMatrix< DataType >::device_A_indices
protected

Definition at line 103 of file cu_csr_matrix.h.

◆ device_buffer

template<typename DataType >
void** cuCSRMatrix< DataType >::device_buffer
protected

Definition at line 105 of file cu_csr_matrix.h.

Referenced by cuCSRMatrix< DataType >::cuCSRMatrix().

◆ device_buffer_num_bytes

template<typename DataType >
size_t* cuCSRMatrix< DataType >::device_buffer_num_bytes
protected

Definition at line 106 of file cu_csr_matrix.h.

Referenced by cuCSRMatrix< DataType >::cuCSRMatrix().


The documentation for this class was generated from the following files: