imate
C++/CUDA Reference
cuDenseMatrix< DataType > Class Template Reference

Container for dense matrices.

#include <cu_dense_matrix.h>


Public Member Functions

 cuDenseMatrix ()
 Default constructor.
 
 cuDenseMatrix (const DataType *A_, const LongIndexType num_rows_, const LongIndexType num_columns_, const FlagType A_is_row_major_, const FlagType A_is_symmetric_, const int num_gpu_devices_)
 Constructor.
 
virtual ~cuDenseMatrix ()
 Destructor. This function removes data from GPU devices.
 
virtual FlagType is_identity_matrix () const
 Checks whether the matrix is identity.
 
virtual void dot (const DataType *device_vector, DataType *device_product)
 Matrix vector product.
 
virtual void dot_plus (const DataType *device_vector, const DataType alpha, DataType *device_product)
 Matrix vector product written in place.
 
virtual void transpose_dot (const DataType *device_vector, DataType *device_product)
 Transposed-matrix vector product.
 
virtual void transpose_dot_plus (const DataType *device_vector, const DataType alpha, DataType *device_product)
 Transposed-matrix vector product written in place.
 
- Public Member Functions inherited from cuMatrix< DataType >
 cuMatrix ()
 Default constructor.
 
 cuMatrix (const FlagType A_is_symmetric_)
 Constructor.
 
virtual ~cuMatrix ()
 Destructor.
 
DataType get_eigenvalue (const DataType *known_parameters, const DataType known_eigenvalue, const DataType *inquiry_parameters) const
 Implements the pure virtual function of the base class. This function has no use in this class; it is implemented only so that the class can be instantiated.
 
virtual void set_symmetry (const FlagType symmetric)
 Specifies whether the matrix is symmetric or non-symmetric.
 
- Public Member Functions inherited from cuLinearOperator< DataType >
 cuLinearOperator ()
 Default constructor.
 
 cuLinearOperator (const int num_gpu_devices_)
 Constructor with setting num_gpu_devices.
 
virtual ~cuLinearOperator ()
 Destructor.
 
cublasHandle_t get_cublas_handle () const
 This function returns a reference to the cublasHandle_t object. The object will be created, if it is not created already.
 
void set_parameters (DataType *parameters_)
 Sets the scalar parameter this->parameters. Parameter is initialized to NULL. However, before calling dot or transpose_dot functions, the parameters must be set.
 
- Public Member Functions inherited from cLinearOperatorBase
 cLinearOperatorBase ()
 Default constructor.
 
 cLinearOperatorBase (const LongIndexType num_rows_, const LongIndexType num_columns_)
 Constructor with setting num_rows and num_columns.
 
virtual ~cLinearOperatorBase ()
 Destructor.
 
LongIndexType get_num_rows () const
 Returns the number of rows of the matrix.
 
LongIndexType get_num_columns () const
 Returns the number of columns of the matrix.
 
IndexType get_num_parameters () const
 Returns the number of parameters of the linear operator.
 
FlagType is_eigenvalue_relation_known () const
 Returns a flag that determines whether a relation between the parameters of the operator and its eigenvalue(s) is known.
 

Protected Member Functions

virtual void copy_host_to_device ()
 Copies the member data from the host memory to the device memory.
 
- Protected Member Functions inherited from cuLinearOperator< DataType >
int query_gpu_devices () const
 Before any numerical computation, this method checks whether any GPU device is available on the machine, and notifies the user if none is found.
 
void initialize_cublas_handle ()
 Creates a cublasHandle_t object, if not created already.
 
void initialize_cusparse_handle ()
 Creates a cusparseHandle_t object, if not created already.
 

Protected Attributes

DataType ** device_A
 
const DataType * A
 
const FlagType A_is_row_major
 
- Protected Attributes inherited from cuMatrix< DataType >
FlagType A_is_symmetric
 
- Protected Attributes inherited from cuLinearOperator< DataType >
int num_gpu_devices
 
bool copied_host_to_device
 
cublasHandle_t * cublas_handle
 
cusparseHandle_t * cusparse_handle
 
DataType * parameters
 
- Protected Attributes inherited from cLinearOperatorBase
const LongIndexType num_rows
 
const LongIndexType num_columns
 
FlagType eigenvalue_relation_known
 
IndexType num_parameters
 

Detailed Description

template<typename DataType>
class cuDenseMatrix< DataType >

Container for dense matrices.

The cuDenseMatrix class holds a two-dimensional dense matrix and can perform the matrix-vector product and the transposed matrix-vector product.

See also
cuMatrix, cuCSRMatrix, cuCSCMatrix, cuDenseAffineMatrixFunction, cDenseMatrix

Definition at line 43 of file cu_dense_matrix.h.

Constructor & Destructor Documentation

◆ cuDenseMatrix() [1/2]

template<typename DataType >
cuDenseMatrix< DataType >::cuDenseMatrix ( )

Default constructor.

Definition at line 40 of file cu_dense_matrix.cu.

:

    // Initializer list
    A(NULL),
    device_A(NULL),
    // (initializer for A_is_row_major; its value is not shown in this listing)
{
}
DataType ** device_A
const DataType * A
const FlagType A_is_row_major

◆ cuDenseMatrix() [2/2]

template<typename DataType >
cuDenseMatrix< DataType >::cuDenseMatrix ( const DataType *  A_,
const LongIndexType  num_rows_,
const LongIndexType  num_columns_,
const FlagType  A_is_row_major_,
const FlagType  A_is_symmetric_,
const int  num_gpu_devices_ 
)

Constructor.

Parameters
[in] A_: 1D array that represents a 2D dense array with either C (row) major ordering or Fortran (column) major ordering. The major ordering should be defined by the A_is_row_major_ flag.
[in] num_rows_: Number of rows of A.
[in] num_columns_: Number of columns of A.
[in] A_is_row_major_: Boolean, can be 0 or 1 as follows:
  • If A is row major (C ordering, where the last index is contiguous), this value should be set to 1.
  • If A is column major (Fortran ordering, where the first index is contiguous), this value should be set to 0.
[in] A_is_symmetric_: Boolean. If A is symmetric, set this value to 1, otherwise 0.
[in] num_gpu_devices_: Number of GPU devices to be utilized for parallelization.
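
The two orderings differ only in how an element (row, column) maps to an offset in the 1D array. The following is a minimal host-side sketch of that mapping; `get_element` is a hypothetical helper written for illustration and is not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of imate): returns element (row, column) of a
// dense matrix that is stored in a flat 1D array.
template <typename DataType>
DataType get_element(
        const std::vector<DataType>& A,
        const std::size_t num_rows,
        const std::size_t num_columns,
        const int A_is_row_major,
        const std::size_t row,
        const std::size_t column)
{
    assert(A.size() == num_rows * num_columns);
    assert(row < num_rows && column < num_columns);

    if (A_is_row_major)
    {
        // C ordering: the elements of one row are contiguous
        return A[row * num_columns + column];
    }

    // Fortran ordering: the elements of one column are contiguous
    return A[column * num_rows + row];
}
```

For the 2-by-3 matrix with rows (1, 2, 3) and (4, 5, 6), the row-major array is {1, 2, 3, 4, 5, 6} and the column-major array is {1, 4, 2, 5, 3, 6}; both return the same element for the same (row, column) pair when the flag matches the storage.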

Definition at line 77 of file cu_dense_matrix.cu.

:

    // Base class constructor
    cLinearOperatorBase(num_rows_, num_columns_),
    cuLinearOperator<DataType>(num_gpu_devices_),
    cuMatrix<DataType>(A_is_row_major_),

    // Initializer list
    A(A_),
    device_A(NULL),
    A_is_row_major(A_is_row_major_)
{
    this->initialize_cublas_handle();
    this->copy_host_to_device();
}
cLinearOperatorBase()
Default constructor.
virtual void copy_host_to_device()
Copies the member data from the host memory to the device memory.
Base class for linear operators. This class serves as interface for all derived classes.
void initialize_cublas_handle()
Creates a cublasHandle_t object, if not created already.
Base class for constant matrices.
Definition cu_matrix.h:45

References cuDenseMatrix< DataType >::copy_host_to_device(), and cuLinearOperator< DataType >::initialize_cublas_handle().


◆ ~cuDenseMatrix()

template<typename DataType >
cuDenseMatrix< DataType >::~cuDenseMatrix ( )
virtual

Destructor. This function removes data from GPU devices.

Definition at line 108 of file cu_dense_matrix.cu.

{
    // Member objects exist if the second constructor was called.
    if (this->copied_host_to_device)
    {
        // Deallocate arrays of data on gpu
        for (int device_id = 0; device_id < this->num_gpu_devices; ++device_id)
        {
            // Switch to a device
            CudaAPI<DataType>::set_device(device_id);

            // Deallocate
            CudaAPI<DataType>::del(this->device_A[device_id]);
        }

        delete[] this->device_A;
        this->device_A = NULL;
    }
}
static void set_device(int device_id)
Sets the current device in multi-gpu applications.
Definition cuda_api.cu:191
static void del(void *device_array)
Deletes memory on gpu device if its pointer is not NULL, then sets the pointer to NULL.
Definition cuda_api.cu:169

References CudaAPI< ArrayType >::del(), and CudaAPI< ArrayType >::set_device().


Member Function Documentation

◆ copy_host_to_device()

template<typename DataType >
void cuDenseMatrix< DataType >::copy_host_to_device ( )
protectedvirtual

Copies the member data from the host memory to the device memory.

Implements cuMatrix< DataType >.
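
The source listing for this function computes the number of matrix elements by casting both dimensions to size_t before multiplying. Since LongIndexType is an int on this page, multiplying the two dimensions directly could overflow for large matrices. The sketch below isolates that step; `element_count` is a hypothetical name used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of imate): number of elements of a
// num_rows-by-num_columns matrix. Each factor is widened to size_t before
// the multiplication, so the product is not computed in int arithmetic.
inline std::size_t element_count(const int num_rows, const int num_columns)
{
    assert(num_rows >= 0 && num_columns >= 0);

    return static_cast<std::size_t>(num_rows) *
           static_cast<std::size_t>(num_columns);
}
```

On a platform with a 64-bit size_t, a 100000-by-100000 matrix yields a count of 10^10 elements, which does not fit in a 32-bit int.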

Definition at line 137 of file cu_dense_matrix.cu.

{
    if (!this->copied_host_to_device)
    {
        // Set the number of threads
        #if defined(USE_OPENMP) && (USE_OPENMP == 1)
        omp_set_num_threads(this->num_gpu_devices);
        #endif

        // Create array of pointers for data on each gpu device
        this->device_A = new DataType*[this->num_gpu_devices];

        // Size of data
        size_t A_size = static_cast<size_t>(this->num_rows) * \
                        static_cast<size_t>(this->num_columns);

        #if defined(USE_OPENMP) && (USE_OPENMP == 1)
        #pragma omp parallel
        #endif
        {
            // Switch to a device with the same device id as the cpu thread id
            unsigned int thread_id;
            #if defined(USE_OPENMP) && (USE_OPENMP == 1)
            thread_id = omp_get_thread_num();
            #else
            thread_id = 0;
            #endif

            CudaAPI<DataType>::set_device(thread_id);

            // Allocate device memory and copy data from host
            CudaAPI<DataType>::alloc(this->device_A[thread_id], A_size);
            CudaAPI<DataType>::copy_to_device(this->A, A_size,
                                              this->device_A[thread_id]);
        }

        // Flag to prevent reinitialization
        this->copied_host_to_device = true;
    }
}
static ArrayType * alloc(const size_t array_size)
Allocates memory on gpu device. This function creates a pointer and returns it.
Definition cuda_api.cu:39
static void copy_to_device(const ArrayType *host_array, const size_t array_size, ArrayType *device_array)
Copies memory on host to device memory.
Definition cuda_api.cu:145
const LongIndexType num_rows
const LongIndexType num_columns
void omp_set_num_threads(int num_threads)
int omp_get_thread_num()

References CudaAPI< ArrayType >::alloc(), CudaAPI< ArrayType >::copy_to_device(), omp_get_thread_num(), omp_set_num_threads(), and CudaAPI< ArrayType >::set_device().

Referenced by cuDenseMatrix< DataType >::cuDenseMatrix().


◆ dot()

template<typename DataType >
void cuDenseMatrix< DataType >::dot ( const DataType *  device_vector,
DataType *  device_product 
)
virtual

Matrix vector product.

Performs the matrix vector product \( \boldsymbol{y} = \mathbf{A} \boldsymbol{x} \).

Parameters
[in] device_vector: A one-dimensional input vector \( \boldsymbol{x} \) with the size of the number of columns of the matrix \( \mathbf{A} \). This array should be on the GPU device.
[out] device_product: A one-dimensional output vector \( \boldsymbol{y} \) with the size of the number of rows of \( \mathbf{A} \). This vector will be overwritten. This array should be on the GPU device.
See also
cuDenseMatrix::dot_plus, cuDenseMatrix::transpose_dot, cuDenseMatrix::transpose_dot_plus
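
To check a GPU result against the host, a plain CPU reference of \( \boldsymbol{y} = \mathbf{A} \boldsymbol{x} \) can be useful. The following is a minimal sketch under the assumption that the matrix is available on the host as a flat array; `reference_dot` is a hypothetical name and this is not the library's GPU implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not the library's GPU code): computes
// y = A x for a dense matrix stored in a flat array.
template <typename DataType>
std::vector<DataType> reference_dot(
        const std::vector<DataType>& A,
        const std::size_t num_rows,
        const std::size_t num_columns,
        const int A_is_row_major,
        const std::vector<DataType>& x)
{
    assert(A.size() == num_rows * num_columns);
    assert(x.size() == num_columns);

    // The output is overwritten, so it starts from zero
    std::vector<DataType> y(num_rows, DataType(0));

    for (std::size_t row = 0; row < num_rows; ++row)
    {
        for (std::size_t column = 0; column < num_columns; ++column)
        {
            // Offset of element (row, column) for the given ordering
            const std::size_t index = A_is_row_major ?
                row * num_columns + column : column * num_rows + row;

            y[row] += A[index] * x[column];
        }
    }

    return y;
}
```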

Implements cuLinearOperator< DataType >.

Definition at line 333 of file cu_dense_matrix.cu.

{
    assert(this->copied_host_to_device);

    // Get device id
    int device_id = CudaAPI<DataType>::get_device();

    cuMatrixOperations<DataType>::dense_matvec(
            this->cublas_handle[device_id],
            this->device_A[device_id],
            device_vector,
            this->num_rows,
            this->num_columns,
            this->A_is_row_major,
            device_product);
}
static int get_device()
Gets the current device in multi-gpu applications.
Definition cuda_api.cu:209
cublasHandle_t * cublas_handle
static void dense_matvec(cublasHandle_t cublas_handle, const DataType *RESTRICT A, const DataType *RESTRICT b, const LongIndexType num_rows, const LongIndexType num_columns, const FlagType A_is_row_major, DataType *RESTRICT c)
Computes the matrix vector multiplication \( \boldsymbol{c} = \mathbf{A} \boldsymbol{b} \) where \( \mathbf{A} \) is a dense matrix.

References cuMatrixOperations< DataType >::dense_matvec(), and CudaAPI< ArrayType >::get_device().


◆ dot_plus()

template<typename DataType >
void cuDenseMatrix< DataType >::dot_plus ( const DataType *  device_vector,
const DataType  alpha,
DataType *  device_product 
)
virtual

Matrix vector product written in place.

Performs the matrix vector product \( \boldsymbol{y} = \boldsymbol{y} + \alpha \mathbf{A} \boldsymbol{x} \).

Parameters
[in] device_vector: A one-dimensional input vector \( \boldsymbol{x} \) with the size of the number of columns of the matrix \( \mathbf{A} \). This array should be on the GPU device.
[in] alpha: A scalar.
[in,out] device_product: A one-dimensional vector \( \boldsymbol{y} \) with the size of the number of rows of \( \mathbf{A} \). This vector is updated in place. This array should be on the GPU device.
See also
cuDenseMatrix::dot, cuDenseMatrix::transpose_dot, cuDenseMatrix::transpose_dot_plus
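
Unlike dot, this operation reads the existing contents of \( \boldsymbol{y} \) and accumulates into it. A minimal host-side sketch of the update follows; `reference_dot_plus` is a hypothetical name for illustration and is not the library's GPU implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not the library's GPU code): performs
// the in-place update y = y + alpha * A x.
template <typename DataType>
void reference_dot_plus(
        const std::vector<DataType>& A,
        const std::size_t num_rows,
        const std::size_t num_columns,
        const int A_is_row_major,
        const std::vector<DataType>& x,
        const DataType alpha,
        std::vector<DataType>& y)
{
    assert(A.size() == num_rows * num_columns);
    assert(x.size() == num_columns);
    assert(y.size() == num_rows);

    for (std::size_t row = 0; row < num_rows; ++row)
    {
        // Accumulate the row's inner product with x
        DataType sum = DataType(0);
        for (std::size_t column = 0; column < num_columns; ++column)
        {
            const std::size_t index = A_is_row_major ?
                row * num_columns + column : column * num_rows + row;
            sum += A[index] * x[column];
        }

        // y keeps its previous value and receives the scaled product
        y[row] += alpha * sum;
    }
}
```

Because the previous contents of y contribute to the result, the caller is responsible for initializing y before the call.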

Implements cuMatrix< DataType >.

Definition at line 378 of file cu_dense_matrix.cu.

{
    assert(this->copied_host_to_device);

    // Get device id
    int device_id = CudaAPI<DataType>::get_device();

    cuMatrixOperations<DataType>::dense_matvec_plus(
            this->cublas_handle[device_id],
            this->device_A[device_id],
            device_vector,
            alpha,
            this->num_rows,
            this->num_columns,
            this->A_is_row_major,
            device_product);
}
static void dense_matvec_plus(cublasHandle_t cublas_handle, const DataType *RESTRICT A, const DataType *RESTRICT b, const DataType alpha, const LongIndexType num_rows, const LongIndexType num_columns, const FlagType A_is_row_major, DataType *RESTRICT c)
Computes the operation \( \boldsymbol{c} = \boldsymbol{c} + \alpha \mathbf{A} \boldsymbol{b} \) where \( \mathbf{A} \) is a dense matrix.

References cuMatrixOperations< DataType >::dense_matvec_plus(), and CudaAPI< ArrayType >::get_device().


◆ is_identity_matrix()

template<typename DataType >
FlagType cuDenseMatrix< DataType >::is_identity_matrix ( ) const
virtual

Checks whether the matrix is identity.

The identity check is primarily performed in the cAffineMatrixFunction class.

Returns
Returns 1 if the input matrix is identity, and 0 otherwise.
See also
cAffineMatrixFunction
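
The check compares diagonal elements with 1 and off-diagonal elements with 0 within a tolerance, and for a symmetric matrix it inspects only one triangle. The following serial sketch illustrates the idea without OpenMP. The names `nearly_equal` and `reference_is_identity` are hypothetical, and the tolerance is an assumption; the tolerance actually used by cu_arithmetics::is_equal is not stated on this page.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical tolerance test. The factor 2 * epsilon is an assumption made
// for this sketch only.
template <typename DataType>
bool nearly_equal(const DataType x, const DataType y)
{
    const DataType tolerance = 2 * std::numeric_limits<DataType>::epsilon();
    return std::fabs(x - y) < tolerance;
}

// Hypothetical serial reference: returns 1 if A is the identity, 0 otherwise.
template <typename DataType>
int reference_is_identity(
        const std::vector<DataType>& A,
        const std::size_t num_rows,
        const std::size_t num_columns,
        const int A_is_row_major,
        const int A_is_symmetric)
{
    assert(A.size() == num_rows * num_columns);

    for (std::size_t row = 0; row < num_rows; ++row)
    {
        // For a symmetric matrix, columns up to the diagonal are sufficient
        std::size_t num_checking_columns = num_columns;
        if (A_is_symmetric && (row + 1 < num_columns))
        {
            num_checking_columns = row + 1;
        }

        for (std::size_t column = 0; column < num_checking_columns; ++column)
        {
            const std::size_t index = A_is_row_major ?
                row * num_columns + column : column * num_rows + row;
            const DataType expected =
                (row == column) ? DataType(1) : DataType(0);

            if (!nearly_equal(A[index], expected))
            {
                return 0;
            }
        }
    }

    return 1;
}
```

Note that with the symmetric flag set, only one triangle is read, so the result is meaningful only if the matrix really is symmetric.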

Implements cuMatrix< DataType >.

Definition at line 193 of file cu_dense_matrix.cu.

{
    FlagType matrix_is_identity = 1;
    DataType matrix_element;
    const DataType diagonal = 1.0;
    const DataType off_diagonal = 0.0;

    // Check matrix element-wise
    if (this->A_is_row_major)
    {
        // Row-major matrix
        LongIndexType column;
        LongIndexType num_checking_columns;

        #if defined(USE_OPENMP) && (USE_OPENMP == 1)
        #pragma omp parallel for \
            schedule(static) \
            if (!omp_in_parallel()) \
            default(none) \
            shared(matrix_is_identity, diagonal, off_diagonal) \
            private(column, num_checking_columns, matrix_element)
        #endif
        for (LongIndexType row=0; row < this->num_rows; ++row)
        {
            if (matrix_is_identity)
            {
                if (this->A_is_symmetric)
                {
                    // Check only half of the columns up to diagonal element
                    num_checking_columns = row + 1;
                }
                else
                {
                    num_checking_columns = this->num_columns;
                }

                for (column=0; column < num_checking_columns; ++column)
                {
                    // Get an element of the matrix
                    matrix_element = this->A[row * this->num_columns + column];

                    // Check the value of element with identity matrix
                    if (((row == column) && \
                         (!cu_arithmetics::is_equal(matrix_element,
                                                    diagonal))) || \
                        ((row != column) && \
                         (!cu_arithmetics::is_equal(matrix_element,
                                                    off_diagonal))))
                    {
                        #if defined(USE_OPENMP) && (USE_OPENMP == 1)
                        #pragma omp atomic write
                        #endif
                        matrix_is_identity = 0;

                        break;
                    }
                }
            }
        }
    }
    else
    {
        // Column-major matrix
        LongIndexType row;
        LongIndexType num_checking_rows;

        #if defined(USE_OPENMP) && (USE_OPENMP == 1)
        #pragma omp parallel for \
            schedule(static) \
            if (!omp_in_parallel()) \
            default(none) \
            shared(matrix_is_identity, diagonal, off_diagonal) \
            private(row, num_checking_rows, matrix_element)
        #endif
        for (LongIndexType column=0; column < this->num_columns; ++column)
        {
            if (matrix_is_identity)
            {
                if (this->A_is_symmetric)
                {
                    // Check only half of the rows up to diagonal element
                    num_checking_rows = column + 1;
                }
                else
                {
                    num_checking_rows = this->num_rows;
                }

                for (row=0; row < num_checking_rows; ++row)
                {
                    // Get an element of the matrix
                    matrix_element = this->A[column * this->num_rows + row];

                    // Check the value of element with identity matrix
                    if (((row == column) && \
                         (!cu_arithmetics::is_equal(matrix_element,
                                                    diagonal))) || \
                        ((row != column) && \
                         (!cu_arithmetics::is_equal(matrix_element,
                                                    off_diagonal))))
                    {
                        #if defined(USE_OPENMP) && (USE_OPENMP == 1)
                        #pragma omp atomic write
                        #endif
                        matrix_is_identity = 0;

                        break;
                    }
                }
            }
        }
    }

    return matrix_is_identity;
}
FlagType A_is_symmetric
Definition cu_matrix.h:79
bool is_equal(DataType x, DataType y)
Check if two floating point numbers are equal within a tolerance.
int LongIndexType
Definition types.h:60
int FlagType
Definition types.h:68

References cu_arithmetics::is_equal().


◆ transpose_dot()

template<typename DataType >
void cuDenseMatrix< DataType >::transpose_dot ( const DataType *  device_vector,
DataType *  device_product 
)
virtual

Transposed-matrix vector product.

Performs the matrix vector product \( \boldsymbol{y} = \mathbf{A}^{\intercal} \boldsymbol{x} \).

Parameters
[in] device_vector: A one-dimensional input vector \( \boldsymbol{x} \) with the size of the number of rows of the matrix \( \mathbf{A} \). This array should be on the GPU device.
[out] device_product: A one-dimensional output vector \( \boldsymbol{y} \) with the size of the number of columns of \( \mathbf{A} \). This vector will be overwritten. This array should be on the GPU device.
See also
cuDenseMatrix::dot, cuDenseMatrix::dot_plus, cuDenseMatrix::transpose_dot_plus
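
For the transposed product, the roles of the two dimensions swap: since \( \mathbf{A}^{\intercal} \) has as many columns as \( \mathbf{A} \) has rows, the input has one entry per row of \( \mathbf{A} \) and the output has one entry per column. The host-side sketch below makes the sizes explicit; `reference_transpose_dot` is a hypothetical name and this is not the library's GPU implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference (not the library's GPU code): computes
// y = A^T x for a dense num_rows-by-num_columns matrix A in a flat array.
template <typename DataType>
std::vector<DataType> reference_transpose_dot(
        const std::vector<DataType>& A,
        const std::size_t num_rows,
        const std::size_t num_columns,
        const int A_is_row_major,
        const std::vector<DataType>& x)
{
    assert(A.size() == num_rows * num_columns);
    assert(x.size() == num_rows);       // one entry per row of A

    // One output entry per column of A
    std::vector<DataType> y(num_columns, DataType(0));

    for (std::size_t row = 0; row < num_rows; ++row)
    {
        for (std::size_t column = 0; column < num_columns; ++column)
        {
            const std::size_t index = A_is_row_major ?
                row * num_columns + column : column * num_rows + row;

            // Element (column, row) of A^T is element (row, column) of A
            y[column] += A[index] * x[row];
        }
    }

    return y;
}
```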

Implements cuLinearOperator< DataType >.

Definition at line 423 of file cu_dense_matrix.cu.

{
    assert(this->copied_host_to_device);

    // Get device id
    int device_id = CudaAPI<DataType>::get_device();

    cuMatrixOperations<DataType>::dense_transposed_matvec(
            this->cublas_handle[device_id],
            this->device_A[device_id],
            device_vector,
            this->num_rows,
            this->num_columns,
            this->A_is_row_major,
            device_product);
}
static void dense_transposed_matvec(cublasHandle_t cublas_handle, const DataType *RESTRICT A, const DataType *RESTRICT b, const LongIndexType num_rows, const LongIndexType num_columns, const FlagType A_is_row_major, DataType *RESTRICT c)
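Computes matrix vector multiplication \( \boldsymbol{c} = \mathbf{A}^{\intercal} \boldsymbol{b} \) where \( \mathbf{A} \) is dense, and \( \mathbf{A}^{\intercal} \) is the transpose of the matrix \( \mathbf{A} \).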

References cuMatrixOperations< DataType >::dense_transposed_matvec(), and CudaAPI< ArrayType >::get_device().


◆ transpose_dot_plus()

template<typename DataType >
void cuDenseMatrix< DataType >::transpose_dot_plus ( const DataType *  device_vector,
const DataType  alpha,
DataType *  device_product 
)
virtual

Transposed-matrix vector product written in place.

Performs the matrix vector product \( \boldsymbol{y} = \boldsymbol{y} + \alpha \mathbf{A}^{\intercal} \boldsymbol{x} \).

Parameters
[in] device_vector: A one-dimensional input vector \( \boldsymbol{x} \) with the size of the number of rows of the matrix \( \mathbf{A} \). This array should be on the GPU device.
[in] alpha: A scalar.
[in,out] device_product: A one-dimensional vector \( \boldsymbol{y} \) with the size of the number of columns of \( \mathbf{A} \). This vector is updated in place. This array should be on the GPU device.
See also
cuDenseMatrix::dot, cuDenseMatrix::dot_plus, cuDenseMatrix::transpose_dot
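
BLAS-style routines such as cuBLAS gemv assume column-major storage. One common way to support a row-major flag on top of such a routine is to observe that the row-major buffer of \( \mathbf{A} \) is exactly the column-major buffer of \( \mathbf{A}^{\intercal} \). The sketch below illustrates that identity on the host. Whether the library's dense_transposed_matvec_plus works this way is an assumption, not something this page confirms, and both function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a BLAS-style gemv on a column-major m-by-n
// matrix: y = y + alpha * op(A) x, where op(A) is A or its transpose.
template <typename DataType>
void column_major_gemv_plus(
        const std::vector<DataType>& A,
        const std::size_t m,
        const std::size_t n,
        const bool transpose,
        const DataType alpha,
        const std::vector<DataType>& x,
        std::vector<DataType>& y)
{
    assert(A.size() == m * n);
    assert(x.size() == (transpose ? m : n));
    assert(y.size() == (transpose ? n : m));

    for (std::size_t j = 0; j < n; ++j)
    {
        for (std::size_t i = 0; i < m; ++i)
        {
            // Column-major: element (i, j) is stored at offset j * m + i
            if (transpose)
                y[j] += alpha * A[j * m + i] * x[i];
            else
                y[i] += alpha * A[j * m + i] * x[j];
        }
    }
}

// Hypothetical reference for y = y + alpha * A^T x.
template <typename DataType>
void reference_transpose_dot_plus(
        const std::vector<DataType>& A,
        const std::size_t num_rows,
        const std::size_t num_columns,
        const int A_is_row_major,
        const std::vector<DataType>& x,
        const DataType alpha,
        std::vector<DataType>& y)
{
    if (A_is_row_major)
    {
        // The buffer already is column-major A^T (num_columns by num_rows),
        // so the non-transposed operation is requested.
        column_major_gemv_plus(A, num_columns, num_rows, false, alpha, x, y);
    }
    else
    {
        column_major_gemv_plus(A, num_rows, num_columns, true, alpha, x, y);
    }
}
```

With this identity, no data needs to be rearranged to handle either ordering; only the dimensions and the transpose flag passed to the underlying routine change.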

Implements cuMatrix< DataType >.

Definition at line 469 of file cu_dense_matrix.cu.

{
    assert(this->copied_host_to_device);

    // Get device id
    int device_id = CudaAPI<DataType>::get_device();

    cuMatrixOperations<DataType>::dense_transposed_matvec_plus(
            this->cublas_handle[device_id],
            this->device_A[device_id],
            device_vector,
            alpha,
            this->num_rows,
            this->num_columns,
            this->A_is_row_major,
            device_product);
}
static void dense_transposed_matvec_plus(cublasHandle_t cublas_handle, const DataType *RESTRICT A, const DataType *RESTRICT b, const DataType alpha, const LongIndexType num_rows, const LongIndexType num_columns, const FlagType A_is_row_major, DataType *RESTRICT c)
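Computes \( \boldsymbol{c} = \boldsymbol{c} + \alpha \mathbf{A}^{\intercal} \boldsymbol{b} \) where \( \mathbf{A} \) is dense, and \( \mathbf{A}^{\intercal} \) is the transpose of the matrix \( \mathbf{A} \).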

References cuMatrixOperations< DataType >::dense_transposed_matvec_plus(), and CudaAPI< ArrayType >::get_device().


Member Data Documentation

◆ A

template<typename DataType >
const DataType* cuDenseMatrix< DataType >::A
protected

Definition at line 87 of file cu_dense_matrix.h.

◆ A_is_row_major

template<typename DataType >
const FlagType cuDenseMatrix< DataType >::A_is_row_major
protected

Definition at line 88 of file cu_dense_matrix.h.

◆ device_A

template<typename DataType >
DataType** cuDenseMatrix< DataType >::device_A
protected

Definition at line 86 of file cu_dense_matrix.h.


The documentation for this class was generated from the following files: