imate
C++/CUDA Reference
cVectorOperations< DataType > Class Template Reference

A static class for vector operations, similar to level-1 operations of the BLAS library. This class acts as a templated namespace, where all member methods are public and static.

#include <c_vector_operations.h>

Static Public Member Functions

static void copy_vector (const DataType *input_vector, const LongIndexType vector_size, DataType *output_vector)
    Copies a vector to a new vector. Result is written in-place.

static void copy_scaled_vector (const DataType *input_vector, const LongIndexType vector_size, const DataType scale, DataType *output_vector)
    Scales a vector and stores to a new vector.

static void subtract_scaled_vector (const DataType *input_vector, const LongIndexType vector_size, const DataType scale, DataType *output_vector)
    Subtracts the scaled input vector from the output vector.

static DataType inner_product (const DataType *vector1, const DataType *vector2, const LongIndexType vector_size)
    Computes Euclidean inner product of two vectors.

static DataType euclidean_norm (const DataType *vector, const LongIndexType vector_size)
    Computes the Euclidean norm of a 1D array.

static DataType normalize_vector_in_place (DataType *vector, const LongIndexType vector_size)
    Normalizes a vector based on Euclidean 2-norm. The result is written in-place.

static DataType normalize_vector_and_copy (const DataType *vector, const LongIndexType vector_size, DataType *output_vector)
    Normalizes a vector based on Euclidean 2-norm. The result is written into another vector.

Detailed Description

template<typename DataType>
class cVectorOperations< DataType >

A static class for vector operations, similar to level-1 operations of the BLAS library. This class acts as a templated namespace, where all member methods are public and static.

See also
MatrixOperations

Definition at line 35 of file c_vector_operations.h.
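
The "templated namespace" pattern described above can be illustrated with a minimal sketch. The class name VectorOpsSketch is hypothetical and is not part of imate; LongIndexType is assumed here to be a plain int. Only two members are mimicked, and only with plain loops (no BLAS branch).

```cpp
// Assumption: LongIndexType is an int in this sketch.
typedef int LongIndexType;

// Hypothetical stand-in for a static class used as a templated namespace.
// All members are public and static, so no object is ever constructed.
template <typename DataType>
class VectorOpsSketch
{
    public:

        // Copies input_vector into output_vector, element by element.
        static void copy_vector(
                const DataType* input_vector,
                const LongIndexType vector_size,
                DataType* output_vector)
        {
            for (LongIndexType i=0; i < vector_size; ++i)
            {
                output_vector[i] = input_vector[i];
            }
        }

        // Plain (unchunked) inner product of two vectors.
        static DataType inner_product(
                const DataType* vector1,
                const DataType* vector2,
                const LongIndexType vector_size)
        {
            DataType sum = 0;
            for (LongIndexType i=0; i < vector_size; ++i)
            {
                sum += vector1[i] * vector2[i];
            }
            return sum;
        }
};
```

A call site selects the data type through the template argument and qualifies the method with the class name, for example VectorOpsSketch<float>::copy_vector(a, 3, b).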

Member Function Documentation

◆ copy_scaled_vector()

template<typename DataType >
void cVectorOperations< DataType >::copy_scaled_vector ( const DataType *  input_vector,
const LongIndexType  vector_size,
const DataType  scale,
DataType *  output_vector 
)
static

Scales a vector and stores to a new vector.

Parameters
[in] input_vector: A 1D array
[in] vector_size: Length of vector array
[in] scale: Scale coefficient to the input vector. If this is equal to one, the function effectively becomes the same as copy_vector.
[out] output_vector: Output vector (written in place).

Definition at line 81 of file c_vector_operations.cpp.

{
    #if (USE_CBLAS == 1)

        // Using OpenBlas
        int incx = 1;
        int incy = 1;

        cblas_interface::xcopy(vector_size, input_vector, incx, output_vector,
                               incy);

        cblas_interface::xscal(vector_size, scale, output_vector, incy);

    #else

        // Not using OpenBlas
        for (LongIndexType i=0; i < vector_size; ++i)
        {
            output_vector[i] = scale * input_vector[i];
        }

    #endif
}
int LongIndexType
Definition: types.h:60

Referenced by c_lanczos_tridiagonalization(), and cVectorOperations< DataType >::normalize_vector_and_copy().

◆ copy_vector()

template<typename DataType >
void cVectorOperations< DataType >::copy_vector ( const DataType *  input_vector,
const LongIndexType  vector_size,
DataType *  output_vector 
)
static

Copies a vector to a new vector. Result is written in-place.

Parameters
[in] input_vector: A 1D array
[in] vector_size: Length of vector array
[out] output_vector: Output vector (written in place).

Definition at line 39 of file c_vector_operations.cpp.

{
    #if (USE_CBLAS == 1)

        // Using OpenBlas
        int incx = 1;
        int incy = 1;

        cblas_interface::xcopy(vector_size, input_vector, incx, output_vector,
                               incy);

    #else

        // Not using OpenBlas
        for (LongIndexType i=0; i < vector_size; ++i)
        {
            output_vector[i] = input_vector[i];
        }

    #endif
}

Referenced by c_lanczos_tridiagonalization().

◆ euclidean_norm()

template<typename DataType >
DataType cVectorOperations< DataType >::euclidean_norm ( const DataType *  vector,
const LongIndexType  vector_size 
)
static

Computes the Euclidean norm of a 1D array.

The reduction variable (here, norm2) is of type long double. This is because when DataType is float, the summation loses precision, especially when the vector size is large. Using long double appears to be slightly faster than using double. The benefit of a wider type for the reduction variable is noticeable only when the code is compiled with the -O2 or -O3 optimization flags.

Using a wider type for the reduction variable is very important for this function. If DataType is float and no such care is taken, the estimated trace can be completely wrong, simply because the norm is computed inaccurately. For large arrays, even libraries such as OpenBLAS do not compute the dot product accurately.

Computing the sum in chunks (chunk=5 in the code) improves performance, yielding roughly a twofold speedup. This result does not depend strongly on the chunk size. For example, chunk=10 yields a similar result.
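
The precision argument above can be checked with a small, self-contained sketch. The helper norm_with_accumulator is hypothetical (it is not part of imate) and takes the accumulator type as a template parameter so that a float accumulator and a long double accumulator can be compared on the same float data.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of imate: sums the squares of a float vector
// in the accumulator type Acc, then returns the Euclidean norm as a float.
template <typename Acc>
float norm_with_accumulator(const std::vector<float>& v)
{
    Acc sum = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
    {
        // Operands are converted to Acc before multiplying, so both the
        // product and the running sum are carried in the accumulator type.
        sum += static_cast<Acc>(v[i]) * static_cast<Acc>(v[i]);
    }

    return static_cast<float>(std::sqrt(static_cast<double>(sum)));
}
```

As an illustration, take a vector whose first entry is 4096 followed by 1024 ones. Assuming IEEE-754 single precision without extended intermediate precision, the float accumulator reaches 2^24 after the first term and every later addition of 1 is rounded away, so the float-accumulated norm stays at 4096, whereas the long double accumulator retains all 1024 contributions and gives a norm of about 4096.125.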

Parameters
[in] vector: A pointer to 1D array
[in] vector_size: Length of the array
Returns
Euclidean norm

Definition at line 281 of file c_vector_operations.cpp.

{
    #if (USE_CBLAS == 1)

        // Using OpenBlas
        int incx = 1;

        DataType norm = cblas_interface::xnrm2(vector_size, vector, incx);

        return norm;

    #else

        // Compute norm squared
        long double norm2 = 0.0;
        LongIndexType chunk = 5;
        LongIndexType vector_size_chunked = vector_size - (vector_size % chunk);

        for (LongIndexType i=0; i < vector_size_chunked; i += chunk)
        {
            norm2 += vector[i] * vector[i] +
                     vector[i+1] * vector[i+1] +
                     vector[i+2] * vector[i+2] +
                     vector[i+3] * vector[i+3] +
                     vector[i+4] * vector[i+4];
        }

        for (LongIndexType i=vector_size_chunked; i < vector_size; ++i)
        {
            norm2 += vector[i] * vector[i];
        }

        // Norm
        DataType norm = sqrt(static_cast<DataType>(norm2));

        return norm;

    #endif
}

Referenced by c_lanczos_tridiagonalization(), cOrthogonalization< DataType >::gram_schmidt_process(), cVectorOperations< DataType >::normalize_vector_and_copy(), cVectorOperations< DataType >::normalize_vector_in_place(), and cOrthogonalization< DataType >::orthogonalize_vectors().

◆ inner_product()

template<typename DataType >
DataType cVectorOperations< DataType >::inner_product ( const DataType *  vector1,
const DataType *  vector2,
const LongIndexType  vector_size 
)
static

Computes Euclidean inner product of two vectors.

The reduction variable (here, inner_prod) is of type long double. This is because when DataType is float, the summation loses precision, especially when the vector size is large. Using long double appears to be slightly faster than using double. The benefit of a wider type for the reduction variable is noticeable only when the code is compiled with the -O2 or -O3 optimization flags.

Using a wider type for the reduction variable is very important for this function. If DataType is float and no such care is taken, the estimated trace can be completely wrong, simply because the inner product is computed inaccurately. For large arrays, even libraries such as OpenBLAS do not compute the dot product accurately.

Computing the dot product in chunks (chunk=5 in the code) improves performance, yielding roughly a twofold speedup. This result does not depend strongly on the chunk size. For example, chunk=10 yields a similar result.
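
The chunked reduction described above can be sketched as a standalone function. The name chunked_inner_product is hypothetical, and std::size_t is used in place of LongIndexType. One deliberate difference from the listing for this function: here each operand is widened to long double before multiplying, so the five products of a chunk are also summed in the wider type. This is an assumption of the sketch, not a statement about the library.

```cpp
#include <cstddef>

// Hypothetical free function, not part of imate: inner product with a
// long double accumulator, processed in chunks of five elements.
template <typename DataType>
DataType chunked_inner_product(
        const DataType* a,
        const DataType* b,
        std::size_t n)
{
    const std::size_t chunk = 5;

    // Largest multiple of chunk that does not exceed n.
    const std::size_t n_chunked = n - (n % chunk);

    long double sum = 0.0L;

    // Main loop: five multiply-adds per iteration.
    for (std::size_t i = 0; i < n_chunked; i += chunk)
    {
        sum += static_cast<long double>(a[i])   * b[i] +
               static_cast<long double>(a[i+1]) * b[i+1] +
               static_cast<long double>(a[i+2]) * b[i+2] +
               static_cast<long double>(a[i+3]) * b[i+3] +
               static_cast<long double>(a[i+4]) * b[i+4];
    }

    // Remainder loop: the last n % chunk elements.
    for (std::size_t i = n_chunked; i < n; ++i)
    {
        sum += static_cast<long double>(a[i]) * b[i];
    }

    return static_cast<DataType>(sum);
}
```

The remainder loop is what keeps the function correct for sizes that are not a multiple of the chunk, for instance a length of 7, where the main loop handles indices 0 to 4 and the remainder loop handles indices 5 and 6.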

Parameters
[in] vector1: 1D array
[in] vector2: 1D array
[in] vector_size: Length of array
Returns
Inner product of two vectors.

Definition at line 204 of file c_vector_operations.cpp.

{
    #if (USE_CBLAS == 1)

        // Using OpenBlas
        int incx = 1;
        int incy = 1;

        DataType inner_prod = cblas_interface::xdot(vector_size, vector1, incx,
                                                    vector2, incy);

        return inner_prod;

    #else

        // Not using OpenBlas
        long double inner_prod = 0.0;
        LongIndexType chunk = 5;
        LongIndexType vector_size_chunked = vector_size - (vector_size % chunk);

        for (LongIndexType i=0; i < vector_size_chunked; i += chunk)
        {
            inner_prod += vector1[i] * vector2[i] +
                          vector1[i+1] * vector2[i+1] +
                          vector1[i+2] * vector2[i+2] +
                          vector1[i+3] * vector2[i+3] +
                          vector1[i+4] * vector2[i+4];
        }

        for (LongIndexType i=vector_size_chunked; i < vector_size; ++i)
        {
            inner_prod += vector1[i] * vector2[i];
        }

        return static_cast<DataType>(inner_prod);

    #endif
}

Referenced by c_lanczos_tridiagonalization(), cOrthogonalization< DataType >::gram_schmidt_process(), and cOrthogonalization< DataType >::orthogonalize_vectors().

◆ normalize_vector_and_copy()

template<typename DataType >
DataType cVectorOperations< DataType >::normalize_vector_and_copy ( const DataType *  vector,
const LongIndexType  vector_size,
DataType *  output_vector 
)
static

Normalizes a vector based on Euclidean 2-norm. The result is written into another vector.

Parameters
[in] vector: Input vector.
[in] vector_size: Length of the input vector
[out] output_vector: Output vector, which is the normalization of the input vector.
Returns
2-norm of the input vector
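
The contract above (normalized copy in the output, original norm as the return value, input left untouched) can be sketched as follows. The function name normalize_and_copy is hypothetical and std::size_t stands in for LongIndexType. The sketch does not guard against a zero norm, so a zero input vector leads to a division by zero; handling that case is left to the caller here.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical free function, not part of imate: writes v / ||v|| into out
// and returns ||v||. The input array v is not modified.
template <typename T>
T normalize_and_copy(const T* v, std::size_t n, T* out)
{
    // Sum of squares, accumulated in long double.
    long double norm2 = 0.0L;
    for (std::size_t i = 0; i < n; ++i)
    {
        norm2 += static_cast<long double>(v[i]) * v[i];
    }

    const T norm = static_cast<T>(std::sqrt(norm2));

    // No check for norm == 0 is made in this sketch.
    for (std::size_t i = 0; i < n; ++i)
    {
        out[i] = v[i] / norm;
    }

    return norm;
}
```

For the input (3, 4), the returned norm is 5 and the output holds (0.6, 0.8) up to rounding.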

Definition at line 389 of file c_vector_operations.cpp.

{
    #if (USE_CBLAS == 1)

        // Norm of vector
        DataType norm = cVectorOperations<DataType>::euclidean_norm(
                vector, vector_size);

        // Normalize to output
        DataType scale = 1.0 / norm;
        cVectorOperations<DataType>::copy_scaled_vector(vector, vector_size, scale,
                                                        output_vector);

        return norm;

    #else

        // Norm of vector
        DataType norm = cVectorOperations<DataType>::euclidean_norm(vector,
                                                                    vector_size);

        // Normalize to output
        for (LongIndexType i=0; i < vector_size; ++i)
        {
            output_vector[i] = vector[i] / norm;
        }

        return norm;

    #endif
}

References cVectorOperations< DataType >::copy_scaled_vector(), and cVectorOperations< DataType >::euclidean_norm().

Referenced by c_golub_kahn_bidiagonalization().

◆ normalize_vector_in_place()

template<typename DataType >
DataType cVectorOperations< DataType >::normalize_vector_in_place ( DataType *  vector,
const LongIndexType  vector_size 
)
static

Normalizes a vector based on Euclidean 2-norm. The result is written in-place.

Parameters
[in,out] vector: Input vector to be normalized in-place.
[in] vector_size: Length of the input vector
Returns
2-Norm of the input vector (before normalization)

Definition at line 338 of file c_vector_operations.cpp.

{
    #if (USE_CBLAS == 1)

        // Norm of vector
        DataType norm = cVectorOperations<DataType>::euclidean_norm(
                vector, vector_size);

        // Normalize in place
        DataType scale = 1.0 / norm;
        int incx = 1;
        cblas_interface::xscal(vector_size, scale, vector, incx);

        return norm;

    #else

        // Norm of vector
        DataType norm = cVectorOperations<DataType>::euclidean_norm(vector,
                                                                    vector_size);

        // Normalize in place
        for (LongIndexType i=0; i < vector_size; ++i)
        {
            vector[i] /= norm;
        }

        return norm;

    #endif
}

References cVectorOperations< DataType >::euclidean_norm().

Referenced by c_golub_kahn_bidiagonalization().

◆ subtract_scaled_vector()

template<typename DataType >
void cVectorOperations< DataType >::subtract_scaled_vector ( const DataType *  input_vector,
const LongIndexType  vector_size,
const DataType  scale,
DataType *  output_vector 
)
static

Subtracts the scaled input vector from the output vector.

Performs the following operation:

\[ \boldsymbol{b} = \boldsymbol{b} - c \boldsymbol{a}, \]

where

  • \( \boldsymbol{a} \) is the input vector,
  • \( c \) is a scalar scale to the input vector, and
  • \( \boldsymbol{b} \) is the output vector that is written in-place.

Parameters
[in] input_vector: A 1D array
[in] vector_size: Length of vector array
[in] scale: Scale coefficient to the input vector.
[in,out] output_vector: Output vector (written in place).
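
The update b = b - c a can be illustrated with a short standalone sketch. Both function names below are hypothetical and std::size_t stands in for LongIndexType. The second function shows one common use of such an update, removing from b its component along a (a single Gram-Schmidt-style projection step); it is only an illustration and does not reproduce how the library's own orthogonalization routines call this method.

```cpp
#include <cstddef>

// Hypothetical free function, not part of imate: b <- b - c * a.
template <typename T>
void subtract_scaled(const T* a, std::size_t n, T c, T* b)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        b[i] -= c * a[i];
    }
}

// Hypothetical example use: removes from b its component along a, leaving b
// orthogonal to a. Assumes a is not the zero vector (no check is made).
template <typename T>
void remove_component(const T* a, std::size_t n, T* b)
{
    T dot_ab = 0;
    T dot_aa = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        dot_ab += a[i] * b[i];
        dot_aa += a[i] * a[i];
    }

    // Projection coefficient c = (a . b) / (a . a).
    subtract_scaled(a, n, dot_ab / dot_aa, b);
}
```

With a = (1, 2, 3), b = (10, 10, 10), and c = 2, subtract_scaled leaves b = (8, 6, 4). With a = (1, 1) and b = (3, 1), remove_component uses c = 4 / 2 = 2 and leaves b = (1, -1), which is orthogonal to a.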

Definition at line 135 of file c_vector_operations.cpp.

{

    #if (USE_CBLAS == 1)

        // Using OpenBlas
        int incx = 1;
        int incy = 1;

        DataType neg_scale = -scale;
        cblas_interface::xaxpy(vector_size, neg_scale, input_vector, incx,
                               output_vector, incy);

    #else

        // Not using OpenBlas
        if (scale == 0.0)
        {
            return;
        }

        for (LongIndexType i=0; i < vector_size; ++i)
        {
            output_vector[i] -= scale * input_vector[i];
        }

    #endif
}

Referenced by cAffineMatrixFunction< DataType >::_add_scaled_vector(), c_golub_kahn_bidiagonalization(), c_lanczos_tridiagonalization(), cOrthogonalization< DataType >::gram_schmidt_process(), and cOrthogonalization< DataType >::orthogonalize_vectors().

The documentation for this class was generated from the following files: