imate
C++/CUDA Reference
Loading...
Searching...
No Matches
cuOrthogonalization< DataType > Class Template Reference

A static class for orthogonalization of vector bases. This class acts as a templated namespace, where all member methods are public and static. More...

#include <cu_orthogonalization.h>

Static Public Member Functions

static void gram_schmidt_process (cublasHandle_t cublas_handle, const DataType *V, const LongIndexType vector_size, const IndexType num_vectors, const IndexType last_vector, const FlagType num_ortho, DataType *r)
 Modified Gram-Schmidt orthogonalization process to orthogonalize the vector v against a subset of the column vectors in the array V.
 
static void orthogonalize_vectors (cublasHandle_t cublas_handle, DataType *vectors, const LongIndexType vector_size, const IndexType num_vectors)
 Orthogonalizes set of vectors mutually using modified Gram-Schmidt process.
 

Detailed Description

template<typename DataType>
class cuOrthogonalization< DataType >

A static class for orthogonalization of vector bases. This class acts as a templated namespace, where all member methods are public and static.

See also
RandomVectors

Definition at line 53 of file cu_orthogonalization.h.

Member Function Documentation

◆ gram_schmidt_process()

template<typename DataType >
void cuOrthogonalization< DataType >::gram_schmidt_process ( cublasHandle_t  cublas_handle,
const DataType *  V,
const LongIndexType  vector_size,
const IndexType  num_vectors,
const IndexType  last_vector,
const FlagType  num_ortho,
DataType *  v 
)
static

Modified Gram-Schmidt orthogonalization process to orthogonalize the vector v against a subset of the column vectors in the array V.

V is 1D array of the length vector_size*num_vectors to represent a 2D array of a set of num_vectors column vectors, each of the length vector_size. The length of v is also vector_size.

v is orthogonalized against the last num_ortho columns of V starting from the column vector of the index last_vector. If the backward indexing from last_vector becomes a negative index, the index wraps around from the last column vector index, i.e., num_vectors-1 .

  • If num_ortho is zero, or if num_vectors is zero, no orthogonalization is performed.
  • If num_ortho is negative (usually set to -1), then v is orthogonalized against all column vectors of V.
  • If num_ortho is larger than num_vectors, then v is orthogonalized against all column vectors of V.
  • If num_ortho is smaller than num_vectors, then v is orthogonalized against the last num_ortho column vectors of V, starting from the column vector with the index last_vector toward its previous vectors. If the iteration runs into negativen column indices, the column indexing wraps around from the end of the columns from num_vectors-1.

The result of the newer v is written in-place in v.

If vector v is identical to one of the vectors in V, the orthogonalization against the identical vector is skipped.

If one of the column vectors of V is zero (have zero norm), that vector is ignored.

Note
It is assumed that the caller function fills the column vectors of V periodically in a wrapped around order from column index 0,1,... to num_vectors-1, and newer vectors are replaced on the wrapped index starting from index 0,1,... again. Thus, V only stores the last num_vectors column vectors. The index of the last filled vector is indicated by last_vector.
Warning
The vector v can be indeed one of the columns of V itself. However, in this case, vector v must NOT be orthogonalized against itself, rather, it should only be orthogonalized to the other vectors in V. For instance, if num_vectors=10, and v is the 3rd vector of V, and if num_ortho is 6, then we may set last_vector=2. Then v is orthogonalized againts the six columns 2,1,0,9,8,7, where the last three of them are wrapped around the end of the columns.
See also
cu_golub_kahn_bidiagonalizaton, cu_lanczos_bidiagonalization
Parameters
[in]cublas_handleThe cuBLAS object handle.
[in]V1D coalesced array of vectors representing a 2D array. The length of this 1D array is vector_size*num_vectors, which indicates a 2D array with the shape (vector_size,num_vectors).
[in]vector_sizeThe length of each vector. If we assume V indicates a 2D vector, this is the number of rows of V.
[in]num_vectorsThe number of column vectors. If we assume V indicates a 2D vector, this the number of columns of V.
[in]last_vectorThe column vectors of the array V are rewritten by the caller function in wrapped-around order. That is, once all the columns (from the zeroth to the num_vector-1 vector) are filled, the next vector is rewritten in the place of the zeroth vector, and the indices of newer vectors wrap around the columns of V. Thus, V only retains the last num_vectors vectors. The column index of the last written vector is given by last_vector. This index is a number between 0 and num_vectors-1. The index of the last i-th vector is winding back from the last vector by last_vector-i+1 mod num_vectors.
[in]num_orthoThe number of vectors to be orthogonalized starting from the last vector. 0 indicates no orthogonalization will be performed and the function just returns. A negative value means all vectors will be orthogonalized. A poisitive value will orthogonalize the given number of vectors. This value cannot be larger than the number of vectors.
[in,out]vThe vector that will be orthogonalized against the columns of V. The length of v is vector_size. This vector is modified in-place.

Definition at line 131 of file cu_orthogonalization.cu.

139{
140 // Determine how many previous vectors to orthogonalize against
141 IndexType num_steps;
142 if ((num_ortho == 0) || (num_vectors < 2))
143 {
144 // No orthogonalization is performed
145 return;
146 }
147 else if ((num_ortho < 0) ||
148 (num_ortho > static_cast<FlagType>(num_vectors)))
149 {
150 // Orthogonalize against all vectors
151 num_steps = num_vectors;
152 }
153 else
154 {
155 // Orthogonalize against only the last num_ortho vectors
156 num_steps = num_ortho;
157 }
158
159 // Vectors can be orthogonalized at most to the full basis of the vector
160 // space. Thus, num_steps cannot be larger than the dimension of vector
161 // space, which is vector_size.
162 if (num_steps > static_cast<IndexType>(vector_size))
163 {
164 num_steps = vector_size;
165 }
166
167 IndexType i;
168 DataType inner_prod;
169 DataType norm;
170 DataType norm_v;
172 DataType distance2;
173
174 // Iterate over vectors
175 for (IndexType step=0; step < num_steps; ++step)
176 {
177 // i is the index of a column vector in V to orthogonalize v against it
178 if ((last_vector % num_vectors) >= step)
179 {
180 i = (last_vector % num_vectors) - step;
181 }
182 else
183 {
184 // Wrap around negative indices from the end of column index
185 i = (last_vector % num_vectors) - step + num_vectors;
186 }
187
188 // Norm of j-th vector
190 cublas_handle, &V[vector_size*i], vector_size);
191
192 // Check norm
193 if (norm < \
195 epsilon,
197 std::sqrt(vector_size))
198 )
199 )
200 {
201 std::cerr << "WARNING: norm of the given vector is too small. " \
202 << "Cannot orthogonalize against zero vector. " \
203 << "Skipping." << std::endl;
204 continue;
205 }
206
207 // Projection
209 cublas_handle, &V[vector_size*i], v, vector_size);
210
211 // scale for subtraction
212 DataType scale = cu_arithmetics::div(
213 inner_prod,
214 cu_arithmetics::mul(norm, norm));
215 DataType scale_abs = cu_arithmetics::abs(scale);
216 DataType one = cu_arithmetics::cast<double, DataType>(1.0);
217 DataType two = cu_arithmetics::cast<double, DataType>(2.0);
218
219 // If scale is 1, it is possible that vector v and j-th vector are
220 // identical (or close).
221 if (cu_arithmetics::abs(cu_arithmetics::sub(scale_abs, one)) <= \
222 cu_arithmetics::mul(two, epsilon))
223 {
224 // Norm of the vector v
226 cublas_handle, v, vector_size);
227
228 // Compute distance between the j-th vector and vector v
229 distance2 = \
230 cu_arithmetics::add(
231 cu_arithmetics::mul(norm_v, norm_v),
232 cu_arithmetics::mul(-two, inner_prod),
233 cu_arithmetics::mul(norm, norm)
234 );
235
236 // If distance is zero, do not re-orthogonalize i-th against
237 // the j-th vector.
238 if (distance2 < \
240 two,
241 epsilon,
243 static_cast<unsigned long long int>(vector_size))
244 )
245 )
246 {
247 continue;
248 }
249 }
250
251 // Subtraction
253 cublas_handle, &V[vector_size*i], vector_size, scale, v);
254 }
255}
static void subtract_scaled_vector(cublasHandle_t cublas_handle, const DataType *RESTRICT input_vector, const LongIndexType vector_size, const DataType scale, DataType *RESTRICT output_vector)
Subtracts the scaled input vector from the output vector.
static DataType inner_product(cublasHandle_t cublas_handle, const DataType *RESTRICT vector1, const DataType *RESTRICT vector2, const LongIndexType vector_size)
Computes Euclidean inner product of two vectors.
static DataType euclidean_norm(cublasHandle_t cublas_handle, const DataType *RESTRICT vector, const LongIndexType vector_size)
Computes the Euclidean 2-norm of a 1D array.
__host__ __device__ DataType mul(const DataType x, const DataType y)
Multiply two floating point numbers in round-to-nearest-even mode.
__host__ __device__ DataType abs(const DataType x)
Absolute value of a floating point number.
__host__ __device__ DataType sub(const DataType x, const DataType y)
Subtract two floating point numbers in round-to-nearest-even mode.
__host__ __device__ DataType div(const DataType x, const DataType y)
Divide two floating point numbers in round-to-nearest-even mode.
__host__ __device__ DataType epsilon()
epsilon for various floating point precisions.
int FlagType
Definition types.h:68
int IndexType
Definition types.h:65

References cu_arithmetics::abs(), cu_arithmetics::div(), cuVectorOperations< DataType >::euclidean_norm(), cuVectorOperations< DataType >::inner_product(), cu_arithmetics::mul(), cu_arithmetics::sub(), and cuVectorOperations< DataType >::subtract_scaled_vector().

Referenced by cu_golub_kahn_bidiagonalization(), and cu_lanczos_tridiagonalization().

Here is the call graph for this function:
Here is the caller graph for this function:

◆ orthogonalize_vectors()

template<typename DataType >
void cuOrthogonalization< DataType >::orthogonalize_vectors ( cublasHandle_t  cublas_handle,
DataType *  vectors,
const LongIndexType  vector_size,
const IndexType  num_vectors 
)
static

Orthogonalizes set of vectors mutually using modified Gram-Schmidt process.

Note
Let m be the number of vectors (num_vectors), and let n be the size of each vector (vector_size). In general, n is much larger (large matrix size), and m is small, in order of a couple of hundred. But for small matrices (where n could be smaller then m), then each vector can be orthogonalized at most to n other vectors. This is because the dimension of the vector space is n. Thus, if there are extra vectors, each vector is orthogonalized to window of the previous n vector.

If one of the column vectors is identical to one of other column vectors in V, one of the vectors is regenerated by random array and the orthogonalization is repeated.

Note
If two vectors are identical (or the norm of their difference is very small), they cannot be orthogonalized against each other. In this case, one of the vectors is re-generated by new random numbers.
Warning
if num_vectors is larger than vector_size, the orthogonalization fails since not all vectors are independent, and at least one vector becomes zero.
Parameters
[in]cublas_handleThe cuBLAS object handle.
[in,out]vectors2D array of size vector_size*num_vectors. This array will be modified in-place and will be output of this function. Note that this is Fortran ordering, meaning that the first index is contiguous. Hence, to call the j-th element of the i-th vector, use &vectors[i*vector_size + j].
[in]num_vectorsNumber of columns of vectors array.
[in]vector_sizeNumber of rows of vectors array.

Definition at line 302 of file cu_orthogonalization.cu.

307{
308 // Do nothing if there is only one vector
309 if (num_vectors < 2)
310 {
311 return;
312 }
313
314 IndexType i = 0;
315 IndexType j;
316 IndexType start_j;
317 DataType inner_prod;
318 DataType norm_j;
319 DataType norm_i;
321 IndexType success = 1;
322 IndexType max_num_trials = 20;
323 IndexType num_trials = 0;
324 IndexType num_threads = 1;
325 RandomNumberGenerator random_number_generator(num_threads);
326 DataType* buffer = NULL;
327
328 while (i < num_vectors)
329 {
330 if ((success == 0) && (num_trials >= max_num_trials))
331 {
332 std::cerr << "ERROR: Cannot orthogonalize vectors after " \
333 << num_trials << " trials. Aborting." \
334 << std::endl;
335 abort();
336 }
337
338 // Reset on new trial (if it was set to 0 before to start a new trial)
339 success = 1;
340
341 // j iterates on previous vectors in a window of at most vector_size
342 if (static_cast<LongIndexType>(i) > vector_size)
343 {
344 // When vector_size is smaller than i, it is fine to cast to signed
345 start_j = i - static_cast<IndexType>(vector_size);
346 }
347 else
348 {
349 start_j = 0;
350 }
351
352 // Re-orthogonalize against previous vectors
353 for (j=start_j; j < i; ++j)
354 {
355 // Norm of the j-th vector
357 cublas_handle, &vectors[j*vector_size], vector_size);
358
359 // Check norm
360 if (norm_j < \
362 epsilon,
364 std::sqrt(vector_size))
365 )
366 )
367 {
368 std::cerr << "WARNING: norm of the given vector is too " \
369 << " small. Cannot re-orthogonalize against zero" \
370 << "vector. Skipping."
371 << std::endl;
372 continue;
373 }
374
375 // Projecting i-th vector to j-th vector
377 cublas_handle, &vectors[i*vector_size],
378 &vectors[j*vector_size], vector_size);
379
380 // Scale of subtraction
381 DataType scale = cu_arithmetics::div(
382 inner_prod,
383 cu_arithmetics::mul(norm_j, norm_j));
384
385 // Subtraction
387 cublas_handle, &vectors[vector_size*j], vector_size, scale,
388 &vectors[vector_size*i]);
389
390 // Norm of the i-th vector
392 cublas_handle, &vectors[i*vector_size], vector_size);
393
394 // If the norm is too small, regenerate the i-th vector randomly
395 if (norm_i < \
397 epsilon,
399 std::sqrt(vector_size))
400 )
401 )
402 {
403 // Allocate buffer
404 if (buffer == NULL)
405 {
406 buffer = new DataType[vector_size];
407 }
408
409 // Regenerate new random vector on buffer
411 random_number_generator, buffer,
412 vector_size, num_threads);
413
414 // Copy buffer to the i-th vector on device
416 buffer, vector_size, &vectors[i*vector_size]);
417
418 // Repeat the re-orthogonalization for i-th vector against
419 // all previous vectors again.
420 success = 0;
421 ++num_trials;
422 break;
423 }
424 }
425
426 if (success == 1)
427 {
428 ++i;
429
430 // Reset if num_trials was incremented before.
431 num_trials = 0;
432 }
433 }
434
435 // Deallocate buffer
436 if (buffer != NULL)
437 {
438 delete[] buffer;
439 buffer = NULL;
440 }
441}
static void copy_to_device(const ArrayType *host_array, const size_t array_size, ArrayType *device_array)
Copies memory on host to device memory.
Definition cuda_api.cu:145
static void generate_random_array(RandomNumberGenerator &random_number_generator, DataType *array, const LongIndexType array_size, const IndexType num_threads)
Generates a pseudo-random array with Rademacher distribution where elements are either +1 or -1.
Generates 64-bit integers on multiple parallel threads.
int LongIndexType
Definition types.h:60

References cu_arithmetics::abs(), CudaAPI< ArrayType >::copy_to_device(), cu_arithmetics::div(), cuVectorOperations< DataType >::euclidean_norm(), RandomArrayGenerator< DataType >::generate_random_array(), cuVectorOperations< DataType >::inner_product(), cu_arithmetics::mul(), and cuVectorOperations< DataType >::subtract_scaled_vector().

Here is the call graph for this function:

The documentation for this class was generated from the following files: