imate
C++/CUDA Reference
c_lanczos_tridiagonalization.h File Reference
Include dependency graph for c_lanczos_tridiagonalization.h:
This graph shows which files directly or indirectly include this file:

Go to the source code of this file.

Functions

template<typename DataType >
IndexType c_lanczos_tridiagonalization (cLinearOperator< DataType > *A, const DataType *v, const LongIndexType n, const IndexType m, const DataType lanczos_tol, const FlagType orthogonalize, DataType *alpha, DataType *beta)
 Tri-diagonalizes matrix A to T using the start vector v. is the Lanczos degree, which will be the size of square matrix T. More...
 

Function Documentation

◆ c_lanczos_tridiagonalization()

template<typename DataType >
IndexType c_lanczos_tridiagonalization ( cLinearOperator< DataType > *  A,
const DataType *  v,
const LongIndexType  n,
const IndexType  m,
const DataType  lanczos_tol,
const FlagType  orthogonalize,
DataType *  alpha,
DataType *  beta 
)

Tri-diagonalizes matrix A to T using the start vector v. is the Lanczos degree, which will be the size of square matrix T.

The output of this function is not an explicit matrix T, rather are the two arrays alpha of length m and beta of length m+1. The array alpha[:] represents the diagonal elements and beta[1:] represents the off-diagonal elements of the tri-diagonal (m,m) symmetric and positive-definite matrix T.

Lanczos tridiagonalization vs Golub-Kahn bidiagonalization

  • The Lanczos tri-diagonalization is twice faster (in runtime), as it has only one matrix-vector multiplication. Whereas the Golub-Kahn bi-diagonalization has two matrix-vector multiplications.
  • The Lanczos tri-diagonalization can only be applied to symmetric matrices. Whereas the Golub-Kahn bi-diagonalization can be applied to any matrix.

Algorithm

The algorithm and notations are obtained from [DEMMEL], p. 57, Algorithm 4.6 (see also [SAAD] p. 137, Algorithm 6.5). However there are four ways to implement the iteration. [PAIGE]_ has shown that the iteration that is implemented below is the most stable against loosing orthogonality of the eigenvectors. For details, see [CULLUM]_ p. 46, and p.48, particularly the algorithm denoted by A(2,7). The differences of these implementations are the order in which \( \alpha_j \) and \( \beta_j \) are defined and the order in which vectors are subtracted from r .

References

  • [DEMMEL] Demmel, J., Templates for solution of Algebraic Eigenvalue Problems, p. 57.
  • [SAAD] Saad, Numerical Methods for Large Eigenvalue Problems, p. 137.
  • [PAIGE] Paige (1980) Accuracy and effectiveness of the Lanczos algorithm for the symmetric eigenproblem.
  • [CULLUM] Cullum; Willoughby. Lanczos Algorithms for Large Symmetric Eigenvalue Computations. 1. pp.46-48.
Parameters
[in]AA linear operator that represents a matrix of size (n,n) and can perform matrix-vector operation with dot() method. This matrix should be positive-definite.
[in]vStart vector for the Lanczos tri-diagonalization. Column vector of size c n. It could be generated randomly. Often it is generated by the Rademacher distribution with entries c +1 and -1.
[in]nSize of the square matrix A, which is also the size of the vector v.
[in]mLanczos degree, which is the number of Lanczos iterations.
[in]lanczos_tolThe tolerance of the residual error of the Lanczos iteration.
[in]orthogonalizeIndicates whether to orthogonalize the orthogonal eigenvectors during Lanczos recursive iterations.
  • If set to 0, no orthogonalization is performed.
  • If set to a negative integer, a newly computed eigenvector is orthogonalized against all the previous eigenvectors (full reorthogonalization).
  • If set to a positive integer, say q less than lanczos_degree, the newly computed eigenvector is orthogonalized against the last q previous eigenvectors (partial reorthogonalization).
  • If set to an integer larger than lanczos_degree, it is cut to lanczos_degree, which effectively orthogonalizes against all previous eigenvectors (full reorthogonalization).
[out]alphaThis is a 1D array of size m. The array alpha[:] constitute the diagonal elements of the tri-diagonal matrix T. This is the output and written in place.
[out]betaThis is a 1D array of size m. The array beta[:] constitute the off-diagonals of the tri-diagonal matrix T. This array is the output and written in place.
Returns
Counter for the Lanczos iterations. Normally, the size of the output matrix should be (m,m), which is the Lanczos degree. However, if the algorithm terminates early, the size of alpha and beta, and hence the output tri-diagonal matrix, is smaller. This counter keeps track of the non-zero size of alpha and beta.

Definition at line 117 of file c_lanczos_tridiagonalization.cpp.

126 {
127  // buffer_size is number of last orthogonal vectors to keep in the buffer V
128  IndexType buffer_size;
129  if (orthogonalize == 0 || orthogonalize == 1)
130  {
131  // At least two vectors must be stored in buffer for Lanczos recursion
132  buffer_size = 2;
133  }
134  else if ((orthogonalize < 0) ||
135  (orthogonalize > static_cast<FlagType>(m)))
136  {
137  // Using full reorthogonalization, keep all of the m vectors in buffer
138  buffer_size = m;
139  }
140  else
141  {
142  // Orthogonalize with less than m vectors (0 < orthogonalize < m)
143  buffer_size = orthogonalize;
144  }
145 
146  // Allocate 2D array (as 1D array, and coalesced row-wise) to store
147  // the last buffer_size of orthogonalized vectors of length n. New vectors
148  // are stored by cycling through the buffer to replace with old ones.
149  DataType* V = new DataType[n * buffer_size];
150 
151  // Allocate vector r
152  DataType* r = new DataType[n];
153 
154  // Copy v into r
156 
157  // Initial beta
158  DataType initial_beta = cVectorOperations<DataType>::euclidean_norm(r, n);
159 
160  // Declare iterators
161  IndexType j;
162  IndexType lanczos_size = 0;
163  IndexType num_ortho;
164 
165  // In the following, beta[j] means beta[j-1] in the Demmel text
166  for (j=0; j < m; ++j)
167  {
168  // Update the size of Lanczos tridiagonal matrix
169  ++lanczos_size;
170 
171  // Normalize r and copy to the j-th column of V
172  if (j == 0)
173  {
175  r, n, 1.0/initial_beta, &V[(j % buffer_size)*n]);
176  }
177  else
178  {
180  r, n, 1.0/beta[j-1], &V[(j % buffer_size)*n]);
181  }
182 
183  // Multiply A to the j-th column of V, write into r
184  A->dot(&V[(j % buffer_size)*n], r);
185 
186  // alpha[j] is V[:, j] dot r
188  &V[(j % buffer_size)*n], r, n);
189 
190  // Subtract V[:,j] * alpha[j] from r
192  &V[(j % buffer_size)*n], n, alpha[j], r);
193 
194  // Subtract V[:,j-1] * beta[j] from r
195  if (j > 0)
196  {
198  &V[((j-1) % buffer_size)*n], n, beta[j-1], r);
199  }
200 
201  // Gram-Schmidt process (full re-orthogonalization)
202  if (orthogonalize != 0)
203  {
204  // Find how many column vectors are filled so far in the buffer V
205  if (j < buffer_size)
206  {
207  num_ortho = j+1;
208  }
209  else
210  {
211  num_ortho = buffer_size;
212  }
213 
214  // Gram-Schmidt process
216  &V[0], n, buffer_size, j%buffer_size, num_ortho, r);
217  }
218 
219  // beta is norm of r
221 
222  // Exit criterion when the vector r is zero. If each component of a
223  // zero vector has the tolerance epsilon, (which is called lanczos_tol
224  // here), the tolerance of norm of r is epsilon times sqrt of n.
225  if (beta[j] < lanczos_tol * sqrt(n))
226  {
227  break;
228  }
229  }
230 
231  // Free dynamic memory
232  delete[] V;
233  delete[] r;
234 
235  return lanczos_size;
236 }
virtual void dot(const DataType *vector, DataType *product)=0
static void gram_schmidt_process(const DataType *V, const LongIndexType vector_size, const IndexType num_vectors, const IndexType last_vector, const FlagType num_ortho, DataType *r)
Modified Gram-Schmidt orthogonalization process to orthogonalize the vector v against a subset of the...
static void copy_vector(const DataType *input_vector, const LongIndexType vector_size, DataType *output_vector)
Copies a vector to a new vector. Result is written in-place.
static void copy_scaled_vector(const DataType *input_vector, const LongIndexType vector_size, const DataType scale, DataType *output_vector)
Scales a vector and stores to a new vector.
static void subtract_scaled_vector(const DataType *input_vector, const LongIndexType vector_size, const DataType scale, DataType *output_vector)
Subtracts the scaled input vector from the output vector.
static DataType inner_product(const DataType *vector1, const DataType *vector2, const LongIndexType vector_size)
Computes Euclidean inner product of two vectors.
static DataType euclidean_norm(const DataType *vector, const LongIndexType vector_size)
Computes the Euclidean norm of a 1D array.
int FlagType
Definition: types.h:68
int IndexType
Definition: types.h:65

References cVectorOperations< DataType >::copy_scaled_vector(), cVectorOperations< DataType >::copy_vector(), cLinearOperator< DataType >::dot(), cVectorOperations< DataType >::euclidean_norm(), cOrthogonalization< DataType >::gram_schmidt_process(), cVectorOperations< DataType >::inner_product(), and cVectorOperations< DataType >::subtract_scaled_vector().

Referenced by cTraceEstimator< DataType >::_c_stochastic_lanczos_quadrature().

Here is the call graph for this function:
Here is the caller graph for this function: