glearn.Covariance
- class glearn.Covariance(x, sigma=None, sigma0=None, scale=None, kernel=None, kernel_threshold=None, sparse=False, density=0.001, imate_options={'method': 'cholesky'}, interpolate=False, tol=1e-08, verbose=False)
Create mixed covariance model.
This class creates a mixed-covariance model that is defined by a set of known or unknown hyperparameters and kernel functions. The covariance object can:
Compute the auto-covariance or cross-covariance between a set of training and test points, or the derivative of the covariance with respect to the hyperparameters.
Compute basic matrix functions of the covariance matrix, such as the log-determinant or the trace of functions of the matrix.
Solve a linear system or perform matrix-matrix or matrix-vector multiplication involving the covariance matrix or its derivatives with respect to the hyperparameters.
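As a rough illustration of the quantities listed above (this is not how glearn computes them internally), the following sketch evaluates the log-determinant, traces, a linear solve, and a matrix-vector product with plain NumPy. The small matrix `Sigma` and the vector `y` are made-up values standing in for a covariance matrix and data.

```python
import numpy as np

# A small symmetric positive-definite matrix standing in for the covariance
# matrix (values are made up for illustration).
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.2],
                  [0.5, 0.2, 2.0]])
y = np.array([1.0, 2.0, 3.0])

# Log-determinant; slogdet returns the sign and the log of the absolute value
# of the determinant, which avoids overflow in the determinant itself.
sign, logdet = np.linalg.slogdet(Sigma)

# Trace of the matrix and trace of its inverse.
trace = np.trace(Sigma)
traceinv = np.trace(np.linalg.inv(Sigma))

# Solve Sigma @ z = y, and multiply Sigma by a vector.
z = np.linalg.solve(Sigma, y)
w = Sigma @ y
```

For large or sparse matrices, forming the inverse explicitly as done here becomes impractical, which is presumably why the class delegates these computations to specialized routines.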
- Parameters:
- x : numpy.ndarray
A 2D array of data points where each row of the array is the coordinate of a point \(\boldsymbol{x} = (x_1, \dots, x_d)\). The array size is \(n \times d\) where \(n\) is the number of the points and \(d\) is the dimension of the space of points.
- sigma : float, default=None
The hyperparameter \(\sigma\) of the covariance model where \(\sigma^2\) represents the variance of the correlated errors of the model. \(\sigma\) should be positive. If None is given, an optimal value for \(\sigma\) is found during the training process.
- sigma0 : float, default=None
The hyperparameter \(\varsigma\) of the covariance model where \(\varsigma^2\) represents the variance of the input noise to the model. \(\varsigma\) should be positive. If None is given, an optimal value for \(\varsigma\) is found during the training process.
- scale : float or array_like[float], default=None
The scale hyperparameters \(\boldsymbol{\alpha} = (\alpha_1, \dots, \alpha_d)\) scale the distance between data points in \(\mathbb{R}^d\). If an array of size \(d\) is given, each \(\alpha_i\) scales the distance in the \(i\)-th dimension. If a scalar value \(\alpha\) is given, all dimensions are scaled isotropically. If set to None, optimal values of the scale hyperparameters are found during the training process by automatic relevance determination (ARD).
- kernel : glearn.kernels.Kernel, default=glearn.kernels.Matern
The correlation kernel \(k\) that generates the correlation matrix \(\mathbf{K}\). This argument should be an instance of one of the derived classes of glearn.kernels.Kernel. If None, the Matern kernel glearn.kernels.Matern is used.
- kernel_threshold : float, default=None
The threshold \(\tau\) to taper the kernel function. Namely, the kernel values \(k < \tau\) are set to zero. This decorrelates data points that are far apart from each other, yielding a sparse correlation matrix of the data points. This option is relevant if sparse is set to True.
- sparse : bool, default=False
If True, it sparsifies the correlation matrix \(\mathbf{K}\), and hence the covariance matrix \(\boldsymbol{\Sigma}\), using kernel tapering (see kernel_threshold and density).
- density : float, default=1e-3
Sets an approximate density of sparse matrices. This argument is another way (along with kernel_threshold) to specify the sparsity of the covariance matrix. The matrix density is the ratio of the number of nonzero elements to the total number of elements of the matrix. This option is relevant if sparse is set to True.
Note
This option only sets an approximate density of the covariance matrix. The actual matrix density may be slightly different from the specified value.
- imate_options : dict, default={'method': 'cholesky'}
The internal computations of the functions glearn.Covariance.logdet(), glearn.Covariance.trace(), and glearn.Covariance.traceinv() are performed by the imate package. This argument passes a dictionary of options to the corresponding functions of the imate package. See the API reference of the imate package for details.
- interpolate : bool, default=False
If True, the matrix functions glearn.Covariance.logdet(), glearn.Covariance.trace(), and glearn.Covariance.traceinv() for the mixed covariance function are interpolated with respect to the hyperparameters \(\sigma\) and \(\varsigma\). See [1] for details. This approach can yield a significant speedup during the training process, at the cost of some accuracy.
- tol : float, default=1e-8
The error tolerance for solving linear systems with the conjugate gradient method used in the glearn.Covariance.solve() function.
- verbose : bool, default=False
If True, verbose output is printed during the computation.
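To make the tapering options in the parameter list concrete, here is a minimal NumPy sketch of thresholding a correlation matrix and measuring the resulting density. The helper names `taper_correlation` and `matrix_density` are hypothetical, and the kernel \(k(r) = e^{-r}\) is an assumption chosen for the example; neither is taken from glearn.

```python
import numpy as np

def taper_correlation(K, tau):
    """Return a copy of K with entries smaller than tau set to zero."""
    K_tapered = K.copy()
    K_tapered[K_tapered < tau] = 0.0
    return K_tapered

def matrix_density(A):
    """Fraction of nonzero entries of A."""
    return np.count_nonzero(A) / A.size

# Four points on a line; pairwise distances r_ij = |x_i - x_j|.
x = np.array([0.0, 1.0, 2.0, 5.0])
r = np.abs(x[:, None] - x[None, :])

# Assumed kernel for illustration only: k(r) = exp(-r).
K = np.exp(-r)

# Kernel values below tau = 0.2 are set to zero.
K_sparse = taper_correlation(K, tau=0.2)
density = matrix_density(K_sparse)
```

With this threshold, only points one unit apart remain correlated, so 8 of the 16 entries are nonzero and the density is 0.5. Specifying a target density instead of a threshold amounts to choosing \(\tau\) so that roughly that fraction of entries survives.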
Notes
Regression Model:
A regression model to fit the data \(y = f(\boldsymbol{x})\) for the points \(\boldsymbol{x} \in \mathcal{D} \subset \mathbb{R}^d\) and data \(y \in \mathbb{R}\) is
\[f(\boldsymbol{x}) = \mu(\boldsymbol{x}) + \delta(\boldsymbol{x}) + \epsilon,\]
where
\(\mu\) is a deterministic mean function.
\(\delta\) is a zero-mean stochastic function representing the misfit of the regression model.
\(\epsilon\) is a zero-mean stochastic function representing the input noise.
Covariance of Regression:
The covariance of the stochastic function \(\delta\) on discrete data points \(\{ \boldsymbol{x}_i \}_{i=1}^n\) is the \(n \times n\) covariance matrix
\[\sigma^2 \mathbf{K} = \mathbb{E}[\delta(\boldsymbol{x}_i), \delta(\boldsymbol{x}_j)],\]
where \(\sigma^2\) is the variance and \(\mathbf{K}\) is the correlation matrix.
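The construction of \(\sigma^2 \mathbf{K}\) on a set of points can be sketched in NumPy as follows. This is an illustration under two stated assumptions, not glearn's implementation: the kernel is taken to be a squared exponential \(k(r) = e^{-r^2/2}\), and the scale \(\boldsymbol{\alpha}\) is applied by dividing each coordinate by its \(\alpha_i\) before computing distances. The function name `correlation_matrix` is hypothetical.

```python
import numpy as np

def correlation_matrix(x, alpha):
    """Correlation matrix K for points x (n x d) with scales alpha (d,)."""
    # Assumption: each coordinate is divided by its scale alpha_i before
    # the Euclidean distance is taken.
    xs = x / np.asarray(alpha, dtype=float)
    diff = xs[:, None, :] - xs[None, :, :]
    r = np.sqrt(np.sum(diff**2, axis=-1))
    # Assumed kernel: squared exponential, k(r) = exp(-r^2 / 2).
    return np.exp(-0.5 * r**2)

rng = np.random.default_rng(0)
x = rng.random((4, 2))          # n = 4 points in d = 2 dimensions
sigma = 2.0

K = correlation_matrix(x, alpha=[1.0, 2.0])
cov_delta = sigma**2 * K        # covariance of delta on the points
```

Because \(k(0) = 1\), the matrix `K` has a unit diagonal, so the diagonal of `cov_delta` equals \(\sigma^2\).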
Similarly, the covariance of the stochastic function \(\epsilon\) is the \(n \times n\) covariance matrix
\[\varsigma^2 \mathbf{I} = \mathbb{E}[\epsilon, \epsilon],\]
where \(\varsigma^2\) is the variance of noise and \(\mathbf{I}\) is the identity matrix.
The overall mixed-covariance model for the linear model \(f\) is
\[\boldsymbol{\Sigma}(\sigma^2, \varsigma^2, \boldsymbol{\alpha}) = \sigma^2 \mathbf{K} + \varsigma^2 \mathbf{I}.\]
References
[1] Ameli, S., and Shadden, S. C. (2022). Interpolating Log-Determinant and Trace of the Powers of Matrix \(\mathbf{A} + t\mathbf{B}\). arXiv:2009.07385 [math.NA].
Examples
Create Covariance Object:
Create a covariance matrix based on a set of sample data with four points in \(d=2\) dimensional space.
>>> # Generate a set of points
>>> from glearn.sample_data import generate_points
>>> x = generate_points(num_points=4, dimension=2)

>>> # Create a covariance object
>>> from glearn import Covariance
>>> cov = Covariance(x)
By providing a set of hyperparameters, the covariance matrix can be fully defined. Here we set \(\sigma=2\), \(\varsigma=3\), and \(\boldsymbol{\alpha}= (1, 2)\).
>>> # Get the covariance matrix for given hyperparameters
>>> cov.set_sigmas(2.0, 3.0)
>>> cov.set_scale([1.0, 2.0])
>>> cov.get_matrix()
array([[13.        ,  3.61643745,  3.51285267,  3.47045163],
       [ 3.61643745, 13.        ,  3.32078482,  3.14804532],
       [ 3.51285267,  3.32078482, 13.        ,  3.53448631],
       [ 3.47045163,  3.14804532,  3.53448631, 13.        ]])
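The diagonal value 13 in the output above follows directly from the mixed covariance model: a correlation matrix has unit diagonal, so \(\Sigma_{ii} = \sigma^2 + \varsigma^2 = 2^2 + 3^2 = 13\). The short NumPy check below reproduces this with a made-up correlation matrix (not the one glearn computes from the sample points).

```python
import numpy as np

sigma, sigma0 = 2.0, 3.0

# Made-up 3 x 3 correlation matrix (symmetric, unit diagonal).
K = np.array([[1.0, 0.4, 0.3],
              [0.4, 1.0, 0.2],
              [0.3, 0.2, 1.0]])

# Mixed covariance: Sigma = sigma^2 K + sigma0^2 I.
Sigma = sigma**2 * K + sigma0**2 * np.eye(K.shape[0])

# Each diagonal entry is sigma^2 + sigma0^2 = 13.
print(np.diag(Sigma))
```

The off-diagonal entries are \(\sigma^2 K_{ij}\) only, since the noise term \(\varsigma^2 \mathbf{I}\) contributes nothing off the diagonal.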
Specify Hyperparameter at Instantiation:
The hyperparameters can also be defined at the time of instantiating the covariance object.
>>> # Create a covariance object with the hyperparameters set at instantiation
>>> cov = Covariance(x, sigma=2.0, sigma0=3.0, scale=[1.0, 2.0])
Specify Correlation Kernel:
The kernel function that creates the correlation matrix \(\mathbf{K}\) can be specified by one of the kernel objects derived from the glearn.kernels.Kernel class. For instance, the next example uses the square exponential kernel glearn.kernels.SquareExponential.
>>> # Create a kernel object
>>> from glearn import kernels
>>> kernel = kernels.SquareExponential()

>>> # Create covariance object with the above kernel
>>> cov = Covariance(x, kernel=kernel)
Sparse Covariance:
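The covariance object can be configured to generate a sparse covariance matrix by setting sparse to True, with the sparsity controlled by kernel_threshold or density.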
- Attributes:
- cor : glearn._correlation.Correlation
An object representing the correlation matrix \(\mathbf{K}\).
- cor : glearn._covariance.MixedCorrelation
An object representing the mixed correlation matrix \(\mathbf{K} + \eta \mathbf{I}\).
Methods
- get_size()
Returns the size of the covariance matrix.
Returns the dictionary of options that is passed to the imate package.
- set_imate_options(imate_options)
Updates the dictionary of options that is passed to the imate package.
- set_scale(scale)
Sets the array of scale hyperparameters of the correlation matrix.
Returns the array of scale hyperparameters of the correlation matrix.
- set_sigmas(sigma, sigma0)
Sets \(\sigma\) and \(\varsigma\) hyperparameters of the covariance model.
- get_sigmas([sigma, sigma0])
Returns \(\sigma\) and \(\varsigma\) hyperparameters of the covariance model.
- get_matrix([sigma, sigma0, scale, derivative])
Compute the covariance matrix or its derivatives for a given set of hyperparameters.
- trace([sigma, sigma0, scale, p, derivative, ...])
Compute the trace of the positive powers of the covariance matrix or its derivatives.
- traceinv([sigma, sigma0, B, C, scale, p, ...])
Compute the trace of the negative powers of the covariance matrix or its derivatives.
- logdet([sigma, sigma0, scale, p, ...])
Compute the log-determinant of the powers of the covariance matrix or its derivatives.
- solve(Y[, sigma, sigma0, scale, p, derivative])
Solve linear system involving the powers of covariance matrix or its derivatives.
- dot(x[, sigma, sigma0, scale, p, derivative])
Matrix-vector or matrix-matrix multiplication involving the powers of covariance matrix or its derivatives.
- auto_covariance(training_points)
Compute the auto-covariance between a set of training points.
- cross_covariance(test_points)
Compute the cross-covariance between training and test points.
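The tol parameter is described as the tolerance of the conjugate gradient method used by solve(). The following textbook conjugate gradient sketch for a symmetric positive-definite system is meant only to show where such a tolerance enters. The function name `cg_solve` and the relative-residual stopping rule are assumptions of this example and are not taken from glearn's code; the matrix values are made up.

```python
import numpy as np

def cg_solve(A, b, tol=1e-8, max_iter=None):
    """Solve A z = b for symmetric positive-definite A by conjugate gradient."""
    n = b.shape[0]
    max_iter = 10 * n if max_iter is None else max_iter
    z = np.zeros(n, dtype=float)
    r = b - A @ z            # residual
    p = r.copy()             # search direction
    rs_old = r @ r
    b_norm = np.linalg.norm(b)
    if b_norm == 0.0:
        return z
    for _ in range(max_iter):
        # Stop once the relative residual drops below the tolerance.
        if np.sqrt(rs_old) <= tol * b_norm:
            break
        Ap = A @ p
        step = rs_old / (p @ Ap)
        z += step * p
        r -= step * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return z

# Mixed covariance with made-up values: Sigma = sigma^2 K + sigma0^2 I.
K = np.array([[1.0, 0.5, 0.1],
              [0.5, 1.0, 0.5],
              [0.1, 0.5, 1.0]])
Sigma = 2.0**2 * K + 3.0**2 * np.eye(3)
y = np.array([1.0, 2.0, 3.0])

z = cg_solve(Sigma, y, tol=1e-8)
```

A looser tolerance stops the iteration earlier at the cost of a larger residual, which is the usual trade-off when the solve is repeated many times during training.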