glearn.Covariance

class glearn.Covariance(x, sigma=None, sigma0=None, scale=None, kernel=None, kernel_threshold=None, sparse=False, density=0.001, imate_options={'method': 'cholesky'}, interpolate=False, tol=1e-08, verbose=False)

Create a mixed covariance model.

This class creates a mixed-covariance model, which is defined by a set of known or unknown hyperparameters and a kernel function. The covariance object can compute:

  • The auto-covariance or cross-covariance between a set of training and test points, or the derivative of the covariance with respect to the hyperparameters.

  • Basic matrix functions of the covariance matrix, such as the log-determinant or the trace of functions of the matrix.

  • The solution of a linear system, or a matrix-matrix or matrix-vector product, involving the covariance matrix or its derivatives with respect to the hyperparameters.

Parameters:
x : numpy.ndarray

A 2D array of data points where each row of the array is the coordinate of a point \(\boldsymbol{x} = (x_1, \dots, x_d)\). The array size is \(n \times d\) where \(n\) is the number of the points and \(d\) is the dimension of the space of points.

sigma : float, default=None

The hyperparameter \(\sigma\) of the covariance model where \(\sigma^2\) represents the variance of the correlated errors of the model. \(\sigma\) should be positive. If None is given, an optimal value for \(\sigma\) is found during the training process.

sigma0 : float, default=None

The hyperparameter \(\varsigma\) of the covariance model where \(\varsigma^2\) represents the variance of the input noise to the model. \(\varsigma\) should be positive. If None is given, an optimal value for \(\varsigma\) is found during the training process.

scale : float or array_like[float], default=None

The scale hyperparameters \(\boldsymbol{\alpha} = (\alpha_1, \dots, \alpha_d)\) scale the distance between data points in \(\mathbb{R}^d\). If an array of size \(d\) is given, each \(\alpha_i\) scales the distance in the \(i\)-th dimension. If a scalar value \(\alpha\) is given, all dimensions are scaled isotropically. If set to None, optimal values of the scale hyperparameters are found during the training process by automatic relevance determination (ARD).
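The effect of a scalar versus a per-dimension scale can be sketched with plain NumPy. This is an illustration only: the helper name `scaled_distance` is hypothetical, and the convention of dividing each coordinate by its \(\alpha_i\) is an assumption, since the exact form of the scaling is not stated here.

```python
import numpy as np

def scaled_distance(x, scale):
    """Pairwise Euclidean distances after per-dimension scaling.

    Assumption (not confirmed by the text above): each coordinate is
    divided by its scale alpha_i before the distance is computed.
    """
    x = np.asarray(x, dtype=float)
    # A scalar scale is broadcast to all d dimensions (isotropic case)
    scale = np.broadcast_to(np.asarray(scale, dtype=float), (x.shape[1],))
    z = x / scale
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

points = np.array([[0.0, 0.0], [3.0, 4.0]])
print(scaled_distance(points, 1.0))          # unscaled distances
print(scaled_distance(points, [3.0, 4.0]))   # one scale per dimension
```

With a scalar scale every direction is treated equally, while an array lets each dimension contribute differently to the distance.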

kernel : glearn.kernels.Kernel, default=glearn.kernels.Matern

The correlation kernel \(k\) that generates the correlation matrix \(\mathbf{K}\). This argument should be an instance of one of the derived classes of glearn.kernels.Kernel. If None, the Matern kernel glearn.kernels.Matern is used.

kernel_threshold : float, default=None

The threshold \(\tau\) to taper the kernel function. Namely, the kernel values \(k < \tau\) are set to zero. This is used to decorrelate data points that are far apart from each other, yielding a sparse correlation matrix of the data points. This option is relevant only if sparse is set to True.

sparse : bool, default=False

If True, it sparsifies the correlation matrix \(\mathbf{K}\), and hence the covariance matrix \(\boldsymbol{\Sigma}\), using kernel tapering (see kernel_threshold and density).

density : float, default=1e-3

Sets an approximate density of the sparse matrices. This argument is another way (along with kernel_threshold) to specify the sparsity of the covariance matrix. The matrix density is the ratio of the number of nonzero elements to the total number of elements of the matrix. This option is relevant only if sparse is set to True.

Note

This option only sets an approximate density of the covariance matrix. The actual matrix density may be slightly different than the specified value.
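The relation between a tapering threshold and the resulting matrix density can be sketched with NumPy alone. The exponential kernel below is a stand-in chosen for brevity (not the Matern kernel), and the helper names `tapered_correlation` and `density` are hypothetical; the sketch does not reproduce how glearn chooses a threshold for a requested density.

```python
import numpy as np

def tapered_correlation(x, tau):
    """Exponential correlation matrix with entries below tau set to zero."""
    # Pairwise Euclidean distances between the rows of x
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    K = np.exp(-dist)      # stand-in kernel (assumption), k = exp(-distance)
    K[K < tau] = 0.0       # kernel tapering: drop weak correlations
    return K

def density(K):
    """Fraction of nonzero entries of K."""
    return np.count_nonzero(K) / K.size

rng = np.random.default_rng(0)
x = rng.random((200, 2)) * 10.0   # 200 random points in a 10 x 10 square

for tau in (0.0, 0.1, 0.5):
    print(tau, density(tapered_correlation(x, tau)))
```

A larger threshold removes more entries, so the density decreases as \(\tau\) grows.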

imate_options : dict, default={'method': 'cholesky'}

The internal computations of the functions glearn.Covariance.logdet(), glearn.Covariance.trace(), and glearn.Covariance.traceinv() are performed by the imate package. This argument is a dictionary of options passed to the corresponding functions of the imate package. See the API reference of the imate package for details.
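For intuition about what a Cholesky-based method computes, the sketch below evaluates the log-determinant and the trace of the inverse of a symmetric positive-definite matrix with NumPy. It is a textbook illustration, not the imate implementation; the helper names are hypothetical, and the matrix is the one printed in the Examples section of this page.

```python
import numpy as np

def logdet_cholesky(A):
    """log det(A) for symmetric positive-definite A, using A = L L^T."""
    L = np.linalg.cholesky(A)
    # det(A) = prod(diag(L))**2, hence log det(A) = 2 * sum(log(diag(L)))
    return 2.0 * np.sum(np.log(np.diag(L)))

def traceinv_cholesky(A):
    """trace(A^{-1}) for symmetric positive-definite A, using A = L L^T."""
    L = np.linalg.cholesky(A)
    L_inv = np.linalg.solve(L, np.eye(A.shape[0]))
    # trace(A^{-1}) = trace(L^{-T} L^{-1}) = squared Frobenius norm of L^{-1}
    return np.sum(L_inv ** 2)

# Covariance matrix shown in the Examples section of this page
Sigma = np.array([
    [13.0, 3.61643745, 3.51285267, 3.47045163],
    [3.61643745, 13.0, 3.32078482, 3.14804532],
    [3.51285267, 3.32078482, 13.0, 3.53448631],
    [3.47045163, 3.14804532, 3.53448631, 13.0],
])

print(logdet_cholesky(Sigma))
print(traceinv_cholesky(Sigma))
```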

interpolate : bool, default=False

If True, the matrix functions glearn.Covariance.logdet(), glearn.Covariance.trace(), and glearn.Covariance.traceinv() for the mixed covariance function are interpolated with respect to the hyperparameters \(\sigma\) and \(\varsigma\). See [1] for details. This approach can significantly speed up the training process, at the cost of some accuracy.

tol : float, default=1e-8

The error tolerance for solving linear systems with the conjugate gradient method in the glearn.Covariance.solve() function.
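To show how such a tolerance typically enters a conjugate gradient solver, here is a minimal NumPy sketch. The stopping rule used (relative residual norm below `tol`) is an assumption for illustration; the exact criterion used by glearn is not stated here, and `conjugate_gradient` is a hypothetical helper.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-8, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A.

    Stops when ||b - A x|| <= tol * ||b|| (assumed criterion).
    """
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    r = b - A @ x            # initial residual
    p = r.copy()             # initial search direction
    rs_old = r @ r
    b_norm = np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs_old) <= tol * b_norm:
            break
        Ap = A @ p
        alpha = rs_old / (p @ Ap)    # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p   # next conjugate direction
        rs_old = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_sol = conjugate_gradient(A, b, tol=1e-10)
print(np.linalg.norm(A @ x_sol - b))   # residual norm of the solution
```

A smaller tolerance generally requires more iterations but yields a smaller residual.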

verbose : bool, default=False

If True, verbose output is printed during the computation.

Notes

Regression Model:

A regression model to fit the data \(y = f(\boldsymbol{x})\) for the points \(\boldsymbol{x} \in \mathcal{D} \subset \mathbb{R}^d\) and data \(y \in \mathbb{R}\) is

\[f(\boldsymbol{x}) = \mu(\boldsymbol{x}) + \delta(\boldsymbol{x}) + \epsilon,\]

where

  • \(\mu\) is a deterministic mean function.

  • \(\delta\) is a zero-mean stochastic function representing the misfit of the regression model.

  • \(\epsilon\) is a zero-mean stochastic function representing the input noise.

Covariance of Regression:

The covariance of the stochastic function \(\delta\) on discrete data points \(\{ \boldsymbol{x}_i \}_{i=1}^n\) is the \(n \times n\) covariance matrix

\[\sigma^2 \mathbf{K} = \mathbb{E}[\delta(\boldsymbol{x}_i), \delta(\boldsymbol{x}_j)],\]

where \(\sigma^2\) is the variance and \(\mathbf{K}\) is considered as the correlation matrix.

Similarly, the covariance of the stochastic function \(\epsilon\) is the \(n \times n\) covariance matrix

\[\varsigma^2 \mathbf{I} = \mathbb{E}[\epsilon, \epsilon],\]

where \(\varsigma^2\) is the variance of noise and \(\mathbf{I}\) is the identity matrix.

The overall mixed-covariance model for the linear model \(f\) is

\[\boldsymbol{\Sigma}(\sigma^2, \varsigma^2, \boldsymbol{\alpha}) = \sigma^2 \mathbf{K} + \varsigma^2 \mathbf{I}.\]
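The formula can be checked numerically with a small NumPy sketch. The correlation matrix `K` below is a hypothetical 2 × 2 example rather than the output of a glearn kernel, and `mixed_covariance` is an illustrative helper name.

```python
import numpy as np

def mixed_covariance(K, sigma, sigma0):
    """Return sigma^2 * K + sigma0^2 * I for a correlation matrix K."""
    n = K.shape[0]
    return sigma ** 2 * K + sigma0 ** 2 * np.eye(n)

# Hypothetical correlation matrix: unit diagonal, correlation 0.5
K = np.array([[1.0, 0.5],
              [0.5, 1.0]])

Sigma = mixed_covariance(K, sigma=2.0, sigma0=3.0)
print(Sigma)
# [[13.  2.]
#  [ 2. 13.]]
```

Because a correlation matrix has a unit diagonal, every diagonal entry of \(\boldsymbol{\Sigma}\) equals \(\sigma^2 + \varsigma^2\), which is \(4 + 9 = 13\) for \(\sigma = 2\) and \(\varsigma = 3\), consistent with the diagonal of the matrix printed in the Examples section.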

References

[1]

Ameli, S., and Shadden, S. C. (2022). Interpolating Log-Determinant and Trace of the Powers of Matrix \(\mathbf{A} + t\mathbf{B}\). arXiv: 2009.07385 [math.NA].

Examples

Create Covariance Object:

Create a covariance matrix based on a set of sample data with four points in \(d=2\) dimensional space.

>>> # Generate a set of points
>>> from glearn.sample_data import generate_points
>>> x = generate_points(num_points=4, dimension=2)

>>> # Create a covariance object
>>> from glearn import Covariance
>>> cov = Covariance(x)

By providing a set of hyperparameters, the covariance matrix can be fully defined. Here we set \(\sigma=2\), \(\varsigma=3\), and \(\boldsymbol{\alpha}= (1, 2)\).

>>> # Get the covariance matrix for given hyperparameters
>>> cov.set_sigmas(2.0, 3.0)
>>> cov.set_scale([1.0, 2.0])
>>> cov.get_matrix()
array([[13.        ,  3.61643745,  3.51285267,  3.47045163],
       [ 3.61643745, 13.        ,  3.32078482,  3.14804532],
       [ 3.51285267,  3.32078482, 13.        ,  3.53448631],
       [ 3.47045163,  3.14804532,  3.53448631, 13.        ]])

Specify Hyperparameters at Instantiation:

The hyperparameters can also be defined at the time of instantiating the covariance object.

>>> # Create a covariance object with given hyperparameters
>>> cov = Covariance(x, sigma=2.0, sigma0=3.0, scale=[1.0, 2.0])

Specify Correlation Kernel:

The kernel function that creates the correlation matrix \(\mathbf{K}\) can be specified by one of the kernel objects derived from the glearn.kernels.Kernel class. For instance, in the next example, we set a square exponential kernel, glearn.kernels.SquareExponential.

>>> # Create a kernel object
>>> from glearn import kernels
>>> kernel = kernels.SquareExponential()

>>> # Create a covariance object with the above kernel
>>> cov = Covariance(x, kernel=kernel)

Sparse Covariance:

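The covariance object can be configured to generate a sparse covariance matrix by setting sparse to True, together with kernel_threshold or density (see Parameters).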

Attributes:
cor : glearn._correlation.Correlation

An object representing the correlation matrix \(\mathbf{K}\).

cor : glearn._covariance.MixedCorrelation

An object representing the mixed correlation matrix \(\mathbf{K} + \eta \mathbf{I}\).
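Under the assumption that the mixed correlation matrix is the covariance matrix \(\boldsymbol{\Sigma}\) of the Notes section normalized by \(\sigma^2\), the parameter \(\eta\) would be the ratio of the noise variance to the variance of the correlated errors. This relation is an inference from the definition of \(\boldsymbol{\Sigma}\), not a statement taken from the library:

\[\boldsymbol{\Sigma} = \sigma^2 \mathbf{K} + \varsigma^2 \mathbf{I} = \sigma^2 \left( \mathbf{K} + \eta \mathbf{I} \right), \qquad \eta = \frac{\varsigma^2}{\sigma^2}.\]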

Methods

get_size()

Returns the size of the covariance matrix.

get_imate_options()

Returns the dictionary of options that is passed to the imate package.

set_imate_options(imate_options)

Updates the dictionary of options that is passed to the imate package.

set_scale(scale)

Sets the array of scale hyperparameters of the correlation matrix.

get_scale()

Returns the array of scale hyperparameters of the correlation matrix.

set_sigmas(sigma, sigma0)

Sets \(\sigma\) and \(\varsigma\) hyperparameters of the covariance model.

get_sigmas([sigma, sigma0])

Returns \(\sigma\) and \(\varsigma\) hyperparameters of the covariance model.

get_matrix([sigma, sigma0, scale, derivative])

Compute the covariance matrix or its derivatives for a given set of hyperparameters.

trace([sigma, sigma0, scale, p, derivative, ...])

Compute the trace of the positive powers of the covariance matrix or its derivatives.

traceinv([sigma, sigma0, B, C, scale, p, ...])

Compute the trace of the negative powers of the covariance matrix or its derivatives.

logdet([sigma, sigma0, scale, p, ...])

Compute the log-determinant of the powers of the covariance matrix or its derivatives.

solve(Y[, sigma, sigma0, scale, p, derivative])

Solve a linear system involving the powers of the covariance matrix or its derivatives.

dot(x[, sigma, sigma0, scale, p, derivative])

Matrix-vector or matrix-matrix multiplication involving the powers of the covariance matrix or its derivatives.

auto_covariance(training_points)

Compute the auto-covariance between a set of test points.

cross_covariance(test_points)

Compute the cross-covariance between training and test points.