imate Documentation#
imate, short for Implicit Matrix Trace Estimator, is a modular, high-performance C++/CUDA library, distributed as a Python package, that provides scalable randomized algorithms for the computationally expensive matrix functions that arise in machine learning.
Overview#
To learn more about imate functionality, see:
Supported Platforms#
Installation and tests have been performed successfully on the following operating systems, architectures, and Python and PyPy versions:
| Platform | Arch | Device | Python 3.9 | Python 3.10 | Python 3.11 | Python 3.12 | PyPy 3.8 ¹ | PyPy 3.9 ¹ | PyPy 3.10 ¹ |
|---|---|---|---|---|---|---|---|---|---|
| Linux | X86-64 | CPU | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Linux | X86-64 | GPU | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Linux | AARCH-64 | CPU | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| Linux | AARCH-64 | GPU | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| macOS | X86-64 | CPU | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| macOS | X86-64 | GPU ² | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| macOS | ARM-64 | CPU | ✔ | ✔ | ✔ | ✔ | ✖ | ✔ | ✔ |
| macOS | ARM-64 | GPU ² | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| Windows | X86-64 | CPU | ✔ | ✔ | ✔ | ✔ | ✖ | ✖ | ✖ |
| Windows | X86-64 | GPU | ✔ | ✔ | ✔ | ✔ | ✖ | ✖ | ✖ |
Python wheels for imate, for all supported platforms and versions listed above, are available through PyPI and Anaconda Cloud. If you need imate on other platforms, architectures, or Python or PyPy versions, raise an issue on GitHub and we will build a Python wheel for you.
Install#
Install with pip from PyPI:
pip install imate
Install with conda from Anaconda Cloud:
conda install -c s-ameli imate
For a complete installation guide, see:
Docker#
The docker image comes with a pre-installed imate, an NVIDIA graphics driver, and a compatible version of the CUDA Toolkit libraries.
Pull docker image from Docker Hub:
docker pull sameli/imate
For a complete guide, see:
GPU#
imate can run on CUDA-capable multi-GPU devices, which can be set up in several ways. Using the docker container is the easiest way to run imate on GPU devices. For a comprehensive guide, see:
The supported GPU micro-architectures and CUDA versions are as follows:
| Version \ Arch | Fermi | Kepler | Maxwell | Pascal | Volta | Turing | Ampere | Hopper |
|---|---|---|---|---|---|---|---|---|
| CUDA 9 | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ | ✖ |
| CUDA 10 | ✖ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| CUDA 11 | ✖ | ✖ | ✖ | ✔ | ✔ | ✔ | ✔ | ✔ |
| CUDA 12 | ✖ | ✖ | ✖ | ✔ | ✔ | ✔ | ✔ | ✔ |
Tutorials#
Launch an online interactive notebook with Binder.
API Reference#
Check the list of functions, classes, and modules of imate with their usage, options, and examples.
Performance#
imate is scalable to very large matrices. Its core library for basic linear algebraic operations is faster than OpenBLAS, and its pseudo-random generator is a hundred-fold faster than the implementation in the standard C++ library.
Read about the performance of imate in practical applications:
Features#
- Matrices can be dense or sparse (CSR or CSC format), with 32-bit, 64-bit, or 128-bit data types, stored in either row-major (C style) or column-major (Fortran style) ordering.
- Matrices can be linear operators with parameters (see the imate.Matrix and imate.AffineMatrixFunction classes).
- Randomized algorithms based on the Hutchinson and stochastic Lanczos quadrature methods (see Overview).
- A novel method to interpolate matrix functions (see Interpolation of Affine Matrix Functions).
- Parallel processing, both on shared memory and on CUDA-capable multi-GPU devices.
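To give a flavor of the randomized algorithms listed above, the sketch below illustrates the Hutchinson trace estimator in plain Python. It is a didactic sketch, not imate's actual implementation: the function name, the sample count, and the diagonal test operator are all illustrative assumptions. The key idea is that the trace of a linear operator is estimated purely from matrix-vector products with random Rademacher vectors, so the matrix never needs to be formed explicitly.

```python
import random

def hutchinson_trace(matvec, n, num_samples=500, seed=0):
    """Estimate trace(A) as the sample mean of v^T (A v) over random
    Rademacher vectors v (entries +/-1), using only matvec products."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        v = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        av = matvec(v)  # one matrix-vector product per sample
        total += sum(vi * avi for vi, avi in zip(v, av))
    return total / num_samples

# Illustrative operator: a diagonal matrix with diagonal 1..5 (trace = 15).
diag = [1.0, 2.0, 3.0, 4.0, 5.0]
matvec = lambda v: [d * vi for d, vi in zip(diag, v)]
estimate = hutchinson_trace(matvec, n=5)
```

For a diagonal operator each Rademacher sample gives the trace exactly (since each entry of v squares to 1), so this toy case converges immediately; for general matrices the estimator's variance shrinks as the number of samples grows.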
Technical Notes#
The core of imate, implemented in C++ and the NVIDIA CUDA framework, is a standalone, modular library for high-performance, low-level algebraic operations on linear operators (including matrices and affine matrix functions). This library provides a unified interface for computations on both CPU and GPU, a unified interface for dense and sparse matrices, a unified container for various data types, and fully automatic memory management and on-demand data transfer between CPU and GPU devices. It can be employed independently in projects other than imate. The Doxygen-generated reference of imate's C++/CUDA Classes and Namespaces is available for developers.
The front-end interface of imate is implemented in Cython and Python (see Python API Reference for end-users).
Some notable implementation techniques used to develop imate are:
- Polymorphism and the curiously recurring template pattern (CRTP).
- OS-independent, customized dynamic loading of CUDA libraries (as opposed to dynamic linking).
- Static dispatching, which enables running imate with or without CUDA on the user's machine from the same pre-compiled imate installation.
- Completely GIL-free Cython implementation.
- manylinux wheels built upon customized docker images with CUDA support, available on DockerHub.
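The second point above, dynamic loading of CUDA libraries at runtime rather than linking against them at build time, can be sketched in Python with the standard ctypes module. This is an illustrative sketch under stated assumptions, not imate's actual loader (imate's loader lives in its C++/Cython core, and the candidate library names below are hypothetical): the program probes for a shared library by name and degrades gracefully to a CPU-only path when none is found, so one installed package works on machines with and without CUDA.

```python
import ctypes
import ctypes.util

def try_load(names):
    """Return a handle to the first loadable shared library among the
    candidate names, or None if none can be found and loaded."""
    for name in names:
        path = ctypes.util.find_library(name)
        if path is None:
            continue
        try:
            return ctypes.CDLL(path)
        except OSError:
            continue
    return None

# Hypothetical candidate names for the CUDA runtime across platforms.
cudart = try_load(["cudart", "cudart64_12", "cudart64_110"])
# If CUDA is absent at runtime, fall back to CPU-only code paths.
cpu_only = cudart is None
```

Because the lookup happens at import or call time rather than at link time, a missing CUDA installation produces a clean fallback instead of a loader error, which is the behavior the static-dispatching point above relies on.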
How to Contribute#
We welcome contributions via GitHub pull requests. If you do not feel comfortable modifying the code, we also welcome feature requests and bug reports as GitHub issues.
Publications#
For information on how to cite imate, publications, and software packages that used imate, see:
License#
This project uses a BSD 3-clause license, in hopes that it will be accessible to most projects. If you require a different license, please raise an issue and we will consider a dual license.