matmul#
- detkit.matmul(A, B, C, shape=None, trans_a=False, trans_b=False, alpha=1, beta=0, overwrite=False)#
- Matrix-matrix multiplication for sub-matrices. - Parameters:
- Anumpy.ndarray
- A rectangular matrix with either - float64or- float32data type stored with column-major ordering.
- Anumpy.ndarray
- A rectangular matrix with either - float64or- float32data type stored with column-major ordering.
- Bnumpy.ndarray
- A rectangular matrix with either - float64or- float32data type stored with row-major or column-major ordering.
- shapetuple, default=None
- A tuple of size three, determining the shape of an upper-left sub-matrices of A, B, and C to be referenced. Namely, if - shapeis given as the tuple- (m, n, k), then- mis the number of rows of \(\mathrm{op}(\mathbf{A})\).
- nis the number of columns of \(\mathrm{op}(\mathbf{A})\), which is also the number of rows of \(\mathrm{op}(\mathbf{B})\).
- kis the number of columns of \(\mathrm{op}(\mathbf{C})\).
 - As such the following sub-matrices are used: - A[:m, :n](or- A[:n, :m]when- trans_a=True)
- B[:n, :k](or- B[:k, :n]when- trans_b=True)
- C[:m, :k]
 - If - shapeis None, the full shape of- A[:, :],- B[:, :], and- C[:, :]are considered provided that their shapes are compatible for the matrix product.
- trans_abool, default=False
- If False, the operator \(\operatorname{op}(\mathbf{A}) = \mathbf{A}\) is used. If True, the operator \(\operatorname{op}(\mathbf{A}) = \mathbf{A}^{\intercal}\) is used. 
- trans_abbool, default=False
- If False, the operator \(\operatorname{op}(\mathbf{B}) = \mathbf{B}\) is used. If True, the operator \(\operatorname{op}(\mathbf{B}) = \mathbf{B}^{\intercal}\) is used. 
- alphafloat, default=1
- The parameter \(\alpha\). 
- betafloat, default=0
- The parameter \(\beta\). 
- overwritebool, default=False
- If True, the output C is overwritten to the input C. - Note - When - overwriteis set to True, the input matrix C should have column-major (Fortran) ordering.
 
- Returns:
- Cnumpy.ndarray
- A 2D matrix of the same shape as the input matrix C (and not the shape of the sub-matrix). The upper-left sub-matrix of C contains the matrix-matrix multiplication corresponding to the sub-matrices determined by the - shapeargument. If- overwriteis set to True, the output matrix C is becomes a view for the matrix C.
 
 - See also - Notes - Matrix-matrix multiplication of sub-matrices: - Let \(\mathbf{A}_{[:m,:n]}\) denote the sub-matrix of the size \(m \times n\) to be the upper-left corner of matrix \(\mathbf{A}\). Given matrices \(\mathbf{A}\), \(\mathbf{B}\), and \(\mathbf{C}\) and parameters \(\alpha\) and \(\beta\), this function computes \[\mathbf{X}_{[:m, :k]} = \alpha \operatorname{op}(\mathbf{A})_{[:m, :n]} \operatorname{op}(\mathbf{B})_{[:n, :k]} + \beta \mathbf{C}_{[:m, :k]},\]- if - overwriteis False, or\[\mathbf{C}_{[:m, :k]} \leftarrow \alpha \operatorname{op}(\mathbf{A})_{[:m, :n]} \operatorname{op}(\mathbf{B})_{[:n, :k]} + \beta \mathbf{C}_{[:m, :k]},\]- if - overwriteis True.- Interpreting the output matrix: - The shape of the output variable X is the same as the shape of C, even if a smaller sub-matrix is considered. Regardless, only the corresponding upper-left sub-matrix of X has meaningful data. Namely, if - shape=(m, n, k), the slice- X[:m, :k]should be considered as the result. As such, the relation- X[:m, :k] = alpha * A[:m, :n] @ [:n, :k] + beta * C[:m, :k]- or, for instance, - X[:m, :k] = alpha * A[:n, :m].T @ [:k, :n].T + beta * C[:m, :k]- (when - trans_aand- trans_bare True) should hold.- Comparison with numpy.matmul: - To perfron matrix-matrix multiplication for a sub-matrix of the input matrices using - numpy.matmulfunction (or any other similar matrix multiplication functions), you should pass a slice of the matrix to the function. This approach is not memory-efficient since the sliced array allocates new memory.- In contrast, using - detkit.matmultogether with the- shapeargument, no memory slice is created during the inner computation, rather, the data from the original input matrix is accessed efficiently.- Implementation: - This function is a wrapper around LAPACK’s - sgemm(for 32-bit precision) and- dgemm(for 64-bit precision).- This function is internally used for - detkit.memdet()for efficient computation of matrix determinant under memory constraint.- References - Examples - >>> from detkit import matmul, Memory >>> import numpy >>> # Create random matrices >>> p = 1000 >>> A = numpy.random.randn(p, p) >>> B = numpy.random.randn(p, p) >>> C = numpy.random.randn(p, p) >>> # Make sure arrays have column-ordering >>> A = numpy.asfortranarray(A) >>> B = numpy.asfortranarray(B) >>> C = numpy.asfortranarray(C) >>> # 32-bit precision data >>> A = A.astype(numpy.float32) >>> B = B.astype(numpy.float32) >>> C = C.astype(numpy.float32) >>> # Get a copy of C (for the purpose of comparison) since we will >>> # overwrite C >>> C_copy = numpy.copy(C) >>> # Choose size of sub-matrices that are smaller or equal to p >>> m, n, k = 900, 800, 700 >>> # Track memory allocation to check if matmul operation is not >>> # creating any new memory. >>> mem = Memory() >>> mem.set() >>> # Perform matrix-matrix multiplication for sub-matrices >>> alpha, beta = numpy.float32(1.0), numpy.float32(2.0) >>> X = matmul(A, B, C, shape=(m, n, k), alpha=alpha, beta=beta, ... overwrite=True) >>> # Check peak memory allocation (compared to memory of a sum-matrix) >>> slice_nbytes = m * n * A.dtype.itemsize >>> print(mem.peak() / slice_nbytes) 0.001 >>> # Check if alpha * A @ B + beta * C_copy = X holds. >>> atol = numpy.finfo(A.dtype).resolution >>> print(numpy.allclose(alpha * A[:m, :n] @ B[:n, :k] + ... beta * C_copy[:m, :k], X[:m, :k], atol=10*tol)) True >>> # When overwrite is set to True, check if X is indeed a view of C >>> numpy.may_share_memory(X, C) True - In the above example, the object - memof class- detkit.Memorytracks memory allocation. The peak of allocated memory during the matrix multiplication is three orders of magnitude smaller than the size of one of the matrices slices, confirming that no new array slice was created during the operation.