Solving Linear Equations and Beyond: Exploring Pseudoinverses with torch.Tensor.pinverse()
What it does
- Computes the pseudoinverse (Moore-Penrose inverse) of a matrix represented by a torch.Tensor.
- The pseudoinverse is a generalization of the inverse for non-square or singular matrices.
- It allows you to solve systems of linear equations even when a true inverse doesn't exist.
Usage
import torch
# Example matrix (can be any shape)
A = torch.randn(3, 5) # Or any other dimensions
# Compute the pseudoinverse
A_pinv = torch.linalg.pinv(A) # Preferred modern API
# OR (legacy alias, kept for backward compatibility)
# A_pinv = torch.pinverse(A)
Key points
- Dtypes supported: float, double, complex float (cfloat), and complex double (cdouble).
- Input: a torch.Tensor of shape (*, m, n), where * represents zero or more batch dimensions (so batches of matrices are handled).
- Output: a torch.Tensor of shape (*, n, m) containing the pseudoinverse, with the same batch dimensions as the input.
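The batch behavior is easy to see with a quick shape check (the dimensions here are arbitrary):
import torch
# A batch of 4 matrices, each of shape 3x5
A = torch.randn(4, 3, 5)
# pinv is applied to each matrix in the batch independently
A_pinv = torch.linalg.pinv(A)
print(A_pinv.shape) # torch.Size([4, 5, 3]) -- the matrix dimensions are transposed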
Behind the scenes
torch.linalg.pinv uses Singular Value Decomposition (SVD) to calculate the pseudoinverse.
- SVD decomposes a matrix into three components, A = U Σ V^T, where:
- U: Left singular vectors (orthonormal columns)
- Σ: Diagonal matrix of singular values
- V: Right singular vectors (orthonormal columns); V^T is its transpose
- The pseudoinverse is then constructed from the singular values as A^+ = V Σ^+ U^T, where Σ^+ inverts each singular value above the tolerance and zeroes the rest.
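A minimal sketch of this construction, using torch.linalg.svd directly (the tolerance value is an arbitrary choice for illustration):
import torch

A = torch.randn(4, 3, dtype=torch.double)

# Reduced SVD: A = U diag(S) Vh
U, S, Vh = torch.linalg.svd(A, full_matrices=False)

# Invert only the singular values above a small tolerance
tol = 1e-12
S_inv = torch.where(S > tol, 1.0 / S, torch.zeros_like(S))

# A^+ = V diag(S^+) U^T
A_pinv_manual = Vh.T @ torch.diag(S_inv) @ U.T

# Matches the built-in result
print(torch.allclose(A_pinv_manual, torch.linalg.pinv(A)))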
Important considerations
- Non-differentiability
The pseudoinverse might not be differentiable for all input matrices, especially those whose rank changes with the input. This can affect backpropagation in neural networks.
- Tolerance (optional)
A parameter that controls how small a singular value must be before it is treated as zero. In older APIs this is rcond (default: 1e-15); newer versions of torch.linalg.pinv expose it as rtol/atol. Smaller tolerances keep more tiny singular values, which can amplify numerical noise for ill-conditioned matrices; larger tolerances discard them, giving a more stable but lower-rank result.
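A small illustration of the tolerance's effect (rtol is the modern name for the parameter; the values here are chosen purely for demonstration):
import torch

# Nearly singular matrix: the second singular value is ~1e-10
A = torch.diag(torch.tensor([1.0, 1e-10], dtype=torch.double))

# The default tolerance keeps the tiny singular value, producing huge entries
print(torch.linalg.pinv(A))
# A larger rtol treats it as zero, giving a well-behaved low-rank pseudoinverse
print(torch.linalg.pinv(A, rtol=1e-6))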
Solving a system of linear equations with a non-square matrix
import torch
# Non-square matrix
A = torch.tensor([[1., 2., 3.], [4., 5., 6.]]) # Floating-point dtype is required by pinv
b = torch.tensor([7., 8.]) # Right-hand side vector
# Pseudoinverse
A_pinv = torch.linalg.pinv(A)
# Solve Ax = b using the pseudoinverse (torch.mm needs 2-D inputs, so use @ for a vector)
x = A_pinv @ b # Matrix-vector multiplication
print("Original matrix A:\n", A)
print("Right-hand side vector b:\n", b)
print("Solution x using pseudoinverse:\n", x)
This code defines a non-square matrix A and a right-hand side vector b. It then calculates the pseudoinverse of A and uses it to solve the system of linear equations Ax = b via a matrix-vector product. Because A has more columns than rows, the system is underdetermined, and the pseudoinverse returns the solution with the smallest norm.
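A quick sanity check, continuing the snippet above: since this underdetermined system is consistent, the recovered x reproduces b up to floating-point error:
print(torch.allclose(A @ x, b)) # True: x satisfies Ax = b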
Least squares solution with a singular matrix
import torch
# Singular matrix (has a zero singular value)
A = torch.tensor([[1., 2.], [2., 4.]])
b = torch.tensor([5., 10.])
# Pseudoinverse with an explicit tolerance so tiny singular values are zeroed
A_pinv = torch.linalg.pinv(A, rtol=1e-8)
# Least squares solution
x = A_pinv @ b
print("Original (singular) matrix A:\n", A)
print("Right-hand side vector b:\n", b)
print("Least squares solution x:\n", x)
Here, we define a singular matrix A and a right-hand side vector b. Passing an explicit tolerance (rtol, formerly rcond) ensures that numerically tiny singular values are treated as exactly zero, which accounts for the singular nature of A and keeps the least squares solution stable.
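To see that the pseudoinverse picks one particular solution out of infinitely many, we can shift x by a null-space vector of A (here [2, -1], since the rows of A are multiples of [1, 2]):
x_alt = x + torch.tensor([2., -1.]) # Also solves the system
print(torch.allclose(A @ x_alt, A @ x)) # True: same residual
print(torch.linalg.norm(x) < torch.linalg.norm(x_alt)) # True: pinv gives the minimum-norm solution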
torch.linalg.lstsq
- This function is often a better choice for solving least squares problems: it provides a more efficient and numerically stable solution than explicitly computing the pseudoinverse.
- It returns both the solution x and a residuals tensor that can be useful for analyzing the quality of the fit.
import torch
# Overdetermined system: more equations than unknowns
A = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])
b = torch.tensor([[5.], [7.], [9.]]) # Column vector of shape (m, 1)
# lstsq returns a named tuple (solution, residuals, rank, singular_values)
result = torch.linalg.lstsq(A, b)
print("Solution x:\n", result.solution)
# residuals is only populated for overdetermined, full-rank systems
print("Residuals:\n", result.residuals)
QR decomposition
- This approach involves decomposing the matrix A into a QR factorization (Q, R).
- If A has full column rank, the pseudoinverse can be computed as R.inverse() @ Q.transpose().
- This method can be slightly more efficient for certain matrix structures compared to SVD-based approaches.
import torch
A = torch.tensor([[1., 2.], [3., 4.]])
Q, R = torch.linalg.qr(A) # torch.qr is deprecated in favor of torch.linalg.qr
if torch.linalg.matrix_rank(A) == A.shape[1]: # Check for full column rank
    A_pinv = torch.linalg.inv(R) @ Q.T
    print("Pseudoinverse using QR decomposition:\n", A_pinv)
else:
    print("QR decomposition is not suitable here (A lacks full column rank)")
Custom SVD implementation (advanced)
- In some specific scenarios, implementing your own SVD-based routine tailored to your needs might be desirable.
- This allows for fine-grained control over the decomposition process and can potentially be more efficient for specific matrix structures (see the sketch below).
- However, this approach is more complex and requires a deeper understanding of linear algebra.
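As one example of such control, a truncated pseudoinverse can be built on torch.svd_lowrank, which can be cheaper than a full SVD when a large matrix is approximately low-rank. This is only a sketch; the rank estimate q is an assumption you must supply for your own data:
import torch

# A 500x300 matrix of (exact) rank 5, built from two thin factors
m, n, r = 500, 300, 5
A = torch.randn(m, r, dtype=torch.double) @ torch.randn(r, n, dtype=torch.double)

# Randomized truncated SVD: q is a (slightly padded) guess at the rank
U, S, V = torch.svd_lowrank(A, q=r + 2)

# Invert only the singular values above a tolerance
S_inv = torch.where(S > 1e-10, 1.0 / S, torch.zeros_like(S))
A_pinv_approx = V @ torch.diag(S_inv) @ U.T

# Close to the full-SVD pseudoinverse
print(torch.allclose(A_pinv_approx, torch.linalg.pinv(A), atol=1e-8))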
Choosing an approach
- torch.linalg.lstsq is generally recommended for least squares problems due to its efficiency and stability.
- QR decomposition might be considered for matrices with full column rank when computational efficiency is a concern.
- Custom SVD should be reserved for very specific use cases requiring advanced control.