Solving Linear Equations and Beyond: Exploring Pseudoinverses with torch.Tensor.pinverse()


What it does

  • Computes the pseudoinverse (Moore-Penrose inverse) of a matrix represented by a torch.Tensor.
  • The pseudoinverse is a generalization of the inverse for non-square or singular matrices.
  • It allows you to solve systems of linear equations even when a true inverse doesn't exist.

Usage

import torch

# Example matrix (can be any shape)
A = torch.randn(3, 5)  # Or any other dimensions

# Compute the pseudoinverse
A_pinv = torch.linalg.pinv(A)  # Preferred API (part of the torch.linalg module)
# OR (torch.pinverse is an older alias for torch.linalg.pinv)
# A_pinv = torch.pinverse(A)
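
As a quick sanity check (a minimal sketch reusing A and A_pinv from the snippet above), the result should satisfy the defining Moore-Penrose property A @ A_pinv @ A ≈ A:

# Verify the defining Moore-Penrose property
print(torch.allclose(A @ A_pinv @ A, A, atol=1e-5))  # Expected: True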

Key points

  • Input: A torch.Tensor of shape (*, m, n), where * represents zero or more batch dimensions (can handle batches of matrices).
  • Output: A torch.Tensor of shape (*, n, m), i.e. the same batch dimensions as the input with each matrix replaced by its pseudoinverse (see the example below).
  • Dtypes supported: float, double, complex float (cfloat), and complex double (cdouble).
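
For instance, a minimal sketch with a batch of random matrices shows how the batch dimensions are preserved while the trailing matrix dimensions are swapped:

import torch

# Batch of 4 matrices, each of shape (3, 5)
A = torch.randn(4, 3, 5)

A_pinv = torch.linalg.pinv(A)
print(A_pinv.shape)  # torch.Size([4, 5, 3]): batch dims kept, matrix dims swapped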

Behind the scenes

  • torch.linalg.pinv uses the Singular Value Decomposition (SVD) to calculate the pseudoinverse.
  • SVD decomposes a matrix into three components, A = U Σ V^T, where:
    • U: Left singular vectors (orthonormal columns)
    • Σ: Diagonal matrix of singular values
    • V^T: Right singular vectors (orthonormal rows)
  • The pseudoinverse is then constructed from the singular values as A⁺ = V Σ⁺ U^T, where Σ⁺ replaces each non-negligible singular value by its reciprocal and leaves the negligible ones at zero (see the sketch below).
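
The same construction can be reproduced by hand with torch.linalg.svd (a minimal sketch; the 1e-15 cutoff below is illustrative, not the exact threshold torch.linalg.pinv uses):

import torch

A = torch.randn(3, 5)

# Reduced SVD: A = U @ diag(S) @ Vh
U, S, Vh = torch.linalg.svd(A, full_matrices=False)

# Invert only the non-negligible singular values
S_inv = torch.where(S > 1e-15, 1.0 / S, torch.zeros_like(S))

# Pseudoinverse: V @ diag(S_inv) @ U^T (mH is the conjugate transpose)
A_pinv_manual = Vh.mH @ torch.diag(S_inv) @ U.mH

print(torch.allclose(A_pinv_manual, torch.linalg.pinv(A), atol=1e-5))  # Expected: True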

Important considerations

  • Non-differentiability
    The pseudoinverse is not differentiable at matrices where the rank changes, so gradients are only reliable for full-rank, reasonably well-conditioned inputs. This can affect backpropagation in neural networks.
  • rcond (optional)
    A relative tolerance that controls how small a singular value must be, relative to the largest one, before it is treated as zero for the calculation (torch.pinverse defaults to 1e-15; torch.linalg.pinv expresses the same idea through its rtol/atol arguments). A larger tolerance discards more singular values and acts as a form of regularization; a tolerance that is too small can let round-off-level singular values through and blow up the result for ill-conditioned matrices (see the example below).
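
To see the effect of the tolerance, here is a minimal sketch on a nearly singular matrix (the 1e-4 perturbation and the rcond value are illustrative choices):

import torch

# Nearly singular matrix: the second row is almost a multiple of the first
A = torch.tensor([[1.0, 2.0],
                  [2.0, 4.0001]])

print(torch.linalg.svdvals(A))  # One large and one tiny singular value

# The default tolerance keeps the tiny singular value, so entries blow up
print(torch.linalg.pinv(A))

# A larger tolerance treats it as zero, giving a well-behaved rank-1 pseudoinverse
print(torch.linalg.pinv(A, rcond=1e-3))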


Solving a system of linear equations with a non-square matrix

import torch

# Non-square matrix (pinv requires a floating-point dtype)
A = torch.tensor([[1., 2., 3.], [4., 5., 6.]])
b = torch.tensor([7., 8.])  # Right-hand side vector

# Pseudoinverse
A_pinv = torch.linalg.pinv(A)

# Solve Ax = b using the pseudoinverse (matrix-vector product)
x = A_pinv @ b

print("Original matrix A:\n", A)
print("Right-hand side vector b:\n", b)
print("Solution x using pseudoinverse:\n", x)

This code defines a non-square matrix A and a right-hand side vector b, computes the pseudoinverse of A, and uses a matrix-vector product with it to solve the system of linear equations Ax = b.
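
Because A has more columns than rows, the system is underdetermined and the pseudoinverse returns the minimum-norm solution; continuing the snippet above, you can check that it still satisfies Ax = b exactly:

print(torch.allclose(A @ x, b, atol=1e-5))  # Expected: True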

Least squares solution with a singular matrix

import torch

# Singular matrix (rank 1, so one singular value is zero)
A = torch.tensor([[1., 2.], [2., 4.]])
b = torch.tensor([5., 10.])

# Pseudoinverse; the tolerance ensures the numerically zero singular value is discarded
A_pinv = torch.linalg.pinv(A, rcond=1e-6)

# Least squares solution (matrix-vector product)
x = A_pinv @ b

print("Original (singular) matrix A:\n", A)
print("Right-hand side vector b:\n", b)
print("Least squares solution x:\n", x)

Here, we define a singular (rank-1) matrix A and a right-hand side vector b. The explicit rcond makes sure the numerically tiny singular value is treated as exactly zero, so the pseudoinverse yields the minimum-norm least squares solution instead of amplifying round-off noise.
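
Continuing the snippet above: b happens to lie in the column space of A (it is 5 times the first column), so the least squares residual should be essentially zero:

print(torch.allclose(A @ x, b, atol=1e-4))   # Expected: True
print(torch.linalg.norm(A @ x - b))          # Residual norm close to zero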



torch.linalg.lstsq

  • This function is often a better choice for solving least squares problems: it provides a more efficient and numerically stable solution than explicitly computing the pseudoinverse.
  • It returns both the solution x and a residuals vector (populated for over-determined, full-rank systems) that can be useful for analyzing the quality of the fit.
import torch

# Over-determined system: 3 equations, 2 unknowns
A = torch.tensor([[1., 2.], [3., 4.], [5., 6.]])
b = torch.tensor([[5.], [7.], [9.]])

# lstsq returns a named tuple (solution, residuals, rank, singular_values)
result = torch.linalg.lstsq(A, b)

print("Solution x:\n", result.solution)
print("Residuals:\n", result.residuals)  # Populated only for over-determined, full-rank systems

QR decomposition

  • This approach decomposes the matrix A into a QR factorization A = QR.
  • If A has full column rank, the pseudoinverse can be computed as R⁻¹ Qᵀ (i.e. torch.linalg.inv(R) @ Q.t()).
  • This method can be slightly more efficient for certain matrix structures compared to SVD-based approaches.
import torch

A = torch.tensor([[1., 2.], [3., 4.]])
Q, R = torch.linalg.qr(A)  # torch.qr is deprecated in favor of torch.linalg.qr

if torch.linalg.matrix_rank(A) == A.shape[1]:  # Check for full column rank
    A_pinv = torch.linalg.inv(R) @ Q.t()
    print("Pseudoinverse using QR decomposition:\n", A_pinv)
else:
    print("QR decomposition is not suitable here (A does not have full column rank)")

Custom SVD implementation (advanced)

  • In some specific scenarios, implementing your own SVD-based pseudoinverse tailored to your needs might be desirable.
  • This allows for fine-grained control over the decomposition process (for example, truncating small singular values yourself) and can potentially be more efficient for specific matrix structures; see the sketch below.
  • However, this approach is more complex and requires a deeper understanding of linear algebra.
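
As an illustration of the kind of control this gives you, here is a minimal sketch of a rank-truncated pseudoinverse built directly on torch.linalg.svd; the helper truncated_pinv and its cutoff k are illustrative choices for this example, not part of the PyTorch API:

import torch

def truncated_pinv(A: torch.Tensor, k: int) -> torch.Tensor:
    """Pseudoinverse that keeps only the k largest singular values."""
    U, S, Vh = torch.linalg.svd(A, full_matrices=False)
    U_k, S_k, Vh_k = U[:, :k], S[:k], Vh[:k, :]
    # V_k @ diag(1/S_k) @ U_k^T (assumes the kept singular values are non-zero)
    return Vh_k.mH @ torch.diag(1.0 / S_k) @ U_k.mH

A = torch.randn(6, 4)

# Rank-2 truncated pseudoinverse
A_pinv_trunc = truncated_pinv(A, k=2)
print(A_pinv_trunc.shape)  # torch.Size([4, 6])

# With k equal to the full rank, this matches torch.linalg.pinv
print(torch.allclose(truncated_pinv(A, k=4), torch.linalg.pinv(A), atol=1e-4))  # Expected: True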

Choosing an approach

  • torch.linalg.lstsq is generally recommended for least squares problems due to its efficiency and stability.
  • QR decomposition might be considered for matrices with full column rank if computational efficiency is a concern.
  • A custom SVD should be reserved for very specific use cases requiring advanced control.