Delving into LKJCholesky.sample(): Probability Distributions and Correlation Matrices in PyTorch


Functionality

  • The LKJ distribution is particularly useful for generating correlation matrices, which are essential in various statistical applications, including modeling relationships between variables.
  • This function samples a lower Cholesky factor of a correlation matrix from the LKJ (Lewandowski, Kurowicka, and Joe) distribution.

Parameters

  • concentration (float or torch.Tensor): The concentration parameter (eta) that controls the shape of the distribution.
    • Higher values of concentration lead to correlation matrices with stronger dependencies between variables (more clustered towards 1 or -1).
    • A value of concentration=1 corresponds to a uniform distribution over Cholesky factors of correlation matrices.
  • dim (int): The dimension of the square correlation matrices to be sampled.

Return Value

  • A torch.Tensor representing a random lower Cholesky factor of a correlation matrix.
    • The shape of the returned tensor is (sample_shape[0], dim, dim), where sample_shape is an optional argument that determines the number of independent samples to draw.

Implementation Details

  • The sampling process typically employs the "onion method" described in the reference paper by Lewandowski et al. [1]. This method iteratively constructs the Cholesky factor by generating elements from specific probability distributions.
  • The LKJCholesky class inherits from torch.distributions.Distribution and implements the sample() method.

Key Points

  • PyTorch's LKJCholesky implementation closely follows the one found in NumPyro [2].
  • LKJCholesky.sample() directly samples Cholesky factors, not correlation matrices themselves. This distinction is important because Cholesky factors ensure positive definiteness, a crucial property for correlation matrices.

Example Usage

import torch
from torch.distributions import lkj_cholesky

# Set dimension and concentration parameter
dim = 3
concentration = 0.5

# Create an LKJCholesky distribution
lkj = lkj_cholesky.LKJCholesky(dim, concentration)

# Sample a lower Cholesky factor
sample = lkj.sample()

# Calculate the corresponding correlation matrix (optional)
correlation_matrix = torch.matmul(sample, sample.transpose(-1, -2))
print(correlation_matrix)

This code will generate a random lower Cholesky factor of a 3x3 correlation matrix with a concentration parameter of 0.5. You can further compute the correlation matrix itself if needed.



Sampling Multiple Correlation Matrices

import torch
from torch.distributions import lkj_cholesky

# Set dimension and concentration parameter
dim = 5
concentration = 2.0

# Number of samples to draw
num_samples = 3

# Create an LKJCholesky distribution
lkj = lkj_cholesky.LKJCholesky(dim, concentration)

# Sample multiple lower Cholesky factors (batch of samples)
samples = lkj.sample((num_samples,))  # Specify sample_shape

# Print the shapes of the sampled matrices
print(samples.shape)

This code modifies the previous example to draw a batch of num_samples independent lower Cholesky factors. The resulting samples tensor will have a shape of (num_samples, dim, dim).

Using a Different Concentration Parameter

import torch
from torch.distributions import lkj_cholesky

# Set dimension
dim = 4

# Sample with two different concentration parameters
concentration1 = 0.1
concentration2 = 5.0

# Create LKJCholesky distributions
lkj1 = lkj_cholesky.LKJCholesky(dim, concentration1)
lkj2 = lkj_cholesky.LKJCholesky(dim, concentration2)

# Sample lower Cholesky factors
sample1 = lkj1.sample()
sample2 = lkj2.sample()

# Print the samples to observe the impact of concentration
print(sample1)
print(sample2)

This code showcases how the concentration parameter (eta) affects the sampled Cholesky factors. With a lower value (eta=0.1), the elements are closer to zero, indicating weaker dependencies. A higher value (eta=5.0) tends to produce elements closer to 1 or -1, suggesting stronger correlations.

Combining with Other PyTorch Operations

import torch
from torch.distributions import lkj_cholesky

# Set dimension and concentration parameter
dim = 2
concentration = 1.0

# Create an LKJCholesky distribution
lkj = lkj_cholesky.LKJCholesky(dim, concentration)

# Sample a lower Cholesky factor
sample = lkj.sample()

# Apply element-wise operations (e.g., scaling)
scaled_sample = sample * 2.0  # Scale each element by 2

# Print the original and scaled samples
print(sample)
print(scaled_scale)

This example demonstrates how you can combine LKJCholesky.sample() with other PyTorch operations. Here, the sampled Cholesky factor is scaled element-wise by 2. You can adapt this approach for various transformations as needed.



Eigenvalue Decomposition with Random Eigenvectors

  • This approach involves the following steps:
    1. Generate random orthogonal matrices using techniques like random rotations or QR decomposition. You can leverage libraries like NumPy or SciPy for these tasks.
    2. Create a diagonal matrix with random eigenvalues on the diagonal (positive for positive definite correlation matrices).
    3. Combine the random orthogonal matrix and the diagonal matrix of eigenvalues using matrix multiplication to obtain the Cholesky factor.

Code Example (using NumPy)

import numpy as np

def random_correlation_matrix(dim):
  # Generate random orthogonal matrix (e.g., using QR decomposition)
  Q, _ = np.linalg.qr(np.random.rand(dim, dim))

  # Create diagonal matrix with random positive eigenvalues
  eigenvalues = np.random.rand(dim) + 1  # Ensure positive values
  diag_eigenvalues = np.diag(eigenvalues)

  # Combine for Cholesky factor
  L = np.matmul(Q, np.sqrt(diag_eigenvalues))
  return L

Wishart Distribution

  • The Wishart distribution is another option for generating positive definite matrices. PyTorch offers the torch.distributions.wishart.Wishart class for this purpose. However, it requires specifying the degrees of freedom and a scale matrix, which might not be as intuitive for generating correlation matrices directly.

Code Example (using PyTorch)

import torch
from torch.distributions import wishart

# Set dimension and degrees of freedom
dim = 3
df = 5

# Define a random scale matrix (adjust as needed)
scale_matrix = torch.eye(dim)

# Create Wishart distribution
wishart_dist = wishart.Wishart(df, scale_matrix)

# Sample a positive definite matrix
sample = wishart_dist.sample()  # Shape: (..., dim, dim)

# Normalize to get correlation matrix (optional)
correlation_matrix = sample / torch.trace(sample, dim1=1, dim2=2)  # Normalize by trace
print(correlation_matrix)
  • The choice also depends on your familiarity with libraries like NumPy and the specific requirements of your application.
  • For more flexibility in generating random orthogonal matrices or using a Wishart distribution, consider the first and second methods, respectively.
  • If you need a straightforward way to generate correlation matrices and control the strength of dependencies through the concentration parameter, LKJCholesky.sample() is a good choice.