Unlocking Numerical Stability: Exploring Logits and torch.Tensor.logit


Understanding Logits in PyTorch

  • **Tensor.logit(eps=None) -> Tensor:** This is a method of a torch.Tensor object in PyTorch (also available as the free functions torch.logit and torch.special.logit, as the short sketch below shows). It converts a tensor of raw numerical values in the range 0 to 1 (typically probabilities) into their corresponding log-odds (logits) representation.
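A short sketch of the three equivalent spellings (the sample values are arbitrary):

import torch

p = torch.tensor([0.25, 0.50, 0.75])

# The same transformation is available as a method and as free functions.
print(p.logit())               # Output: tensor([-1.0986,  0.0000,  1.0986])
print(torch.logit(p))          # same values
print(torch.special.logit(p))  # same values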

Why Use Logits?

  • Logits are more numerically stable to compute with than raw probabilities, especially when values lie very close to 0 or 1. The logit transformation maps the interval (0, 1) onto the entire real line, so probabilities that would round to exactly 0 or 1 in floating point become large-magnitude finite numbers instead, and downstream computations avoid underflow and the resulting -inf or nan values. The sketch below illustrates the effect.
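As a concrete illustration (a minimal sketch; the input values are arbitrary), compare taking the log of a saturated sigmoid with torch.nn.functional.logsigmoid, which works in logit space throughout:

import torch
import torch.nn.functional as F

x = torch.tensor([-120.0, 0.0, 120.0])

# Naive route: sigmoid(-120) underflows to exactly 0 in float32,
# so the subsequent log produces -inf.
naive = torch.log(torch.sigmoid(x))
print(naive)   # Output: tensor([-inf, -0.6931, 0.0000])

# Stable route: log(sigmoid(x)) computed without materializing the
# saturated probability, so the result stays finite.
stable = F.logsigmoid(x)
print(stable)  # Output: tensor([-120.0000,  -0.6931,   0.0000])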

How Does torch.Tensor.logit Work?

  • It computes the log-odds of each element: logit(p) = ln(p / (1 - p)), applied element-wise. This is the inverse of the sigmoid function and maps the open interval (0, 1) onto the whole real line. An optional eps argument clamps the input to [eps, 1 - eps] before the transformation, which keeps the output finite at the boundaries, as the sketch below shows.
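A quick sketch of the boundary behavior and the eps argument (values chosen for illustration):

import torch

p = torch.tensor([0.0, 0.5, 1.0])

# At the boundaries the log-odds diverge.
print(p.logit())  # Output: tensor([-inf, 0., inf])

# eps clamps the input to [eps, 1 - eps] first, keeping the output finite.
print(p.logit(eps=1e-6))  # Output: tensor([-13.8155,   0.0000,  13.8155])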

Example

import torch

# Sample probability tensor
probs = torch.tensor([0.1, 0.7, 0.2])

# Calculate logits: logit(p) = ln(p / (1 - p))
logits = probs.logit()  # equivalent to torch.logit(probs)
print(logits)  # Output: tensor([-2.1972,  0.8473, -1.3863])

Key Points

  • The output tensor has the same shape and data type as the input.
  • probs should be a floating-point tensor (typically float32 or float64); values outside [0, 1] produce nan unless eps clamping is used.
  • PyTorch provides loss functions that consume logits directly and handle the conversion to probabilities internally (e.g., torch.nn.BCEWithLogitsLoss); see the sketch after this list.

  • While torch.Tensor.logit is useful on its own, it is often paired with torch.sigmoid (its inverse) to convert logits back into probabilities for interpretation:

    import torch
    
    probabilities = torch.sigmoid(logits)
    print(probabilities)  # Output: tensor([0.1000, 0.7000, 0.2000])
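To make the last point concrete, here is a minimal sketch (with arbitrary values) showing that torch.nn.BCEWithLogitsLoss applied to raw logits matches converting to probabilities with sigmoid and applying torch.nn.BCELoss:

import torch
import torch.nn as nn

logits = torch.tensor([-2.1972, 0.8473, -1.3863])
targets = torch.tensor([0.0, 1.0, 0.0])

# Consumes raw logits and applies a numerically stable log-sigmoid internally.
with_logits = nn.BCEWithLogitsLoss()(logits, targets)

# The same quantity computed by converting to probabilities first.
plain = nn.BCELoss()(torch.sigmoid(logits), targets)

print(with_logits, plain)  # the two values agree up to float precision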
    


Logistic Regression with Logits

This example shows how logits are used in a simple logistic regression model:

import torch
import torch.nn as nn

# Define logistic regression model
class LogisticRegression(nn.Module):
    def __init__(self, input_size, output_size):
        super(LogisticRegression, self).__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        logits = self.linear(x)
        return logits

# Sample data and labels
x = torch.randn(10, 5)  # 10 data points, each with 5 features
y = torch.randint(0, 2, (10,))  # Binary labels (0 or 1)

# Create model, loss function, and optimizer
model = LogisticRegression(5, 1)
criterion = nn.BCEWithLogitsLoss()  # Binary cross-entropy that accepts raw logits
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Training loop (simplified)
for epoch in range(10):
    # Forward pass
    logits = model(x)
    # Calculate loss (no sigmoid needed; the loss applies it internally)
    loss = criterion(logits.squeeze(1), y.float())

    # Backward pass and update weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# After training, predict probabilities using sigmoid
predictions = torch.sigmoid(model(x))
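Since sigmoid is monotonic and sigmoid(0) = 0.5, hard class predictions can also be made directly in logit space. Continuing the example above (a small sketch):

# A probability above 0.5 corresponds to a logit above 0, so thresholding
# the raw logits yields the same labels without computing sigmoid at all.
predicted_labels = (model(x) > 0).long().squeeze(1)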

Custom Loss Function with Logits

This example shows how you can define a custom loss function that operates on logits:

import torch

def custom_loss(logits, y):
    # Hand-written binary cross-entropy; a custom variant
    # (e.g., weighted BCE) would scale the two terms differently
    probs = torch.sigmoid(logits)
    loss = -(y * torch.log(probs) + (1 - y) * torch.log(1 - probs))
    return loss.mean()

# Usage example:
logits = torch.tensor([0.5, 1.0, -2.0])
y = torch.tensor([1.0, 0.0, 1.0])  # float targets for the arithmetic above
loss_value = custom_loss(logits, y)

Here, the custom loss converts logits to probabilities with sigmoid and then takes logarithms of those probabilities. Note that this naive formulation can underflow when logits have large magnitude (sigmoid saturates to exactly 0 or 1 in floating point, making the log -inf); a numerically stable variant works in logit space directly, as sketched below.
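A numerically stable rewrite of the same loss (a sketch, using torch.nn.functional.logsigmoid and the identity log(1 - sigmoid(x)) = logsigmoid(-x)):

import torch
import torch.nn.functional as F

def stable_custom_loss(logits, y):
    # logsigmoid computes log(sigmoid(x)) without forming sigmoid(x),
    # so it never produces -inf for large-magnitude logits.
    loss = -(y * F.logsigmoid(logits) + (1 - y) * F.logsigmoid(-logits))
    return loss.mean()

logits = torch.tensor([0.5, 1.0, -2.0])
y = torch.tensor([1.0, 0.0, 1.0])
print(stable_custom_loss(logits, y))  # matches custom_loss on these values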



Manual Calculation

You can manually implement the logit transformation using its defining formula, logit(p) = ln(p / (1 - p)):

import torch

def logits_from_probs(probs):
  """Converts probabilities to logits: logit(p) = ln(p / (1 - p))."""
  return torch.log(probs / (1 - probs))

# Example usage
probs = torch.tensor([0.3, 0.7])
logits = logits_from_probs(probs)
print(logits)  # Output: tensor([-0.8473,  0.8473])

This approach offers fine-grained control over the calculation but might be less concise.
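As a quick sanity check (a sketch; torch.allclose uses its default tolerance), the manual formula agrees with the built-in method:

import torch

probs = torch.tensor([0.3, 0.7])
manual = torch.log(probs / (1 - probs))
assert torch.allclose(manual, probs.logit())  # same values up to float precision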

torch.log and Division

While not technically a separate method, you can achieve the same result by combining torch.log and division (or, equivalently, a difference of logs):

import torch

probs = torch.tensor([0.2, 0.8])
logits = torch.log(probs / (1 - probs))  # Equivalent to probs.logit()

# Alternative: the log of a ratio as a difference of logs
logits_alt = torch.log(probs) - torch.log(1 - probs)
print(logits)      # Output: tensor([-1.3863,  1.3863])
print(logits_alt)  # Output: tensor([-1.3863,  1.3863]) (same result)

This approach uses separate steps but produces the same outcome as probs.logit().

  • Customization
    If you need to modify the log-odds transformation (e.g., using a different base for the logarithm, or custom clamping near the boundaries), manual calculation provides more control; see the sketch after this list.
  • Readability
    For clarity and consistency with PyTorch coding style, using probs.logit() is generally preferred.
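For instance, here is a sketch of a base-2 log-odds variant (logit_base2 is a hypothetical helper name, not a PyTorch function):

import torch

def logit_base2(probs):
    """Log-odds in base 2 instead of the natural base (hypothetical helper)."""
    return torch.log2(probs / (1 - probs))

print(logit_base2(torch.tensor([0.2, 0.8])))  # Output: tensor([-2., 2.])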