Unlocking Numerical Stability: Exploring Logits and torch.Tensor.logit
Understanding Logits in PyTorch
- **Tensor.logit(eps=None) -> Tensor:** This is a method on a torch.Tensor object in PyTorch (also available as torch.logit and torch.special.logit). It converts a tensor of probabilities in the range 0 to 1 into their corresponding log-odds (logits) representation; the optional eps argument clamps the input to [eps, 1 - eps] before the transformation.
Why Use Logits?
- Logits are more numerically stable than probabilities for many calculations, especially when values sit very close to 0 or 1. The log-odds transformation maps the interval (0, 1) onto the entire real line, so probabilities that would underflow to 0 or round to exactly 1 in floating point remain representable as large negative or positive logits (see the example below).
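A minimal sketch of the problem this avoids (the value 40.0 is just an illustrative choice):
import torch
import torch.nn.functional as F
# A large positive logit; the true probability is 1 minus roughly 4e-18
logit = torch.tensor(40.0)
# In float32 the sigmoid saturates to exactly 1.0, losing that information
p = torch.sigmoid(logit)
print(p)                 # Output: tensor(1.)
print(torch.log(1 - p))  # Output: tensor(-inf) -- the probability form has already saturated
# Working directly with the logit keeps the computation finite
loss = F.binary_cross_entropy_with_logits(logit, torch.tensor(0.0))
print(loss)              # Output: tensor(40.) -- a finite, meaningful loss value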
How Does torch.Tensor.logit Work?
- It applies the logit function element-wise: logit(p) = ln(p / (1 - p)), i.e. the natural logarithm of the odds. This is the inverse of the sigmoid function: it maps probabilities in (0, 1) onto the full real line (with 0.5 mapping to 0), which is what improves numerical stability.
Example
import torch
# Sample probability tensor
probs = torch.tensor([0.1, 0.7, 0.2])
# Calculate logits
logits = probs.logit() # Equivalent to torch.logit(probs)
print(logits) # Output: tensor([-2.1972, 0.8473, -1.3863])
Key Points
- The output logits has the same data type as probs, which should be a floating-point tensor (typically float) with values between 0 and 1.
- PyTorch provides loss functions that consume logits directly and handle the conversion to probabilities internally (e.g., torch.nn.BCEWithLogitsLoss), so you often do not need explicit probabilities during training.
- Because torch.Tensor.logit is the inverse of the sigmoid, torch.sigmoid (or torch.nn.functional.sigmoid) converts logits back into probabilities for interpretation:
import torch
probabilities = torch.sigmoid(logits)
print(probabilities) # Output: tensor([0.1000, 0.7000, 0.2000])
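As a quick check (with arbitrarily chosen values), the fused loss applied to logits matches applying sigmoid first and then torch.nn.BCELoss:
import torch
import torch.nn as nn
logits = torch.tensor([2.0, -1.0, 0.5])
targets = torch.tensor([1.0, 0.0, 1.0])
# Fused loss: takes logits directly and is more numerically stable
loss_fused = nn.BCEWithLogitsLoss()(logits, targets)
# Two-step equivalent: convert to probabilities, then apply BCELoss
loss_two_step = nn.BCELoss()(torch.sigmoid(logits), targets)
print(loss_fused, loss_two_step)  # Both print tensor(0.3048)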
Logistic Regression with Logits
This example shows how logits (the raw, unnormalized outputs of a linear layer) are used in a simple logistic regression model:
import torch
import torch.nn as nn
# Define logistic regression model
class LogisticRegression(nn.Module):
    def __init__(self, input_size, output_size):
        super(LogisticRegression, self).__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        logits = self.linear(x)
        return logits
# Sample data and labels
x = torch.randn(10, 5) # 10 data points, each with 5 features
y = torch.randint(0, 2, (10,)) # Binary labels (0 or 1)
# Create model, loss function, and optimizer
model = LogisticRegression(5, 1)
criterion = nn.BCEWithLogitsLoss() # Binary cross-entropy loss that works directly on logits
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# Training loop (simplified)
for epoch in range(10):
    # Forward pass
    logits = model(x)
    # Calculate loss (no need for sigmoid here)
    loss = criterion(logits, y.float().unsqueeze(1))  # Targets reshaped to match the (10, 1) logits
    # Backward pass and update weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
# After training, predict probabilities using sigmoid
predictions = torch.sigmoid(model(x))
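To turn these probabilities into hard 0/1 labels, you can threshold them (0.5 is just the conventional default, not something the model dictates):
# Threshold the predicted probabilities at 0.5 to obtain binary class labels
predicted_labels = (predictions > 0.5).float()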
Custom Loss Function with Logits
This example shows how you can define a custom loss function that takes logits as input:
import torch
def custom_loss(logits, y):
    # Implement your custom loss function here
    # (e.g., weighted binary cross-entropy)
    probs = torch.sigmoid(logits)
    # Element-wise binary cross-entropy, averaged over the batch
    loss = - (y * torch.log(probs) + (1 - y) * torch.log(1 - probs))
    return loss.mean()
# Usage example:
logits = torch.tensor([0.5, 1.0, -2.0])
y = torch.tensor([1, 0, 1])
loss_value = custom_loss(logits, y)
Here, the custom loss function converts the logits to probabilities with sigmoid and then applies the binary cross-entropy formula. Note that for logits of large magnitude this two-step version can still overflow; fusing the sigmoid and the logarithm, as torch.nn.functional.binary_cross_entropy_with_logits does, is the numerically stable way to compute the same quantity.
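As a sketch of that more stable route (the helper name and the pos_weight value are illustrative choices, not part of the original example):
import torch
import torch.nn.functional as F
def stable_weighted_bce(logits, y, pos_weight):
    # Fused sigmoid + log keeps the loss finite even for extreme logits
    return F.binary_cross_entropy_with_logits(logits, y.float(), pos_weight=pos_weight)
logits = torch.tensor([0.5, 1.0, -2.0])
y = torch.tensor([1, 0, 1])
# Weight positive examples twice as heavily as negative ones (illustrative choice)
loss_value = stable_weighted_bce(logits, y, pos_weight=torch.tensor(2.0))
print(loss_value)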
Manual Calculation
You can manually implement the logit transformation using the following formula:
import torch
def logits_from_probs(probs):
    """Converts probabilities to logits."""
    return torch.log(probs / (1 - probs))  # Equivalent to ln(probs) - ln(1 - probs)
# Example usage
probs = torch.tensor([0.3, 0.7])
logits = logits_from_probs(probs)
print(logits) # Output: tensor([-0.8473, 0.8473])
This approach offers fine-grained control over the calculation but might be less concise.
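One place that control matters is at the boundaries 0 and 1, where the log-odds are infinite. torch.logit exposes an eps argument that clamps the input first; a manual version can do the same (the value 1e-6 below is just an illustrative choice):
import torch
probs = torch.tensor([0.0, 0.5, 1.0])
print(torch.logit(probs))            # Output: tensor([-inf, 0., inf])
print(torch.logit(probs, eps=1e-6))  # Inputs clamped to [1e-6, 1 - 1e-6] first
# Manual equivalent of the clamped version
clamped = probs.clamp(1e-6, 1 - 1e-6)
print(torch.log(clamped / (1 - clamped)))  # Output: tensor([-13.8155, 0.0000, 13.8155])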
torch.log and Division
While not technically an alternative method, you can achieve the same result by combining torch.log and division:
import torch
probs = torch.tensor([0.2, 0.8])
logits = torch.log(probs / (1 - probs)) # Equivalent to probs.logit()
# Alternative with separate calculations
odds = probs / (1 - probs)
logits_alt = torch.log(odds)
print(logits) # Output: tensor([-1.3863, 1.3863])
print(logits_alt) # Output: tensor([-1.3863, 1.3863]) (same result)
This approach uses separate steps but produces the same outcome as probs.logit().
- Customization: If you need to modify the log-odds transformation (e.g., using a different base for the logarithm, as in the sketch below), manual calculation provides more control.
- Readability: For clarity and consistency with PyTorch coding style, using probs.logit() is generally preferred.
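For instance, a minimal sketch of log-odds in base 2 (a variant sometimes used when reasoning in bits; the choice of base here is purely illustrative):
import torch
probs = torch.tensor([0.2, 0.8])
# Log-odds in base 2 instead of the natural base e
log2_odds = torch.log2(probs / (1 - probs))
print(log2_odds)  # Output: tensor([-2., 2.])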