Variance in Multinomial Distributions: A Look at PyTorch Implementation


Multinomial Distribution and Variance

The multinomial distribution models a scenario with a fixed number of independent trials (total_count), where each trial lands in one of several possible categories with given probabilities (probs). The variance property returns the variance of the count observed in each category across those trials.
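
To make this concrete, here is a minimal sketch (the probabilities are arbitrary) of what a single draw looks like: each draw is a tensor of per-category counts that sum to total_count.

import torch
from torch.distributions import Multinomial

# 10 trials over 3 categories; each draw is a count vector summing to 10
m = Multinomial(10, torch.tensor([0.4, 0.3, 0.3]))
print(m.sample())  # e.g. tensor([4., 3., 3.]) -- counts vary from draw to draw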

Calculation of Variance

The variance for the multinomial distribution is computed using the following formula:

variance = total_count * probs * (1 - probs)

Breakdown of the Formula

  • total_count: The total number of independent trials being performed.
  • probs: A tensor holding the probability of each category. PyTorch normalizes probs along the last dimension, so the values do not need to sum to exactly 1.
  • (1 - probs): The probability that a single trial does not land in the given category.
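
For example, with total_count = 100 and probs = [0.4, 0.3, 0.3], the per-category variances are 100 * 0.4 * 0.6 = 24, 100 * 0.3 * 0.7 = 21, and 100 * 0.3 * 0.7 = 21.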

Interpretation of Variance

The variance tells you how much the number of successes in each category deviates from the expected value (mean). A higher variance indicates a wider spread of possible outcomes, meaning the number of successes in each category can vary more significantly across trials. Conversely, a lower variance signifies that the number of successes is more consistent across trials.

Accessing Variance in PyTorch

The variance property is directly accessible from an instance of the Multinomial class:

import torch
from torch.distributions import Multinomial

total_count = 100
probs = torch.tensor([0.4, 0.3, 0.3])  # Probabilities for 3 categories

m = Multinomial(total_count, probs)
variance = m.variance

print(variance)  # Output: tensor([24., 21., 21.])

This code creates a multinomial distribution with 100 trials and probabilities for 3 categories. It then calculates the variance using the variance property. The output shows the variance for each category, indicating how much the number of successes might deviate from the expected value (40, 30, 30) across trials.

  • The variance property summarizes the spread of outcomes in each category of a multinomial distribution.
  • Higher variance means the per-category counts fluctuate more from draw to draw; lower variance means they stay closer to the expected counts.
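
One way to put the spread in context is to compare each category's standard deviation to its expected count. This is a small sketch built only on the mean and variance properties:

import torch
from torch.distributions import Multinomial

m = Multinomial(100, torch.tensor([0.4, 0.3, 0.3]))

std = m.variance.sqrt()    # per-category standard deviation
rel_spread = std / m.mean  # relative spread (coefficient of variation)

print(std)         # ~tensor([4.8990, 4.5826, 4.5826])
print(rel_spread)  # ~tensor([0.1225, 0.1528, 0.1528])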


Example 1: Comparing Variance with Different Probabilities

This code shows how the variance changes with different probability distributions for 3 categories:

import torch
from torch.distributions import Multinomial

total_count = 100

# Uniform probabilities
probs1 = torch.ones(3) / 3

# Unequal probabilities
probs2 = torch.tensor([0.7, 0.2, 0.1])

m1 = Multinomial(total_count, probs1)
m2 = Multinomial(total_count, probs2)

variance1 = m1.variance
variance2 = m2.variance

print("Variance with uniform probabilities:", variance1)
print("Variance with unequal probabilities:", variance2)

This code outputs something like:

Variance with uniform probabilities: tensor([22.2222, 22.2222, 22.2222])
Variance with unequal probabilities: tensor([21.0000, 16.0000,  9.0000])

With uniform probabilities, every category has the same variance: 100 * (1/3) * (2/3) ≈ 22.22. With unequal probabilities, the per-category variance total_count * p * (1 - p) is largest for probabilities closest to 0.5: here the most likely category (p = 0.7) has the highest variance (21), while the least likely (p = 0.1) has the lowest (9).
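
The p * (1 - p) factor explains this pattern: it peaks at p = 0.5 and shrinks toward 0 as p approaches either 0 or 1. A quick sketch:

import torch

total_count = 100
p = torch.linspace(0.05, 0.95, 19)
var = total_count * p * (1 - p)

print(p[var.argmax()])  # tensor(0.5000) -- variance is largest at p = 0.5
print(var.min())        # smallest at the endpoints (p = 0.05 and p = 0.95)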

Example 2: Using Variance for Decision Making

This example illustrates how variance can be used to guide decision-making in a multinomial setting:

import torch
from torch.distributions import Multinomial

total_count = 1000
probs = torch.tensor([0.8, 0.15, 0.05])  # Category 1 most likely

m = Multinomial(total_count, probs)
variance = m.variance

# If low variance is preferred for stability
if variance.max() < 50:
    print("Proceed with multinomial distribution due to low variance")
else:
    print("Consider alternative approach due to high variance")

This code checks whether the maximum variance across categories is below a chosen threshold (indicating a more stable distribution). If so, it suggests proceeding with the multinomial distribution; otherwise it suggests exploring other options. With these parameters the per-category variances are 160, 127.5, and 47.5, so the maximum (160) exceeds the threshold and the second message is printed.
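
One caveat worth noting: because variance grows linearly with total_count, a fixed threshold such as 50 is only meaningful at a particular scale. A small sketch of how the maximum per-category variance scales (the trial counts here are arbitrary):

import torch
from torch.distributions import Multinomial

probs = torch.tensor([0.8, 0.15, 0.05])

for n in (100, 1000, 10000):
    m = Multinomial(n, probs)
    print(n, m.variance.max().item())  # roughly 16, 160, 1600 -- linear in n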



Manual Calculation

If you need more control over the variance calculation or want to customize it for specific use cases, you can compute it manually using the formula:

import torch

total_count = torch.tensor(100)
probs = torch.tensor([0.4, 0.3, 0.3])

variance = total_count * probs * (1 - probs)

print(variance)

This approach gives you complete control over the formula and allows for potential modifications.
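
As a sanity check, the manual result should match the built-in property (same total_count and probs as above):

import torch
from torch.distributions import Multinomial

total_count = 100
probs = torch.tensor([0.4, 0.3, 0.3])

manual = total_count * probs * (1 - probs)
builtin = Multinomial(total_count, probs).variance

print(torch.allclose(manual, builtin))  # True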

Variance of Sampled Data

If you already have samples drawn from the multinomial distribution and want to estimate the variance empirically, you can apply torch.var or torch.std directly, since sample() already returns a tensor:

import torch
from torch.distributions import Multinomial

total_count = 100
probs = torch.tensor([0.4, 0.3, 0.3])

m = Multinomial(total_count, probs)
samples = m.sample((1000,))  # Draw 1000 samples; shape (1000, 3)

variance = torch.var(samples, dim=0)  # Sample variance of each category's count
std = torch.std(samples, dim=0)  # Sample standard deviation per category

print(variance)
print(std)

This approach provides an estimate of the variance based on the actual samples, which can be useful for understanding real-world behavior.
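
With enough draws, the empirical estimate should settle close to the analytic values (24, 21, 21 for these parameters). A sketch of the comparison (the seed and sample size are arbitrary):

import torch
from torch.distributions import Multinomial

torch.manual_seed(0)  # arbitrary seed, for reproducibility

m = Multinomial(100, torch.tensor([0.4, 0.3, 0.3]))
samples = m.sample((100000,))

print(torch.var(samples, dim=0))  # ~tensor([24., 21., 21.])
print(m.variance)                 # exact: tensor([24., 21., 21.])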

Alternative Distributions

If the multinomial distribution doesn't perfectly match your needs and variance is a critical aspect, consider exploring other distributions in PyTorch's distributions module that might better suit your scenario. Some potential alternatives include:

  • Beta Distribution
    Suitable for modeling a single probability on the interval (0, 1); its two concentration parameters can be tuned to control the variance of the outcomes.
  • Dirichlet Distribution
    Models probability vectors over categories (a multivariate generalization of Beta); its concentration parameters control the variance of the resulting samples.

The choice of the best alternative depends on your specific problem and the desired properties of the distribution.
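
As an example of the Dirichlet option above, here is a minimal sketch (the concentration values are arbitrary); larger concentrations pull the sampled probability vectors closer to the mean, lowering the variance:

import torch
from torch.distributions import Dirichlet

d = Dirichlet(torch.tensor([5.0, 3.0, 2.0]))  # arbitrary concentration values

print(d.mean)      # tensor([0.5000, 0.3000, 0.2000])
print(d.variance)  # per-component variance; shrinks as concentrations grow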