Negative Binomial Probabilities on a Log Scale: How to Use PyTorch NegativeBinomial.log_prob()

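torch.distributions.negative_binomial.NegativeBinomial.log_prob() is a method of the negative binomial distribution class in PyTorch's probability distributions module. It computes the logarithm of the probability mass function. In other words, it returns, on a log scale, the probability that the distribution assigns to a given value.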

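This method takes a single argument:

value: The value or values at which to evaluate the log probability, passed as a tensor.

validate_args is not an argument of log_prob(). It is passed to the NegativeBinomial constructor and controls whether parameters and values are checked against their valid ranges. It defaults to None, in which case the library-wide default is used.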
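
As a minimal sketch of how validation interacts with log_prob(), the snippet below builds two distributions that differ only in the validate_args flag given to the constructor. The variable names are illustrative, and the exact error message may differ between PyTorch versions.

```python
import torch
from torch.distributions import NegativeBinomial

# validate_args is passed to the constructor, not to log_prob()
checked = NegativeBinomial(total_count=10, probs=0.2, validate_args=True)
unchecked = NegativeBinomial(total_count=10, probs=0.2, validate_args=False)

valid_value = torch.tensor(3.0)
invalid_value = torch.tensor(-1.0)  # outside the support (negative count)

# Both distributions agree on a valid value
print(checked.log_prob(valid_value), unchecked.log_prob(valid_value))

# With validation enabled, a value outside the support raises ValueError
try:
    checked.log_prob(invalid_value)
    raised = False
except ValueError as err:
    raised = True
    print("rejected:", err)

# With validation disabled, no check runs and the result is not meaningful
print(unchecked.log_prob(invalid_value))
```

Keeping validation on while developing tends to surface invalid inputs early; turning it off is mainly a speed optimization once the inputs are known to be valid.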

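What Is the Negative Binomial Distribution?

The negative binomial distribution describes the number of successes observed in a sequence of repeated trials before a specified number of failures occurs. In other words, it gives the probability of seeing k successes in independent Bernoulli trials before the r-th failure.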
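
To make the definition concrete, here is a small sketch that draws samples and compares their average with the theoretical mean, total_count * probs / (1 - probs). The seed and the sample size are arbitrary choices for illustration.

```python
import torch
from torch.distributions import NegativeBinomial

torch.manual_seed(0)  # make the sketch reproducible

dist = NegativeBinomial(total_count=10, probs=0.2)

# Each sample is a count of successes observed before the 10th failure
samples = dist.sample((100_000,))

# Theoretical mean: total_count * probs / (1 - probs) = 10 * 0.2 / 0.8 = 2.5
empirical_mean = samples.mean().item()
theoretical_mean = dist.mean.item()
print(empirical_mean, theoretical_mean)
```

The two numbers should be close but not identical, since the first is a sampling estimate.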

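Details of log_prob()

This method computes the log of the negative binomial probability mass function. The formula is:

log_prob(value) = log( (total_count + value - 1)! / (value! * (total_count - 1)!) ) + (total_count * log(1 - probs)) + (value * log(probs))

where

value: the number of successes at which to evaluate the probability
probs: the probability of success in each trial
total_count: the number of failures at which the trials stop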
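
The formula can be worked through by hand for a single value. The sketch below does this for value = 5 with Python's math module and compares the result with log_prob(). It assumes PyTorch's convention that probs is the success probability and total_count is the number of failures.

```python
import math
import torch
from torch.distributions import NegativeBinomial

total_count, probs, value = 10, 0.2, 5

# Binomial coefficient (total_count + value - 1)! / (value! * (total_count - 1)!)
coefficient = math.comb(total_count + value - 1, value)  # C(14, 5) = 2002

# total_count failures, each with probability (1 - probs);
# value successes, each with probability probs
by_hand = (
    math.log(coefficient)
    + total_count * math.log(1 - probs)
    + value * math.log(probs)
)

dist = NegativeBinomial(total_count=total_count, probs=probs)
built_in = dist.log_prob(torch.tensor(float(value))).item()

# The two numbers should agree up to float32 rounding
print(by_hand, built_in)
```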

Example

The following code computes the negative binomial probability mass function on a log scale.

import torch
from torch.distributions import NegativeBinomial

# Set the parameters
total_count = 10
probs = 0.2

# Create the negative binomial distribution
dist = NegativeBinomial(total_count=total_count, probs=probs)

# Define the range of values
values = torch.arange(0, 11)

# Compute the log probabilities with log_prob()
log_probs = dist.log_prob(values)

# Print the results
print(log_probs)

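Running this code prints a tensor of 11 log probabilities. Because these are logarithms of probabilities, every entry is less than or equal to 0. The first entry is log_prob(0) = 10 * log(0.8), approximately -2.23. The values rise to a maximum of about -1.44 at value = 2 and decrease after that.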

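probs must lie in the half-open interval [0, 1).
value must be a non-negative integer. It is not bounded by total_count: the number of successes before the total_count-th failure can be arbitrarily large.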
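
The constraints above can be checked numerically. The following sketch evaluates log_prob() well beyond total_count and confirms that the probabilities add up to roughly 1. The cutoff of 200 is an arbitrary choice that is large enough for these parameters.

```python
import torch
from torch.distributions import NegativeBinomial

dist = NegativeBinomial(total_count=10, probs=0.2)

# The support is every non-negative integer, so values above total_count are valid
values = torch.arange(0, 200, dtype=torch.float32)
log_probs = dist.log_prob(values)

# Log probabilities of a discrete distribution are never positive
assert (log_probs <= 0).all()

# Probabilities over a wide enough range sum to (almost) 1
total_mass = torch.exp(log_probs).sum().item()
print(total_mass)

# The log probability of a value larger than total_count is finite
beyond = dist.log_prob(torch.tensor(25.0))
print(beyond)
```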

Calculating log probability for multiple values

import torch
from torch.distributions import NegativeBinomial

# Parameters
total_count = 10
probs = 0.2

# Create NegativeBinomial distribution
dist = NegativeBinomial(total_count=total_count, probs=probs)

# Define a range of values
values = torch.arange(0, 20)

# Calculate log probabilities for all values
log_probs = dist.log_prob(values)

# Print the results
print(log_probs)

Calculating log probability for specific values

import torch
from torch.distributions import NegativeBinomial

# Parameters
total_count = 10
probs = 0.2

# Create NegativeBinomial distribution
dist = NegativeBinomial(total_count=total_count, probs=probs)

# Specific values to calculate log probabilities for
specific_values = torch.tensor([5, 8, 12])

# Calculate log probabilities for specific values
log_probs = dist.log_prob(specific_values)

# Print the results
print(log_probs)

Visualizing the probabilities

import torch
import matplotlib.pyplot as plt
from torch.distributions import NegativeBinomial

# Parameters
total_count = 10
probs = 0.2

# Create NegativeBinomial distribution
dist = NegativeBinomial(total_count=total_count, probs=probs)

# Define a range of values
values = torch.arange(0, 20)

# Calculate log probabilities for all values
log_probs = dist.log_prob(values)

# Convert log probabilities to probabilities
probabilities = torch.exp(log_probs)

# Create a bar plot (convert tensors to NumPy arrays for matplotlib)
plt.bar(values.numpy(), probabilities.numpy())
plt.xlabel('Number of successes')
plt.ylabel('Probability')
plt.title('Probability Mass Function of the Negative Binomial Distribution')
plt.show()

Code 1
Calculates log probabilities for every value from 0 to 19.
Code 2
Calculates log probabilities only for the values listed in the specific_values tensor (5, 8, and 12).
Code 3
Calculates log probabilities for values from 0 to 19, converts them to probabilities with torch.exp(), and draws a bar plot of the resulting probability mass function.

Manual Calculation

You can calculate the log probability manually from the formula for the negative binomial distribution's probability mass function (PMF):

import torch

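def negative_binomial_log_prob(value, total_count, probs):
    # Convert inputs to tensors so torch.lgamma and torch.log accept them
    value = torch.as_tensor(value, dtype=torch.float32)
    total_count = torch.as_tensor(total_count, dtype=torch.float32)
    probs = torch.as_tensor(probs, dtype=torch.float32)

    # Log of the binomial coefficient (total_count + value - 1)! / (value! * (total_count - 1)!)
    log_prob = torch.lgamma(total_count + value) - torch.lgamma(value + 1) - torch.lgamma(total_count)
    # total_count failures with probability (1 - probs), value successes with probability probs
    log_prob += total_count * torch.log(1 - probs) + value * torch.log(probs)
    return log_prob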

# Example usage
total_count = 10
probs = 0.2
value = 5

log_prob = negative_binomial_log_prob(value, total_count, probs)
print(log_prob)
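
One reason to write the coefficient with lgamma rather than factorials is that it also covers non-integer total_count. The sketch below compares such a closed form with log_prob() over a range of values. The helper name nb_log_prob_lgamma is hypothetical, and float64 is used only to make the comparison tight.

```python
import torch
from torch.distributions import NegativeBinomial

def nb_log_prob_lgamma(value, total_count, probs):
    # Hypothetical helper: closed-form log PMF written with lgamma
    value = torch.as_tensor(value, dtype=torch.float64)
    total_count = torch.as_tensor(total_count, dtype=torch.float64)
    probs = torch.as_tensor(probs, dtype=torch.float64)
    log_coefficient = (
        torch.lgamma(total_count + value)
        - torch.lgamma(value + 1)
        - torch.lgamma(total_count)
    )
    return log_coefficient + total_count * torch.log1p(-probs) + value * torch.log(probs)

values = torch.arange(0, 20, dtype=torch.float64)
total_count = torch.tensor(2.5, dtype=torch.float64)  # non-integer on purpose
probs = torch.tensor(0.2, dtype=torch.float64)

dist = NegativeBinomial(total_count=total_count, probs=probs)

# Largest absolute gap between the hand-written formula and the built-in method
manual = nb_log_prob_lgamma(values, total_count, probs)
max_abs_diff = (manual - dist.log_prob(values)).abs().max().item()
print(max_abs_diff)
```

A difference near machine precision suggests the hand-written formula and the built-in method agree on the parameter convention.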

Using Gamma Distribution

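The negative binomial distribution can be written as a Gamma-Poisson mixture: draw a rate from a gamma distribution with concentration total_count and rate (1 - probs) / probs, then draw the count from a Poisson distribution with that rate. The gamma log-density alone does not give the negative binomial log probability, but averaging the Poisson probability of value over rates sampled from the gamma distribution gives a Monte Carlo estimate of it:

import math
import torch
from torch.distributions import Gamma, Poisson

def negative_binomial_log_prob_gamma(value, total_count, probs, num_samples=100_000):
    # num_samples is an illustrative parameter: more samples give a tighter estimate
    value = torch.as_tensor(value, dtype=torch.float32)
    total_count = torch.as_tensor(total_count, dtype=torch.float32)
    probs = torch.as_tensor(probs, dtype=torch.float32)

    # Mixing distribution over Poisson rates
    gamma_dist = Gamma(concentration=total_count, rate=(1 - probs) / probs)

    # Draw Poisson rates from the gamma distribution
    rates = gamma_dist.sample((num_samples,))

    # Log of the average Poisson probability of value over the sampled rates
    log_terms = Poisson(rates).log_prob(value)
    log_prob = torch.logsumexp(log_terms, dim=0) - math.log(num_samples)
    return log_prob

# Example usage
total_count = 10
probs = 0.2
value = 5

log_prob = negative_binomial_log_prob_gamma(value, total_count, probs)
print(log_prob)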

Official Implementation
torch.distributions.negative_binomial.NegativeBinomial.log_prob() is the recommended and well-tested approach for most scenarios, ensuring consistency and compatibility with other PyTorch functionalities.
Gamma Distribution
Expresses the negative binomial as a Gamma-Poisson mixture. This is mainly useful for understanding the distribution or for sampling; any estimate built on sampled rates is approximate and slower than the closed-form log_prob().
Manual Calculation
Provides more flexibility and control over the calculation but can be less efficient for large-scale computations.