Exploring Alternatives to finfo.tiny in NumPy: When Customization Matters


What it is

  • numpy.finfo() provides information about the machine limits for the floating-point data types supported by NumPy.
  • finfo.tiny is a property of the object returned by numpy.finfo(); recent NumPy versions also expose it as smallest_normal.
  • finfo.tiny represents the smallest positive number that is considered a "normal" number in the chosen floating-point type.

Key Points

  • IEEE 754 compliance
    NumPy's floating-point representation generally adheres to the IEEE 754 standard, which defines how floating-point numbers are stored and manipulated in computers.
  • Not the absolute smallest
    It's important to note that finfo.tiny is not the absolute smallest positive number that can be represented. Some floating-point types use subnormal numbers to fill the gap between 0 and finfo.tiny, but these subnormal numbers may have reduced precision.
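The reduced precision of subnormal numbers can be checked directly. The following is a minimal sketch for float64, assuming an IEEE 754 platform that does not flush subnormals to zero; the variable names are illustrative only.

```python
import numpy as np

info = np.finfo(np.float64)
tiny = info.tiny               # smallest positive normal value, 2**-1022
subnormal = tiny / 2.0 ** 40   # still nonzero, but now in the subnormal range

# Relative spacing: gap to the next larger float divided by the value itself.
rel_normal = np.spacing(tiny) / tiny
rel_subnormal = np.spacing(subnormal) / subnormal

print(subnormal > 0)               # nonzero when subnormals are supported
print(rel_normal == info.eps)      # normal numbers keep full relative precision
print(rel_subnormal / rel_normal)  # how much coarser the subnormal value is
```

The absolute spacing is the same for every subnormal value, so the relative precision degrades as the value approaches zero.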

How to Use It

  1. Import NumPy:

    import numpy as np
    
  2. Call numpy.finfo() for a specific floating-point data type (e.g., float32, float64):

    float_info = np.finfo(np.float32)
    
  3. Access finfo.tiny to get the smallest positive normal number:

    smallest_normal_value = float_info.tiny
    print(smallest_normal_value)  # Output: approximately 1.1754944e-38 for float32
    

Why It's Useful

  • finfo.tiny is helpful when dealing with very small numbers in your calculations.
  • You can use it to:
    • Set a lower bound for calculations to guard against underflow, where a result is too close to zero to be represented as a normal number and is rounded to a subnormal value or to zero.
    • Compare the magnitude of a number to finfo.tiny to determine if it's effectively zero for your purposes.

Checking whether a number is effectively zero

import numpy as np

def is_effectively_zero(x, tolerance=np.finfo(np.float64).tiny):
  """Checks if a number is effectively zero within a given tolerance."""
  return abs(x) < tolerance

# Usage
number = 1e-15  # far larger than tiny (about 2.2e-308), so it is not treated as zero
if is_effectively_zero(number):
  print(number, "is effectively zero.")
else:
  print(number, "is not effectively zero.")


Setting a lower bound for calculations

import numpy as np

def safe_division(x, y):
  """Performs division with a check for division by zero or very small numbers."""
  tolerance = np.finfo(np.float64).tiny
  if abs(y) < tolerance:
    raise ZeroDivisionError("Division by zero or very small number")
  return x / y

# Usage
a = 1
b = np.finfo(np.float64).tiny * 10  # Ten times tiny, so it passes the check

try:
  result = safe_division(a, b)
  print(result)
except ZeroDivisionError as e:
  print(e)

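This code defines a safe_division function that raises a ZeroDivisionError if the magnitude of the denominator is below tolerance, which covers both exact zero and subnormal values. This catches denominators that would otherwise produce inf, nan, or a NumPy warning. Note that a denominator just above the threshold can still overflow the quotient to inf when the numerator is large.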
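The same guard can be applied element-wise to arrays. This is a sketch only: the name safe_divide_array and its fill parameter are hypothetical, and returning a fill value instead of raising is a design choice made for illustration.

```python
import numpy as np

def safe_divide_array(x, y, fill=np.nan):
    """Element-wise x / y, leaving `fill` wherever |y| is below finfo.tiny."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    tolerance = np.finfo(np.float64).tiny
    ok = np.abs(y) >= tolerance  # True where the denominator is a normal number
    out = np.full(np.broadcast(x, y).shape, fill, dtype=np.float64)
    # Divide only where the mask is True; other positions keep the fill value.
    np.divide(x, y, out=out, where=ok)
    return out

# The second denominator is zero and the third is subnormal, so both are masked.
result = safe_divide_array([1.0, 2.0, 3.0], [2.0, 0.0, 1e-320])
print(result)
```

Because the division is skipped at masked positions, no divide-by-zero warning is triggered for them.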

Handling small residuals in linear least squares

import numpy as np

def least_squares(A, b):
  """Solves a linear least squares problem, handling small residuals."""
  x, residuals, rank, s = np.linalg.lstsq(A, b, rcond=None)
  # residuals holds sums of squared residuals; it is empty when A is square,
  # underdetermined, or rank-deficient.
  tolerance = np.finfo(np.float64).tiny * np.linalg.norm(b)
  large_residuals = residuals[residuals > tolerance]
  if len(large_residuals) > 0:
    print("Warning: Some residuals are relatively large:", large_residuals)
  return x

# Usage (A is square here, so residuals is empty and no warning is printed)
A = np.array([[1, 2], [3, 4]])
b = np.array([5, 7])
solution = least_squares(A, b)
print("Solution:", solution)

In this example, least_squares solves a linear least squares problem using np.linalg.lstsq. The residuals it returns are sums of squared residuals (one per column of b), not individual differences, and the array is empty when A is square, underdetermined, or rank-deficient, as in the usage above. The function compares them against a tolerance built from finfo.tiny and the norm of the target vector b. Because finfo.tiny is so small, this threshold flags virtually any nonzero residual, so treat it as a check for an exact fit rather than a measure of fit quality.
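To see a non-empty residuals array, the system needs more equations than unknowns. The sketch below uses made-up data for a straight-line fit; the eps-based threshold is an assumption chosen for illustration, not a recommended standard.

```python
import numpy as np

# Overdetermined system: 4 equations, 2 unknowns (illustrative data).
A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
b = np.array([6.0, 5.0, 7.0, 10.0])

x, residuals, rank, s = np.linalg.lstsq(A, b, rcond=None)

# residuals[0] is the sum of squared residuals, so compare it with a
# squared threshold that scales with the size of b.
eps = np.finfo(np.float64).eps
threshold = (eps * np.linalg.norm(b)) ** 2
fit_is_exact = residuals.size > 0 and residuals[0] <= threshold

print("Solution:", x)
print("Sum of squared residuals:", residuals)
print("Exact fit:", fit_is_exact)
```

A threshold derived from finfo.eps scales with the data, whereas one derived from finfo.tiny stays near the bottom of the floating-point range regardless of the problem size.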

Comparing magnitudes of numbers for custom logic

import numpy as np

def prioritize_large_values(values):
  """Prioritizes elements in a list with magnitudes larger than a threshold."""
  tolerance = np.finfo(np.float64).tiny * 100
  large_values = [v for v in values if abs(v) > tolerance]
  small_values = [v for v in values if abs(v) <= tolerance]
  return large_values + small_values

# Usage
data = [100, 0.0001, -2.5, np.finfo(np.float64).tiny * 5]
prioritized_data = prioritize_large_values(data)
print("Prioritized data:", prioritized_data)

This code defines a prioritize_large_values function that separates a list of values into two categories: those with magnitudes larger than a threshold and those smaller or equal to the threshold. The threshold is set based on finfo.tiny and a scaling factor. This can be useful for tasks like focusing on significant values in an analysis or filtering out very small noise.



Alternatives to finfo.tiny

np.nextafter(0., 1.)

  • This call returns the smallest positive representable value of the type, which is a subnormal number (about 4.9e-324 for float64). It is not machine epsilon; that is finfo.eps, the gap between 1.0 and the next larger float. It is also far smaller than finfo.tiny (about 2.2e-308 for float64), so use it when you need the true lower limit of the type rather than the smallest normal number.
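These quantities are easy to confuse, so here is a short check of how np.nextafter(0., 1.), finfo.tiny, and finfo.eps relate for float64. It assumes an IEEE 754 platform with subnormal support; the variable names are illustrative.

```python
import numpy as np

info = np.finfo(np.float64)
next_after_zero = np.nextafter(0., 1.)  # smallest positive subnormal value
next_after_one = np.nextafter(1., 2.)   # next representable float above 1.0

print(next_after_zero)                         # roughly 4.9e-324
print(next_after_zero == 2.0 ** -1074)         # exact value for float64
print(next_after_one - 1.0 == info.eps)        # machine epsilon is the gap above 1.0
print(next_after_zero < info.tiny < info.eps)  # ordering of the three quantities
```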

Custom tolerance based on problem scale

  • In some cases, you might need a more tailored approach. You can define your own tolerance level based on the scale of the numbers you're working with. This might involve calculating the expected range of your data and setting a threshold as a fraction of that range.

Symbolic computation libraries

  • If you need exact results rather than a lower limit on what floating point can represent, consider a symbolic computation library such as SymPy. These libraries work with exact representations of numbers, so they avoid the rounding and underflow of floating-point arithmetic. However, they may be slower and less efficient for large-scale computations.
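For the custom-tolerance approach, one possible sketch derives the threshold from the spread of the data. The helper name scale_tolerance, the default fraction of 1e-9, and the sample readings are all hypothetical and should be tuned to the problem at hand.

```python
import numpy as np

def scale_tolerance(data, fraction=1e-9):
    """Hypothetical helper: tolerance as a fraction of the data's value range."""
    data = np.asarray(data, dtype=np.float64)
    value_range = data.max() - data.min()
    return fraction * value_range

readings = np.array([120.0, 118.5, 121.2, 119.9])  # made-up measurements
tol = scale_tolerance(readings)
noise = 1e-12

print(tol)
print(abs(noise) < tol)                          # negligible at the scale of the data
print(abs(noise) < np.finfo(np.float64).tiny)    # but far above finfo.tiny
```

A value can be negligible relative to the data while still being hundreds of orders of magnitude larger than finfo.tiny, which is why a scale-based threshold is often the more practical choice.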

| Alternative | Description | Advantages | Disadvantages |
|---|---|---|---|
| np.nextafter(0., 1.) | Closest positive number to zero (the smallest subnormal) | Convenient, readily available in NumPy | Much smaller than finfo.tiny; subnormal values have reduced precision |
| Custom tolerance based on scale | User-defined threshold based on problem context | Adaptable to specific needs | Requires additional logic to calculate the tolerance |
| Symbolic computation libraries | Exact representation of numbers | No rounding or underflow | Slower, less efficient for large-scale computations |