Demystifying NumPy's random.random_sample(): Unleashing the Power of Random Sampling


random.random_sample() in NumPy

The random.random_sample() function from NumPy's random module is a workhorse for generating arrays of random floating-point numbers. It's essential for various tasks involving random sampling, simulations, and statistical computations.

What it Does

  • Returns a NumPy array of random floats whose shape is determined by the size parameter (explained below). If size is omitted, it returns a single float instead of an array.
  • Generates random samples in the half-open interval [0.0, 1.0). This means the generated numbers will be greater than or equal to 0.0, but strictly less than 1.0.

Syntax

import numpy as np

random_samples = np.random.random_sample(size)

Parameters

  • size (int or tuple of ints, optional): This parameter specifies the shape of the output array. If it is omitted (the default, None), a single float is returned.
    • If size is an integer, the function returns a 1D array of that length.
    • If size is a tuple of integers, the function returns a multidimensional array with the corresponding shape.
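
The three cases for size can be checked directly. The following is a minimal sketch; the variable names are only illustrative.

```python
import numpy as np

# No size argument: a single float, not an array
single = np.random.random_sample()

# Integer size: 1D array of that length
vector = np.random.random_sample(4)

# Tuple size: array with the corresponding shape
cube = np.random.random_sample((2, 3, 4))

print(isinstance(single, float))  # True
print(vector.shape)               # (4,)
print(cube.shape)                 # (2, 3, 4)
```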

Example

import numpy as np

# Generate 10 random samples (1D array)
samples1 = np.random.random_sample(10)
print(samples1)

# Generate a 3x4 array of random samples
samples2 = np.random.random_sample((3, 4))
print(samples2)

Output (the values are random, so yours will differ)

[0.51639727 0.94556406 0.01780613 0.57543933 0.54618635 0.57065203
 0.49474938 0.7607941  0.20532795 0.57330641]
[[0.72345  0.12345  0.87654  0.98765]
 [0.23456  0.56789  0.12345  0.78901]
 [0.45678  0.89012  0.34567  0.12345]]
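
Notes

  • For samples from other probability distributions, NumPy provides dedicated functions such as np.random.uniform (uniform over an arbitrary range), np.random.randn (standard normal), and np.random.randint (integers).
  • The numbers generated by random.random_sample() are pseudo-random: the sequence is fully determined by the generator's internal state. Calling np.random.seed() with a fixed value does not make the numbers more random; it makes them reproducible, so the same seed always yields the same sequence. If you never set a seed, NumPy initializes the generator from the operating system, and each run produces different values.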
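
To see what seeding actually does, here is a small sketch that resets the seed and draws twice. The seed value 42 is arbitrary.

```python
import numpy as np

# Seed, then draw three samples
np.random.seed(42)
first = np.random.random_sample(3)

# Reset to the same seed and draw again
np.random.seed(42)
second = np.random.random_sample(3)

# Same seed, same sequence
print(np.array_equal(first, second))  # True
```

This is useful for repeatable tests and simulations, where you want the same "random" numbers on every run.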


Generating Random Samples Within a Specific Range

import numpy as np

# Generate 5 random samples between 5 and 10 (uniform distribution)
lower_bound = 5
upper_bound = 10
samples = (upper_bound - lower_bound) * np.random.random_sample(5) + lower_bound
print(samples)

This code first defines the lower and upper bounds of the desired range. Then, it multiplies the random samples from random.random_sample() by the difference between the bounds (upper_bound - lower_bound) and adds the lower bound (lower_bound) to shift the range.
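
As a cross-check, the manual scaling can be compared with np.random.uniform, which draws from an arbitrary range directly. This is a sketch under the same bounds as above; the sample count of 1000 is an arbitrary choice.

```python
import numpy as np

lower_bound = 5
upper_bound = 10

# Manual approach: scale and shift samples from [0.0, 1.0)
scaled = (upper_bound - lower_bound) * np.random.random_sample(1000) + lower_bound

# Built-in approach: draw from the range directly
direct = np.random.uniform(lower_bound, upper_bound, size=1000)

# Both sets of samples stay within the requested range
print(scaled.min(), scaled.max())
print(direct.min(), direct.max())
```

Both approaches sample the same distribution. In rare cases floating-point rounding can make a result equal to the upper bound, so avoid code that depends on the bound being strictly excluded.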

Creating a Random Permutation of an Array

import numpy as np

# Create an array
original_array = np.arange(10)

# Randomly shuffle the elements (permutation)
shuffled_array = np.random.permutation(original_array)
print("Original:", original_array)
print("Shuffled:", shuffled_array)

This code creates an array, then passes it to np.random.permutation(), which returns a shuffled copy of the array. The original array is left unchanged. Note that this example does not use random.random_sample(); it is included because shuffling is a closely related task.
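
A short sketch to confirm this behavior, and to show the integer form of np.random.permutation(), which is handy when you want shuffled indices rather than shuffled values:

```python
import numpy as np

original_array = np.arange(10)
shuffled_copy = np.random.permutation(original_array)

# The original array is untouched
print(np.array_equal(original_array, np.arange(10)))            # True

# The copy holds the same elements in a different order
print(np.array_equal(np.sort(shuffled_copy), original_array))   # True

# Passing an integer n permutes np.arange(n), giving shuffled indices
index_order = np.random.permutation(5)
print(sorted(index_order.tolist()))                             # [0, 1, 2, 3, 4]
```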

Sampling from a Discrete Uniform Distribution

random.random_sample() can be used indirectly to sample from a discrete uniform distribution (picking integers within a range with equal probability) by combining it with np.floor or np.ceil:

import numpy as np

# Sample 3 integers between 1 and 6 (inclusive)
lower_bound = 1
upper_bound = 6  # Make sure upper_bound is inclusive
samples = np.floor(lower_bound + (upper_bound - lower_bound + 1) * np.random.random_sample(3))
print(samples.astype(int))  # Convert to integers

This code defines the lower and upper bounds for the discrete range. It calculates the number of possible integers (inclusive) and uses that to scale the random samples. Finally, np.floor is used to round down the results to get integer values.
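
In practice, np.random.randint does the same job in one call. The sketch below assumes the same bounds as above; because randint excludes its upper bound, 1 is added to keep 6 reachable.

```python
import numpy as np

lower_bound = 1
upper_bound = 6

# randint excludes the high value, so pass upper_bound + 1
rolls = np.random.randint(lower_bound, upper_bound + 1, size=3)
print(rolls)
```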



Generating Samples from Other Distributions

  • np.random.choice(a, size=None, replace=True, p=None): Samples elements from a 1D array a with replacement (elements can be chosen multiple times) or without replacement. Optionally, you can provide probabilities p for weighted sampling.
  • np.random.randint(low, high, size): Generates random integers from a uniform discrete distribution within the specified range [low, high) (exclusive on the upper bound).
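  • np.random.randn(d0, d1, ...): Generates samples from a standard normal distribution (mean 0, standard deviation 1). The dimensions are passed as separate arguments rather than as a tuple.
  • np.random.rand(d0, d1, ...): Draws from the same uniform distribution over [0.0, 1.0) as random.random_sample() (0.0 included, 1.0 excluded). The only difference is the calling convention: rand takes the dimensions as separate arguments, while random_sample takes a single size tuple.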
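
The calls below sketch how each of these functions is used. The array of color names and the probabilities are made up for illustration.

```python
import numpy as np

colors = np.array(["red", "green", "blue"])  # illustrative data

# Weighted sampling with replacement
picks = np.random.choice(colors, size=5, p=[0.5, 0.3, 0.2])

# Sampling without replacement: every element appears at most once
unique_picks = np.random.choice(colors, size=3, replace=False)

# Integers in [0, 10)
ints = np.random.randint(0, 10, size=4)

# Standard normal samples; dimensions are separate arguments
normal = np.random.randn(2, 3)

# Uniform samples in [0.0, 1.0); same shape, two calling conventions
uniform_a = np.random.rand(2, 3)
uniform_b = np.random.random_sample((2, 3))

print(picks, unique_picks, ints)
print(normal.shape, uniform_a.shape, uniform_b.shape)  # (2, 3) (2, 3) (2, 3)
```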

Shuffling Elements

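  • np.random.shuffle(arr): Shuffles the array arr in place along its first axis and returns None. Use np.random.permutation(arr) instead if you want a shuffled copy and need to keep the original order.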
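
A brief sketch of in-place shuffling, including what happens with a 2D array (only the order of the rows changes):

```python
import numpy as np

arr = np.arange(10)
result = np.random.shuffle(arr)  # modifies arr directly

print(result)  # None
print(arr)     # the same ten values in a new order

# For a 2D array, rows are reordered but each row stays intact
matrix = np.arange(12).reshape(4, 3)
np.random.shuffle(matrix)
print(matrix)
```

A common mistake is writing arr = np.random.shuffle(arr), which replaces the array with None.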

Specific Use Cases

  • For more specialized distributions or sampling techniques, refer to NumPy's extensive random sampling functions like np.random.beta, np.random.binomial, np.random.exponential, and more (see the documentation for details).
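
As a quick illustration of the functions named above, here is a sketch with arbitrarily chosen parameter values:

```python
import numpy as np

# Beta distribution with shape parameters a=2, b=5
beta_samples = np.random.beta(2.0, 5.0, size=1000)

# Number of successes in 10 trials with success probability 0.5
successes = np.random.binomial(10, 0.5, size=1000)

# Exponential distribution with scale (mean) 2.0
waits = np.random.exponential(2.0, size=1000)

print(beta_samples.mean(), successes.mean(), waits.mean())
```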

Choosing the Right Alternative

The best alternative depends on the type of random samples you need:

  • For shuffling elements, use np.random.shuffle().
  • If you need samples from a different distribution, use the matching function: np.random.randn() for the standard normal, np.random.uniform() for a uniform distribution over a specific range, np.random.randint() for discrete integers, and so on.
  • If you need samples from a uniform distribution between 0 (inclusive) and 1 (exclusive), random.random_sample() is a good choice.
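  • random.random_sample() and the other random number functions in NumPy are pseudo-random: they produce a deterministic sequence from the generator's state. Setting a seed with np.random.seed() does not add randomness; it makes results reproducible. Do not rely on these functions for security-sensitive values such as passwords or tokens; Python's secrets module is designed for that purpose.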
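
One more option, not covered in the article so far: NumPy's documentation recommends the newer Generator interface for new code. The sketch below assumes a NumPy version that provides np.random.default_rng; its random() method plays the same role as random.random_sample(). The seed value is arbitrary.

```python
import numpy as np

# A generator object with its own state, independent of np.random.seed()
rng = np.random.default_rng(12345)
samples = rng.random(5)  # floats in [0.0, 1.0)

# A second generator with the same seed reproduces the same values
rng_again = np.random.default_rng(12345)
print(np.array_equal(samples, rng_again.random(5)))  # True
```

Because each generator carries its own state, seeding one part of a program does not affect random numbers drawn elsewhere.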