[SOLVED] Sample Random values from a density distribution

Issue

[Image: plot of an example dataset]

Hi all,
I am trying to sample random values between 0 and 1, weighted by datasets like the one above. I have found a partial solution using scipy.stats.gaussian_kde and its .resample(n) method. My main issue is that, because the bulk of my data is so close to 0, resampling returns some negative numbers, which break my later calculations.

Is there a way to restrict the resampled values to be greater than zero without otherwise changing the sample space? I have considered taking the absolute value of everything to get rid of the negatives, but I don't know whether the result would still reflect the distribution weights.

And to clarify, each of the n values that I resample will correspond to a specific variable in my code, so I can't just delete the numbers that are less than zero.

# Here is a little sample dataset if you need something to work this out!
import numpy as np
data = np.array([0.147, 0.066, 0.017, 0.011, 0.040, 0.087, 0.024, 0.127, 0.071, 0.127,
                 0.027, 0.008, 0.067, 0.032, 0.247, 0.028, 0.122, 0.304, 0.074, 0.119])
# Thank you!

Solution

You could use a distribution whose support does not include negative numbers. For example, sampling from an exponential distribution might work for the example array you provided:

import numpy as np
from scipy.stats import expon
import matplotlib.pyplot as plt

data = np.array([0.147, 0.066, 0.017, 0.011, 0.040, 0.087, 0.024, 0.127, 0.071, 0.127, 0.027, 0.008, 0.067, 0.032, 0.247, 0.028, 0.122, 0.304, 0.074, 0.119])

# fit exponential model using data
loc, scale = expon.fit(data)

# plot histogram and model
fig, ax = plt.subplots()
ax.hist(data, density = True)
x = np.linspace(0.01, 1, 200)
ax.plot(x, expon.pdf(x, loc, scale), 'k-')
plt.show()

# sample from your modelled distribution using your fitted loc and scale parameters
# (this draws a single value; pass size=n to draw n values at once)
sample = expon.rvs(loc, scale)
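
If you would rather keep the gaussian_kde approach from the question, one option is to redraw any value that falls outside the range until you have exactly n of them. The sketch below is only an illustration, not part of the original answer: the helper name resample_within and its low, high and seed parameters are made up for this example. Note that this samples from the KDE truncated to [0, 1], so the density is renormalised over that interval rather than left unchanged, and the loop assumes the KDE puts a reasonable share of its mass inside the range.

```python
import numpy as np
from scipy.stats import gaussian_kde

def resample_within(kde, n, low=0.0, high=1.0, seed=None):
    """Draw exactly n values from a 1-D KDE, redrawing any outside [low, high].

    Hypothetical helper for illustration; assumes the KDE has a non-negligible
    share of its mass inside [low, high], otherwise the loop may run very long.
    """
    rng = np.random.default_rng(seed)
    kept = np.empty(0)
    while kept.size < n:
        # Draw a batch larger than what is still missing, then keep only
        # the values that fall inside the allowed range.
        batch = kde.resample(2 * (n - kept.size), seed=rng)[0]
        kept = np.concatenate([kept, batch[(batch >= low) & (batch <= high)]])
    return kept[:n]

data = np.array([0.147, 0.066, 0.017, 0.011, 0.040, 0.087, 0.024, 0.127, 0.071, 0.127,
                 0.027, 0.008, 0.067, 0.032, 0.247, 0.028, 0.122, 0.304, 0.074, 0.119])

kde = gaussian_kde(data)
sample = resample_within(kde, n=1000, seed=0)
```

Because the function always returns exactly n values, each one can still be mapped to its own variable, which plain deletion of the negatives would not allow.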

Answered By – Ben DeVries

Answer Checked By – David Goodson (BugsFixing Volunteer)
