# [SOLVED] Fastest way to generate 1,000,000+ random numbers in Python

## Issue

I am currently writing an app in Python that needs to generate a large number of random numbers, fast. Currently I use numpy to generate all of the numbers in one giant batch (about 500,000 at a time). While this seems to be faster than Python’s own implementation, I still need it to go faster. Any ideas? I’m open to writing it in C and embedding it in the program, or doing whatever it takes.

Constraints on the random numbers:

• A set of 7 numbers that can each have different bounds:
    • e.g. [0-X1, 0-X2, 0-X3, 0-X4, 0-X5, 0-X6, 0-X7]
    • Currently I am generating a list of 7 numbers with random values in [0, 1), then multiplying by [X1..X7]
• A set of 13 numbers that all add up to 1
    • Currently I am just generating 13 numbers, then dividing by their sum

Any ideas? Would precalculating these numbers and storing them in a file make this faster?

Thanks!

## Solution

You can speed things up a bit from what mtrw posted above just by doing what you initially described (generating a bunch of random numbers and multiplying and dividing accordingly)…

Also, you probably already know this, but be sure to do the operations in-place (`*=`, `/=`, `+=`, etc.) when working with large-ish numpy arrays. In-place operators write the result into the existing array instead of allocating a new one of the same size, which makes a huge difference in memory usage with large arrays and will give a considerable speed increase, too.

``````
In [53]: def rand_row_doubles(row_limits, num):
   ....:     ncols = len(row_limits)
   ....:     x = np.random.random((num, ncols))  # uniform in [0, 1), shape (num, ncols)
   ....:     x *= row_limits                     # scale each column in place (broadcast)
   ....:     return x
   ....:

In [59]: %timeit rand_row_doubles(np.arange(7) + 1, 1000000)
10 loops, best of 3: 187 ms per loop
``````

As compared to `ManyRandDoubles` (presumably the function from mtrw’s answer, which isn’t reproduced here):

``````
In [66]: %timeit ManyRandDoubles(np.arange(7) + 1, 1000000)
1 loops, best of 3: 222 ms per loop
``````

It’s not a huge difference (187 ms versus 222 ms, roughly 16% less time), but if you’re really worried about speed, it’s something.
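If you want to push the in-place idea a bit further, here is a minimal sketch that reuses one preallocated buffer across calls. It is not from the answer above: it assumes NumPy 1.17 or newer (for `np.random.default_rng` and the `out=` argument of `Generator.random`), and `rand_row_doubles_rng` is a hypothetical name. Whether it is actually faster on your machine is something to measure with `%timeit` rather than take on faith.

```python
import numpy as np


def rand_row_doubles_rng(row_limits, num, rng=None, out=None):
    """Return a (num, ncols) array with column j uniform in [0, row_limits[j]).

    Hypothetical helper, not part of the original answer. It uses the
    newer np.random.Generator API and an optional preallocated buffer.
    """
    row_limits = np.asarray(row_limits, dtype=np.float64)
    if rng is None:
        rng = np.random.default_rng()
    if out is None:
        out = np.empty((num, row_limits.size), dtype=np.float64)
    rng.random(out=out)   # fill the existing buffer with uniform [0, 1) samples
    out *= row_limits     # scale each column in place (broadcast over rows)
    return out


# Allocate once, then refill the same buffer for every batch.
limits = np.arange(7) + 1
buf = np.empty((1000, limits.size), dtype=np.float64)
rng = np.random.default_rng(0)
x = rand_row_doubles_rng(limits, 1000, rng=rng, out=buf)
```

Passing the same `buf` on every call means no new array is allocated per batch; the caller has to be done with the previous batch before requesting the next one.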

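Just to show that it’s correct (here `x` is presumably the array returned by `rand_row_doubles(np.arange(7) + 1, 1000000)`, since the column maxima approach 1 through 7):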

``````
In [68]: x.max(0)
Out[68]:
array([ 0.99999991,  1.99999971,  2.99999737,  3.99999569,  4.99999836,
        5.99999114,  6.99999738])

In [69]: x.min(0)
Out[69]:
array([  4.02099599e-07,   4.41729377e-07,   4.33480302e-08,
         7.43497138e-06,   1.28446819e-05,   4.27614385e-07,
         1.34106753e-05])
``````
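To make the in-place point from earlier concrete, here is a small sketch (my own illustration, with made-up variable names) showing that `x *= limits` keeps working on the same array, while `x * limits` hands back a freshly allocated one of the same size.

```python
import numpy as np

limits = np.arange(7) + 1
x = np.random.random((1000, limits.size))
same_array = x                 # second reference to the very same array object

x *= limits                    # in place: the existing buffer is overwritten
in_place_kept_buffer = x is same_array

y = x * limits                 # out of place: a new (1000, 7) array is allocated
copy_shares_memory = np.shares_memory(x, y)

print(in_place_kept_buffer, copy_shares_memory)
```

With a million rows, the out-of-place form temporarily needs room for two full arrays instead of one, which is where the memory difference comes from.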

Likewise, for your “rows sum to one” part…

``````
In [70]: def rand_rows_sum_to_one(nrows, ncols):
   ....:     x = np.random.random((ncols, nrows))  # one column per output row
   ....:     y = x.sum(axis=0)                     # sum of each column, shape (nrows,)
   ....:     x /= y                                # normalize in place (broadcast)
   ....:     return x.T                            # transposed view, shape (nrows, ncols)
   ....:

In [71]: %timeit rand_rows_sum_to_one(1000000, 13)
1 loops, best of 3: 455 ms per loop

In [72]: x = rand_rows_sum_to_one(1000000, 13)

In [73]: x.sum(axis=1)
Out[73]: array([ 1.,  1.,  1., ...,  1.,  1.,  1.])
``````
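The function above builds the array as `(ncols, nrows)` and returns a transposed view. If you would rather get an ordinary row-major `(nrows, ncols)` array back, a possible variant is sketched below. `rand_rows_sum_to_one_rowmajor` is a hypothetical name, and I haven't timed it against the version above, so treat it as an alternative layout rather than a speedup.

```python
import numpy as np


def rand_rows_sum_to_one_rowmajor(nrows, ncols):
    """Return an (nrows, ncols) array of non-negative values; each row sums to 1.

    Hypothetical variant of rand_rows_sum_to_one that avoids the transpose.
    """
    x = np.random.random((nrows, ncols))
    # keepdims=True gives shape (nrows, 1), so the division broadcasts per row
    x /= x.sum(axis=1, keepdims=True)
    return x


x = rand_rows_sum_to_one_rowmajor(1000, 13)
row_sums = x.sum(axis=1)
```

Because of floating-point rounding the row sums are only equal to 1 up to a tiny error, so compare them with `np.allclose(row_sums, 1.0)` rather than `==`.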

Honestly, even if you re-implement things in C, I’m not sure you’ll be able to beat numpy by much on this one… I could be very wrong, though!

Answered By – Joe Kington
