[SOLVED] Why is islice(permutations) 100 times faster if I keep a reference to the underlying iterator?

Issue

Iterating through islice(permutations(a), n) is somehow 100 times faster if I just keep an extra reference to the permutations iterator. Alternating between with and without the extra reference:

  2.1 ms  with
202.2 ms  without
  2.1 ms  with
195.8 ms  without
  2.1 ms  with
192.4 ms  without

What’s going on?

Full code (Try it online!):

from timeit import timeit
from itertools import permutations, islice
from collections import deque

a = range(10 ** 7)
n = 10 ** 5

for label in ['with', 'without'] * 3:
    if label == 'with':
        perms = islice((foo := permutations(a)), n)
    else:
        perms = islice(permutations(a), n)
    next(perms)
    t = timeit(lambda: deque(perms, 0), number=1)
    print('%5.1f ms ' % (t * 1e3), label)

Solution

I just realized why. If I don’t don’t keep a reference, then the iterator and all it entails gets garbage collected at the end of timing, and that’s included in the time.

Note that the list I build permutations of is very large. So each permutation is very large. So the permutations iterator has a large result tuple and internal state data structure, and I also have millions of integer objects from the range. All that must be cleaned up.

When I halve the size of a to a = range(10 ** 7 // 2), the times for "without" extra reference drop to about half (100 ms).

When I double the size of a to a = range(10 ** 7 * 2), the times for "without" extra reference roughly double (over 400 ms).

Neither change affects the "with" times (always around 2 ms).


In case anyone is wondering why I build permutations of such a large list: I was looking into the time it takes permutations to provide all n! permutations of n elements. One might think it needs O(n × n!), since that’s the overall result size. But it reuses and modifies its result tuple if it can, so it doesn’t build each permutation from scratch but just needs to update it a little. So I tested that with large n in order to see a large speed difference between when it can and can’t reuse its result tuple. It is indeed much faster if it can, and seems to take only O(n!) time to provide all permutations. It appears to on average change just 2.63 elements from one permutation to the next.

Answered By – Kelly Bundy

Answer Checked By – Timothy Miller (BugsFixing Admin)

Leave a Reply

Your email address will not be published. Required fields are marked *