[SOLVED] Why is adding a single-value tuple to a list more efficient than appending a number?

Issue

Assume we need to append the single number 1 to a list a.

In Python we have 5 obvious ways:

  1. a.append(1)
  2. a += [1]
  3. a += 1,
  4. a.extend((1,))
  5. a.extend([1])
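
As a quick sanity check that the five statements above really are interchangeable, here is a minimal sketch that runs each one against a fresh empty list. The names statements and outcomes are made up for this illustration and are not part of the original question.

```python
# Run each candidate statement against its own fresh, empty list.
statements = [
    "a.append(1)",
    "a += [1]",
    "a += 1,",
    "a.extend((1,))",
    "a.extend([1])",
]

outcomes = {}
for stmt in statements:
    namespace = {"a": []}      # fresh list for every statement
    exec(stmt, namespace)      # executes the statement with `a` bound
    outcomes[stmt] = namespace["a"]

for stmt, value in outcomes.items():
    print(f"{stmt:<16} -> {value}")
```

Every variant should leave a holding exactly one element, so any difference between them is purely a matter of speed.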

Let’s measure them with timeit:

from timeit import timeit

print(timeit("a.append(1)", "a = []", number=10_000_000))
print(timeit("a += [1]", "a = []", number=10_000_000))
print(timeit("a += 1,", "a = []", number=10_000_000))
print(timeit("a.extend((1,))", "a = []", number=10_000_000))
print(timeit("a.extend([1])", "a = []", number=10_000_000))

Here is the console’s output:

5.05412472199896
5.869792026000141
3.1280645619990537
4.988895307998973
8.05588494499898
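
The numbers above come from one run per statement, so they can be noisy. A hedged variant, assuming you want a slightly more stable comparison, repeats each measurement and keeps the fastest run. The repeat count and the smaller number value are arbitrary choices for this sketch, and absolute timings will differ by machine and Python version.

```python
from timeit import repeat

statements = [
    "a.append(1)",
    "a += [1]",
    "a += 1,",
    "a.extend((1,))",
    "a.extend([1])",
]

timings = {}
for stmt in statements:
    # Each of the 5 repeats re-runs the setup, so `a` starts empty every time.
    runs = repeat(stmt, setup="a = []", repeat=5, number=200_000)
    # The minimum is the run least disturbed by other activity on the machine.
    timings[stmt] = min(runs)

# Print fastest first.
for stmt, best in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{stmt:<16} {best:.4f} s")
```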

Why is the third one more efficient than the others?

Solution

The creation of the tuple (1,) is optimized away by the compiler: it is stored as a constant. The list, on the other hand, is always created at run time. Look at dis.dis:

>>> import dis
>>> dis.dis('a.extend((1,))')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_CONST               0 ((1,))
              6 CALL_METHOD              1
              8 RETURN_VALUE
>>> dis.dis('a.extend([1])')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_CONST               0 (1)
              6 BUILD_LIST               1
              8 CALL_METHOD              1
             10 RETURN_VALUE

Notice that the tuple version takes fewer bytecode instructions and merely does a LOAD_CONST on (1,). For the list, on the other hand, BUILD_LIST is executed (after a LOAD_CONST for 1).
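
The same observation can be checked programmatically instead of by reading the listing. This is a minimal sketch assuming CPython; opcode names and offsets differ between versions, so it only tests for the presence of BUILD_LIST and BUILD_TUPLE rather than matching the output above line by line. The variable names and the "<sketch>" filename are illustrative.

```python
import dis

# Compile both variants as expressions, the same way dis.dis does for a string.
tuple_code = compile("a.extend((1,))", "<sketch>", "eval")
list_code = compile("a.extend([1])", "<sketch>", "eval")

tuple_ops = [ins.opname for ins in dis.get_instructions(tuple_code)]
list_ops = [ins.opname for ins in dis.get_instructions(list_code)]

# The tuple is precomputed and stored among the code object's constants.
print("tuple stored as constant:", (1,) in tuple_code.co_consts)
# No tuple is built at run time...
print("BUILD_TUPLE emitted:", "BUILD_TUPLE" in tuple_ops)
# ...whereas the list has to be built on every execution.
print("BUILD_LIST emitted:", "BUILD_LIST" in list_ops)
```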

Note, you can access these constants on the code object:

>>> code = compile('a.extend((1,))', '', 'eval')
>>> code
<code object <module> at 0x10e91e0e0, file "", line 1>
>>> code.co_consts
((1,),)

Finally, as to why += is faster than .extend, again, look at the bytecode:

>>> dis.dis('a += b')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 INPLACE_ADD
              6 STORE_NAME               0 (a)
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
>>> dis.dis('a.extend(b)')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_NAME                2 (b)
              6 CALL_METHOD              1
              8 RETURN_VALUE

You’ll notice that .extend first requires resolving the method (which takes extra time). The operator, on the other hand, has its own bytecode, INPLACE_ADD, so everything is pushed down into the C layer (plus, magic methods skip instance namespaces and a bunch of hoopla and are looked up directly on the class).
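
The point about magic methods skipping the instance namespace can be illustrated with a small sketch. Tracker is a hypothetical list subclass invented for this example (plain list instances have no __dict__ to write to); the two lambdas shadow extend and __iadd__ on the instance only.

```python
class Tracker(list):
    """Hypothetical list subclass; its instances get a writable __dict__."""


t = Tracker()

# Shadow both names on the instance only, not on the class.
t.extend = lambda iterable: "instance extend"
t.__iadd__ = lambda other: "instance __iadd__"

# Ordinary attribute access consults the instance namespace first,
# so the shadowing lambda runs and the list is left untouched.
method_result = t.extend((1,))

# The operator looks up __iadd__ on type(t), ignoring the instance
# attribute, so the real list.__iadd__ runs and extends the list.
t += (2,)

print(method_result)
print(list(t))
```

The method call has to go through the full attribute lookup (which is why it could be hijacked here), while += goes straight to the type's slot.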

Answered By – juanpa.arrivillaga

Answer Checked By – Terry (BugsFixing Volunteer)
