## Issue

I am trying to efficiently compute a summation of a summation in Python:

WolframAlpha is able to compute it too a high n value: sum of sum.

I have two approaches: a *for* loop method and an np.sum method. I thought the np.sum approach would be faster. However, they are the same until a large n, after which the np.sum has overflow errors and gives the wrong result.

I am trying to find the fastest way to compute this sum.

```
import numpy as np
import time
def summation(start,end,func):
sum=0
for i in range(start,end+1):
sum+=func(i)
return sum
def x(y):
return y
def x2(y):
return y**2
def mysum(y):
return x2(y)*summation(0, y, x)
n=100
# method #1
start=time.time()
summation(0,n,mysum)
print('Slow method:',time.time()-start)
# method #2
start=time.time()
w=np.arange(0,n+1)
(w**2*np.cumsum(w)).sum()
print('Fast method:',time.time()-start)
```

## Solution

(fastest methods, 3 and 4, are at the end)

In a fast NumPy method you need to specify `dtype=np.object`

so that NumPy does not convert Python `int`

to its own dtypes (`np.int64`

or others). It will now give you correct results (checked it up to N=100000).

```
# method #2
start=time.time()
w=np.arange(0, n+1, dtype=np.object)
result2 = (w**2*np.cumsum(w)).sum()
print('Fast method:', time.time()-start)
```

Your fast solution is significantly faster than the slow one. Yes, for large N’s, but already at N=100 it is like 8 times faster:

```
start=time.time()
for i in range(100):
result1 = summation(0, n, mysum)
print('Slow method:', time.time()-start)
# method #2
start=time.time()
for i in range(100):
w=np.arange(0, n+1, dtype=np.object)
result2 = (w**2*np.cumsum(w)).sum()
print('Fast method:', time.time()-start)
```

```
Slow method: 0.06906533241271973
Fast method: 0.008007287979125977
```

EDIT: Even faster method (by KellyBundy, the Pumpkin) is by using pure python. Turns out NumPy has no advantage here, because it has no vectorized code for `np.objects`

.

```
# method #3
import itertools
start=time.time()
for i in range(100):
result3 = sum(x*x * ysum for x, ysum in enumerate(itertools.accumulate(range(n+1))))
print('Faster, pure python:', (time.time()-start))
```

```
Faster, pure python: 0.0009944438934326172
```

EDIT2: Forss noticed that numpy fast method can be optimized by using `x*x`

instead of `x**2`

. For `N > 200`

it is faster than pure Python method. For `N < 200`

it is slower than pure Python method (the exact value of boundary may depend on machine, on mine it was 200, its best to check it yourself):

```
# method #4
start=time.time()
for i in range(100):
w = np.arange(0, n+1, dtype=np.object)
result2 = (w*w*np.cumsum(w)).sum()
print('Fast method x*x:', time.time()-start)
```

Answered By – dankal444

Answer Checked By – Marie Seifert (BugsFixing Admin)