# [SOLVED] Which one is faster np.vstack, np.append, np.concatenate or a manual function made in cython?

## Issue

I coded a program that updates a `numpy` array on each iteration and does some operations on it. The number of iterations depends on time; for example, in 1 second there might be 1000 to 2500 iterations, so the array wouldn't grow beyond 2500 items when the program runs for 1 second.

I implemented a basic algorithm, but I'm not sure it's the fastest way to calculate `bonus`:

```
import numpy as np

cdef int[:, :] pl_list
cdef list pl_length
cdef list bonus
pl_list = np.array([[8, 7]], dtype=np.int32)

def modify(pl_list, pl_length):
    cdef int k_const = 10
    mean = np.mean(pl_list, axis=0)
    mean = np.subtract(mean, pl_length)
    dev = np.std(pl_list, axis=0)
    # elementwise divide, leaving 0 wherever the deviation is 0
    mean = np.divide(mean, dev, out=np.zeros_like(mean), where=dev != 0)

    bonus = -1 + (2 / (1 + np.exp(-k_const * mean)))
    return list(bonus)

for i in range(2499):  # simplified; the main loop actually runs while startTime - time.clock() < seconds
    rand = np.random.randint(8, 64)
    pl_length = [rand, rand - 1]

    pl_list = np.append(pl_list, [pl_length], axis=0)
    bonus = modify(pl_list, pl_length)
```

I was thinking of speeding up this program with these ideas:

1. Using `np.vstack`, `np.stack`, or maybe `np.concatenate` instead of `np.append(pl_list, [pl_length])`. (Which one might be faster?)
2. Using self-made functions to calculate `np.std` and `np.mean`, like this (because iterating over memoryviews is so fast in Cython):

```
cdef int i, sm = 0
for i in range(pl_list.shape[0]):
    sm += pl_list[i, 0]  # per-column sum; shown here for column 0
mean = sm / pl_list.shape[0]
```

3. I was also thinking of defining a static length (like 2500) for the memoryview, so I wouldn't need `np.append` and could build a queue structure on top of that NumPy array. (What about the `queue` library? Is it faster than NumPy arrays for such operations?)
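Idea 3 can be sketched in plain NumPy before moving it to Cython. This is only an illustrative sketch (`CAP`, `buf`, and `push` are hypothetical names, and `CAP = 2500` assumes the stated upper bound on iterations):

```python
import numpy as np

CAP = 2500                        # assumed upper bound on iterations
buf = np.empty((CAP, 2), dtype=np.int32)
count = 0

def push(row):
    """Write a row into the preallocated buffer; no reallocation ever happens."""
    global count
    buf[count % CAP] = row        # wraps around like a ring buffer once full
    count += 1

push([8, 7])
push([12, 11])
print(buf[:count])                # the rows written so far
```

Statistics can then be computed over the filled prefix `buf[:count]` instead of a growing array.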

Sorry if my questions are too many and too complicated; I'm just trying to get the best possible speed.

## Solution

Ignoring the `modify` function, the core of your loop is:

```
pl_list = np.array([[8, 7]], dtype=np.int32)
....

for i in range(2499):
    ....
    pl_list = np.append(pl_list, [pl_length], axis=0)
    ...
```

As a general rule, we discourage using `np.concatenate` and its derivatives inside a loop. It is faster to append to a Python list and do the concatenate once at the end (more on that later).
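That pattern looks like this in plain NumPy (a minimal sketch mirroring your loop; the random rows stand in for your real data):

```python
import numpy as np

# Collect rows in a plain Python list; list.append is O(1) amortized.
rows = [[8, 7]]
for i in range(2499):
    rand = np.random.randint(8, 64)
    rows.append([rand, rand - 1])

# One array build at the end instead of 2499 intermediate copies.
pl_list = np.array(rows, dtype=np.int32)
print(pl_list.shape)  # (2500, 2)
```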

Is `pl_list` a list or an array? By name it is a list, but as created it is an array. I haven’t studied `modify` to see whether it requires an array or list.

Look at the source code for functions like `np.append`. The base function is `np.concatenate`, which takes a list of arrays and joins them into a new array along the specified axis. In other words, it works well with a long list of arrays.

`np.append` replaces that list input with 2 arguments. So it has to be applied iteratively, and that is slow: each append creates a new array and copies all the old data.
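A quick check of both points: the two calls produce identical results, and the appended array shares no memory with its input, confirming the copy on every call:

```python
import numpy as np

a = np.array([[8, 7]], dtype=np.int32)
b = np.array([[3, 2]], dtype=np.int32)

appended = np.append(a, b, axis=0)
joined = np.concatenate([a, b], axis=0)

print(np.array_equal(appended, joined))  # True: append is a 2-operand concatenate
print(np.shares_memory(appended, a))     # False: the input was copied
```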

`np.hstack` just makes sure the list elements are at least 1-d, `np.vstack` makes them 2-d, `np.stack` adds a new dimension, etc. So basically they all do the same thing, with minor tweaks to the inputs.
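Those tweaks are easy to see on a pair of small 1-d arrays:

```python
import numpy as np

a, b = np.array([1, 2]), np.array([3, 4])

print(np.hstack([a, b]).shape)       # (4,)   joined end to end after atleast_1d
print(np.vstack([a, b]).shape)       # (2, 2) each input promoted to 2-d first
print(np.stack([a, b]).shape)        # (2, 2) joined along a brand-new axis
print(np.concatenate([a, b]).shape)  # (4,)   the primitive they all call
```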

The other model is to allocate a large enough array up front, e.g. `res = np.zeros((n, 2))`, and insert values with `res[i, :] = new_value`. Speeds are about the same as the list-append approach. This model can be moved to `cython` and typed memoryviews for a (potentially) big speed improvement.
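A sketch of the preallocation model, using the sizes from your loop (the random rows again stand in for real data):

```python
import numpy as np

n = 2500
res = np.zeros((n, 2), dtype=np.int32)
res[0] = (8, 7)

for i in range(1, n):
    rand = np.random.randint(8, 64)
    res[i] = (rand, rand - 1)  # writes in place; no reallocation, no copying

print(res.shape)  # (2500, 2)
```

In Cython, typing `res` as an `int[:, :]` memoryview makes each `res[i, 0] = ...` assignment a plain C store, which is where the big speedup comes from.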