## Issue

I have a sample array

```
import numpy as np
a = np.array(
[
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12],
[13, 14, 15],
]
)
```

And an array of index pairs, where each pair selects the rows I would like to average:

```
b = np.array([[1,3], [1,2], [2,3]])
```

In addition, I need the final result to have the first row concatenated to each of these averages.

I can get the desired result using this:

```
np.concatenate( (np.tile(a[0],(3,1)), a[b].mean(1)), axis=1)
```

```
array([[ 1. , 2. , 3. , 7. , 8. , 9. ],
[ 1. , 2. , 3. , 5.5, 6.5, 7.5],
[ 1. , 2. , 3. , 8.5, 9.5, 10.5]])
```

I am wondering if there is a more computationally efficient way, as I’ve heard that concatenate is slow:

Numpy concatenate is slow: any alternative approach?

I’m thinking there might be a way with a combination of advanced indexing, `.mean()`, and reshape, but I am not able to come up with anything that gives the desired array.

## Solution

The problem is not that `concatenate` is slow. In fact, it is not so slow. The problem is using it in a loop to build a **growing array**. That pattern is very inefficient because it creates many temporary arrays and copies. You do not use such a pattern here, so this is fine: `concatenate` is used properly and matches your intent. You could create an array and fill the left and right parts separately, but that is what `concatenate` should end up doing anyway.

That being said, **`concatenate` has quite a big overhead, mainly for small arrays** (like most NumPy functions), because of its many internal checks (needed to adapt its behaviour to the shapes of the input arrays). Moreover, the **implicit cast** from `np.int_` to `np.float64` of `np.tile(a[0],(3,1))` introduces another overhead. Finally, note that `mean` is *not very optimized for such a case*. It is faster to use `(a[b[:,0]] + a[b[:,1]]) * 0.5`, although the intent is less clear.

```
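n, m = a.shape[1], b.shape[0]
res = np.empty((m, n*2), dtype=np.float64)  # One row per index pair, 2*n columns
res[:,:n] = a[0]  # Left half: first row of a, broadcast to every row (implicit conversion done here)
res[:,n:] = (a[b[:,0]] + a[b[:,1]]) * 0.5  # Right half: pairwise averages (also converted here)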
```
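As a sanity check, here is a minimal sketch that compares the preallocation approach with the `concatenate` expression from the question. The function names `with_concatenate` and `with_preallocation` are made up for this illustration, and the sketch assumes `b` always holds exactly two indices per row.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]])
b = np.array([[1, 3], [1, 2], [2, 3]])

def with_concatenate(a, b):
    # Approach from the question: tile the first row, average the selected rows, join.
    return np.concatenate((np.tile(a[0], (b.shape[0], 1)), a[b].mean(1)), axis=1)

def with_preallocation(a, b):
    # Allocate the (m, 2*n) result once, then fill each half in place.
    n, m = a.shape[1], b.shape[0]
    res = np.empty((m, 2 * n), dtype=np.float64)
    res[:, :n] = a[0]                                # first row, broadcast to all m rows
    res[:, n:] = (a[b[:, 0]] + a[b[:, 1]]) * 0.5     # average of each pair of rows
    return res

# Both approaches should produce the same (3, 6) float array.
assert np.allclose(with_concatenate(a, b), with_preallocation(a, b))
```

If `b` had more than two columns, the explicit sum of two indexed arrays would no longer apply and `a[b].mean(1)` would be the simpler choice.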

The resulting operation is about **3 times faster** on my machine with your example. It may not be the case for big input arrays (although I expect a speed up too).
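To measure this on your own machine, a small `timeit` harness along these lines should work. The helper names `concat_version` and `prealloc_version` and the run count are assumptions for this sketch, and the numbers will vary with hardware and NumPy version.

```python
import timeit
import numpy as np

a = np.arange(1, 16).reshape(5, 3)          # same values as the sample array
b = np.array([[1, 3], [1, 2], [2, 3]])

def concat_version():
    return np.concatenate((np.tile(a[0], (3, 1)), a[b].mean(1)), axis=1)

def prealloc_version():
    n, m = a.shape[1], b.shape[0]
    res = np.empty((m, 2 * n), dtype=np.float64)
    res[:, :n] = a[0]
    res[:, n:] = (a[b[:, 0]] + a[b[:, 1]]) * 0.5
    return res

runs = 10_000  # arbitrary; increase for more stable timings
t_concat = timeit.timeit(concat_version, number=runs)
t_prealloc = timeit.timeit(prealloc_version, number=runs)
print(f"concatenate: {t_concat / runs * 1e6:.2f} us per call")
print(f"preallocate: {t_prealloc / runs * 1e6:.2f} us per call")
```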

For big arrays, the best solution is to use **Numba** (or Cython) code with explicit loops, which avoids creating and filling big, expensive **temporary arrays**. Numba should also speed up the computation on small arrays because it removes most of the overhead of NumPy function calls (I expect a speed-up of about 5x-10x here).
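A loop-based version could look roughly like the sketch below. It is untimed and only illustrates the idea: the function name `first_row_and_pair_means` is hypothetical, the code assumes two indices per row of `b`, and the `try`/`except` fallback (an addition for this sketch) simply runs the same loops as plain Python when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        # Fallback for this sketch only: no compilation, plain Python loops.
        return func

@njit
def first_row_and_pair_means(a, b):
    m, n = b.shape[0], a.shape[1]
    res = np.empty((m, 2 * n), dtype=np.float64)
    for i in range(m):
        r0, r1 = b[i, 0], b[i, 1]
        for j in range(n):
            res[i, j] = a[0, j]                           # left half: first row of a
            res[i, n + j] = (a[r0, j] + a[r1, j]) * 0.5   # right half: pair average
    return res

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]])
b = np.array([[1, 3], [1, 2], [2, 3]])
print(first_row_and_pair_means(a, b))
```

The result is written directly into `res`, so no intermediate arrays such as `a[b]` are created. Keep in mind that the first call with Numba includes compilation time, so it should be excluded from any timing.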

Answered By – Jérôme Richard
