# [SOLVED] Most computationally efficient way to get average of particular pairs of rows, and concatenate all of the results with a particular row

## Issue

I have a sample array

```
import numpy as np

a = np.array(
    [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
        [10, 11, 12],
        [13, 14, 15],
    ]
)
```

And an array of index pairs whose rows I would like to average:

```
b = np.array([[1, 3], [1, 2], [2, 3]])
```

In addition, I need the final result to have the first row of `a` concatenated to each of these averages.

I can get the desired result using this

```
np.concatenate((np.tile(a[0], (3, 1)), a[b].mean(1)), axis=1)
```
```
array([[ 1. ,  2. ,  3. ,  7. ,  8. ,  9. ],
       [ 1. ,  2. ,  3. ,  5.5,  6.5,  7.5],
       [ 1. ,  2. ,  3. ,  8.5,  9.5, 10.5]])
```

I am wondering if there is a more computationally efficient way, as I've heard `concatenate` is slow (see, for example, "Numpy concatenate is slow: any alternative approach?").

I'm thinking there might be a way with a combination of advanced indexing, `.mean()`, and reshape, but I am not able to come up with anything that gives the desired array.

## Solution

The problem is not that `concatenate` is slow; in fact, it is not particularly slow. The problem is calling it in a loop to produce a growing array. That pattern is very inefficient because it creates many temporary arrays and copies. You do not use such a pattern here, so this is fine: `concatenate` is used properly and matches your intent exactly. You could create an output array and fill the left and right parts separately, but that is essentially what `concatenate` does internally anyway.

That being said, `concatenate` has a fairly large overhead, mainly noticeable for small arrays (like most Numpy functions), because of the many internal checks it performs to adapt its behaviour to the shapes of the input arrays. In addition, the implicit casting of `np.tile(a[0], (3, 1))` from `np.int_` to `np.float64` introduces another overhead. Finally, note that `mean` is not very optimized for such a case: it is faster to use `(a[b[:,0]] + a[b[:,1]]) * 0.5`, although the intent is less clear.

```
n, m = a.shape[1], b.shape[0]
res = np.empty((m, 2 * n), dtype=np.float64)
res[:, :n] = a[0]                             # Note: implicit int-to-float conversion done here
res[:, n:] = (a[b[:, 0]] + a[b[:, 1]]) * 0.5  # Also here
```
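A quick sanity check against the original `concatenate` expression (using the arrays from the question) confirms that both produce the same result:

```
ref = np.concatenate((np.tile(a[0], (3, 1)), a[b].mean(1)), axis=1)
np.testing.assert_allclose(res, ref)  # passes: both give the same (3, 6) array
```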

The rewritten operation is about 3 times faster on my machine with your example. The same factor may not hold for big input arrays (although I expect a speedup there too).
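If you want to measure this yourself, a simple `timeit` comparison looks like the following sketch (assuming `a` and `b` are defined as in the question; the 3x figure above is machine-dependent):

```
import timeit

def old():
    # Original approach: tile + mean + concatenate
    return np.concatenate((np.tile(a[0], (3, 1)), a[b].mean(1)), axis=1)

def new():
    # Preallocated approach: fill the two halves directly
    n, m = a.shape[1], b.shape[0]
    res = np.empty((m, 2 * n), dtype=np.float64)
    res[:, :n] = a[0]
    res[:, n:] = (a[b[:, 0]] + a[b[:, 1]]) * 0.5
    return res

print(timeit.timeit(old, number=100_000))
print(timeit.timeit(new, number=100_000))
```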

For big arrays, the best solution is to use Numba (or Cython) code with explicit loops, to avoid creating and filling big, expensive temporary arrays. Numba should also speed up the computation for small arrays because it removes most of the overhead of the Numpy functions (I would expect a speedup of about 5x-10x here).
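For reference, here is a minimal sketch of what such a Numba kernel could look like (the function name `combine` is mine, not from the question; it assumes a 2D `a` and a 2-column integer `b` as above):

```
import numba as nb
import numpy as np

@nb.njit
def combine(a, b):
    n, m = a.shape[1], b.shape[0]
    res = np.empty((m, 2 * n), dtype=np.float64)
    for i in range(m):
        r0, r1 = b[i, 0], b[i, 1]
        for j in range(n):
            res[i, j] = a[0, j]                          # left half: first row of `a`
            res[i, n + j] = (a[r0, j] + a[r1, j]) * 0.5  # right half: pairwise average
    return res
```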