# [SOLVED] Outer product of large vectors exceeds memory

## Issue

I have three 1D vectors: T with 100k elements, and f and df with 200 elements each:

``````T = [T0, T1, ..., T99999]
f = [f0, f1, ..., f199]
df = [df0, df1, ..., df199]
``````

For each combination of elements of f and df, I have to evaluate a function like the following over T:

``````P = T*f + T**2 * df
``````

My first instinct was to use `np.outer` to evaluate the function for each combination of f and df:

``````P1 = np.outer(f, T)        # shape (200, 100000)
P2 = np.outer(df, T**2)    # shape (200, 100000)
P = np.add.outer(P1, P2)   # shape (200, 100000, 200, 100000) -- far too big
``````

However, this runs out of RAM and fails with the following error:

``````Unable to allocate 2.23 PiB for an array with shape (200, 100000, 200, 100000) and data type float64
``````

Is there a good way that I can calculate this?

My attempt using for loops:

``````n = 100
f_range = 5e-7
df_range = 1.5e-15

# f and df here are the central scalar values the grids are built around
fsrc = np.arange(f - n * f_range, f + n * f_range, f_range)        # array of 200
dfsrc = np.arange(df - n * df_range, df + n * df_range, df_range)  # array of 200

numf = len(fsrc)
numfdot = len(dfsrc)

test2D = np.zeros([numf, numfdot])

for indexf, f in enumerate(fsrc):
    for indexfd, fd in enumerate(dfsrc):
        a = make_phase(T, f, fd)       # a function that performs T*f + T**2 * df
        zlauf2d = z_n(a, n=1, norm=1)  # z_n reduces the 1D "a" to the value stored at this grid point
        test2D[indexf, indexfd] = np.copy(zlauf2d)  # stored so I can make a contour plot at the end
``````

Now test2D has shape (200, 200), which is what I want. However, the double for loop takes ages, and I want to reduce the two loops to at most one.

## Solution

``````P1 = (f[:, np.newaxis] * T).sum(axis=-1)
P2 = (df[:, np.newaxis] * T**2).sum(axis=-1)
P = P1[:, np.newaxis] + P2
``````

Alternatively, using outer:

``````P1 = (np.outer(f, T)).sum(axis=-1)
P2 = (np.outer(df, T**2)).sum(axis=-1)
P = P1[..., np.newaxis] + P2
``````

This produces an array of shape `(f.size, df.size) == (200, 200)`.

Generally speaking, if the final output array size is very large, one can either:

1. Reduce the size of the datatypes. One way is to change the datatypes of the arrays used to calculate the final output via `P1.astype(np.float32)`. Alternatively, some operations allow one to pass in a `dtype=np.float32` as a parameter.
2. Chunk the computation and work with smaller subsections of the result.

Based on the most recent edit, one can compute an array `a` with shape `(200, 200, 100000)`, then take its element-wise norm along the last axis to produce an array `z` with shape `(200, 200)`:

``````a = (
    f[:, np.newaxis, np.newaxis] * T
    + df[np.newaxis, :, np.newaxis] * T**2
)

# L1 norm along last axis.
z = np.abs(a).sum(axis=-1)
``````

This produces an array of shape `(f.size, df.size) == (200, 200)`.

Answered By – Mateen Ulhaq

Answer Checked By – Pedro (BugsFixing Volunteer)