I have two arrays A and B of shape (10000, 100, 100) (very large). I need to perform a series of operations on them and pass the results to other functions. My question is: how can I save the most memory? Let me give a specific example.
```python
import numpy as np

A = np.random.rand(10000, 100, 100)
B = np.random.rand(10000, 100, 100)

def ave_l2_error(diffs):
    for err in diffs:
        print(np.mean(err))

def ave_l1_error(diffs):
    for err in diffs:
        print(np.mean(err))

# Is there a difference in terms of memory usage between doing this:
L2 = [np.power(A - B, 2)]
L1 = [np.abs(A - B)]
ave_l2_error(L2)
ave_l1_error(L1)

# vs this:
ave_l2_error([np.power(A - B, 2)])
ave_l1_error([np.abs(A - B)])
```
I would think the first case uses more memory because it saves L1 and L2. This reddit thread discusses renaming variables, but this is a slightly different situation (or maybe not). Would the garbage collector detect that L1 and L2 are no longer used and delete them? What if the code is run in IPython (instead of a shell), where one still has access to the variables? Would that make a difference?
In the first version, the arrays created by np.power() and np.abs() will stay in memory until the script ends, because the variables L2 and L1 hold references to them and prevent them from becoming garbage.
In the second version, the arrays are garbage collected as soon as the function returns, because their only reference is the argument passed to the function, which goes away when the function exits. So this version uses less memory.
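A minimal sketch (not from the answer) that makes this difference observable with Python's tracemalloc module, using much smaller arrays than the question's so it runs quickly. It assumes NumPy 1.13 or later, which reports its array data allocations to tracemalloc:

```python
import tracemalloc
import numpy as np

# Smaller than the question's (10000, 100, 100) arrays, for speed.
A = np.random.rand(100, 10, 10)
B = np.random.rand(100, 10, 10)

def ave_l2_error(diffs):
    for err in diffs:
        np.mean(err)  # compute but don't print, to keep the sketch quiet

tracemalloc.start()
baseline, _ = tracemalloc.get_traced_memory()

# Style 1: the name L2 keeps the temporary array alive after the call.
L2 = [np.power(A - B, 2)]
ave_l2_error(L2)
after_named, _ = tracemalloc.get_traced_memory()

del L2  # release the reference before measuring the second style

# Style 2: the temporary's only reference is the argument, which
# dies when the function returns, so the array is freed immediately.
ave_l2_error([np.power(A - B, 2)])
after_anon, _ = tracemalloc.get_traced_memory()

print("held after style 1:", after_named - baseline)  # roughly the array's size
print("held after style 2:", after_anon - baseline)   # close to zero
```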
You can make the first version behave like the second if you reassign or delete the variables after using them in the function calls.
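For example, a sketch of that fix applied to the question's first version (again with smaller arrays than the original for speed): once the names are deleted, the arrays' reference counts drop to zero and CPython frees them immediately, without waiting for a garbage-collection cycle.

```python
import numpy as np

# Smaller shapes than the question's (10000, 100, 100), for speed.
A = np.random.rand(100, 10, 10)
B = np.random.rand(100, 10, 10)

def ave_l2_error(diffs):
    for err in diffs:
        print(np.mean(err))

def ave_l1_error(diffs):
    for err in diffs:
        print(np.mean(err))

L2 = [np.power(A - B, 2)]
L1 = [np.abs(A - B)]
ave_l2_error(L2)
ave_l1_error(L1)

# Drop the names as soon as they are no longer needed; the arrays
# are freed here, not at the end of the script.
del L1, L2
```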
Answered By – Barmar