I have simulation program written in Julia that does something equivalent to this as a part of its main loop:
# Some fake data M = [randn(100,100) for m=1:100, n=1:100] W = randn(100,100) work = zip(W,M) result = mapreduce(x -> x*x, +,work)
In other words, a simple sum of weighted matrices. Timing the above code yields
0.691084 seconds (79.03 k allocations: 1.493 GiB, 70.59% gc time, 2.79% compilation time)
I am surprised about the large number of memory allocations, as this problem should be possible to do in-place. To see if it was my use of mapreduce that was wrong I also tested the following equivalent implementation:
@time begin res = zeros(100,100) for m=1:100 for n=1:100 res += W[m,n] * M[m,n] end end end
0.442521 seconds (50.00 k allocations: 1.491 GiB, 70.81% gc time)
So, if I wrote this in C++ or Fortran it would be simple to do all of this in-place. Is this impossible in Julia? Or am I missing something here…?
It is possible to do it in place like this:
function ws(W, M) res = zeros(100,100) for m=1:100 for n=1:100 @. res += W[m,n] * M[m, n] end end return res end
and the timing is:
julia> @time ws(W, M); 0.100328 seconds (2 allocations: 78.172 KiB)
Note that in order to perform this operation in-place I used broadcasting (I could also use loops, but it would be the same).
The problem with your code is that in line:
res += W[m,n] * M[m,n]
You get two allocations:
- When you do multiplication
W[m,n] * M[m,n]a new matrix is allocated.
- When you do addition
res += ...again a matrix is allocated
By using broadcasting with
@. you perform an in-place operation, see https://docs.julialang.org/en/v1/manual/mathematical-operations/#man-dot-operators for more explanations.
Additionally note that I have wrapped the code inside a function. If you do not do it then access both
M is type unstable which also causes allocations, see https://docs.julialang.org/en/v1/manual/performance-tips/#Avoid-global-variables.
Answered By – Bogumił Kamiński
Answer Checked By – Clifford M. (BugsFixing Volunteer)