[SOLVED] Faster way to slice a matrix in Julia

Table of Contents

Issue

I want to iterate over different matrix blocks based on an index variable. You can think of it as how you would compute the individual contributions to the loglikelihood of the different individuals on a model that uses panel data. That being said, I want it to be as fast as it can be.

I’ve already read some questions related to it. But none of them answer my question directly. For example, What is the recommended way to iterate a matrix over rows? shows ways to run over the WHOLE bunch of rows not iterating over blocks of rows. Additionally, Julia: loop over rows of matrix (or not) is also about how to iterate over every row again and not over blocks of them.

So here is my question. Say you have X, which is a 2x9 matrix and an id variable that indexes the individuals in the sample. I want to iterate over them to construct my loglikelihood contributions as fast as possible. I did it here just by slicing the matrix using booleans, but this seems relatively inefficient given I am for each individual checking the entire vector to see if they match or not.

id = [1 1 2 2 2 2 3 3 3 ]
x1 = [1 2 3 4 5 6 7 8 9]
x2 = [10 11 12 13 14 15 16 17 18]
X = transpose([x1 ; x2 ])
# 9×2 transpose(::Matrix{Int64}) with eltype Int64:
# 1  10
# 2  11
# 3  12
# 4  13
# 5  14
# 6  15
# 7  16
# 8  17
# 9  18

#Iteraring over the block of observations. 
for n in unique(id)
  select_row = vec(isone.(n.==transpose(id)))
  X_n = X[ select_row , : ]
  print(X_n)
end
#[1 10; 2 11]
#[3 12; 4 13; 5 14; 6 15]
#[7 16; 8 17; 9 18]

In other languages (see below), I created a matrix that contains each individual’s positions beforehand (outside of the loop) and then index my regressors based on that. Still, I don’t know how to do it in Julia (or if there is a function for it), and I also don’t know if this is the fastest way to go in this language.


Mata example

For example, just for the sake of illustration, in Mata (matrix language from Stata) I would do the following:

mata:
id = 1 \1 \2 \2 \2 \2 \3 \ 3\ 3 
x1 = 1 \2 \3 \4 \5 \6 \7 \ 8\ 9
x2 = 10 \11 \12 \13 \14 \15 \16 \ 17\ 18
X  = x1 , x2
/*this indicate where is each invididual*/
paninfo = panelsetup(id,1) 
paninfo
//       1   2
//    +---------+
//  1 |  1   2  |
//  2 |  3   6  |
//  3 |  7   9  |
//    +---------+
/*this gave me the total number of individuals*/
npanels = panelstats(paninfo)[1]  
npanels 
// 3
for(n=1; n <= npanels; ++n) {
    /* this selects the block of variables for individual n*/
    x_n = panelsubmatrix(X, n, paninfo)  
    x_n 
}
//        1    2
//    +-----------+
//  1 |   1   10  |
//  2 |   2   11  |
//    +-----------+
//        1    2
//    +-----------+
//  1 |   3   12  |
//  2 |   4   13  |
//  3 |   5   14  |
//  4 |   6   15  |
//    +-----------+
//        1    2
//    +-----------+
//  1 |   7   16  |
//  2 |   8   17  |
//  3 |   9   18  |
//    +-----------+
end 

Solution

First, I would recommend you to use vectors instead of matrices:

julia> id = [1, 1, 2, 2, 2, 2, 3, 3, 3]
9-element Vector{Int64}:
 1
 1
 2
 2
 2
 2
 3
 3
 3

julia> x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
9-element Vector{Int64}:
 1
 2
 3
 4
 5
 6
 7
 8
 9

julia> x2 = [10, 11, 12, 13, 14, 15, 16, 17, 18]
9-element Vector{Int64}:
 10
 11
 12
 13
 14
 15
 16
 17
 18

julia> X = [x1 x2]
9×2 Matrix{Int64}:
 1  10
 2  11
 3  12
 4  13
 5  14
 6  15
 7  16
 8  17
 9  18

Now the relatively easy way to get grouping of X on id in your context is to use SplitApplyCombine.jl:

julia> res = map(x -> view(X, x, :), groupfind(id))
3-element Dictionaries.Dictionary{Int64, SubArray}
 1 │ [1 10; 2 11]
 2 │ [3 12; 4 13; 5 14; 6 15]
 3 │ [7 16; 8 17; 9 18]

julia> res[1]
2×2 view(::Matrix{Int64}, [1, 2], :) with eltype Int64:
 1  10
 2  11

julia> res[2]
4×2 view(::Matrix{Int64}, [3, 4, 5, 6], :) with eltype Int64:
 3  12
 4  13
 5  14
 6  15

julia> res[3]
3×2 view(::Matrix{Int64}, [7, 8, 9], :) with eltype Int64:
 7  16
 8  17
 9  18

I use views to improve performance of the operation.

Answered By – Bogumił Kamiński

Answer Checked By – David Marino (BugsFixing Volunteer)

Leave a Reply

Your email address will not be published. Required fields are marked *