Issue
I have two 2D arrays that contain XYZ points, A and B.
Array A has the shape (796704, 3) and is my original point cloud. Each point is unique except for (0, 0, 0), but those points don't matter:
A = [[x_1, y_1, z_1],
[x_2, y_2, z_2],
[x_3, y_3, z_3],
[x_4, y_4, z_4],
[x_5, y_5, z_5],
...]
Array B has the shape (N, 4) and is a cropped version of A (N<796704).
The remaining points did not change and are still equal to their counterpart in A.
The fourth column contains the segmentation value of each point.
The row order of B is completely random and doesn’t match A anymore.
B = [[x_4, y_4, z_4, 5],
[x_2, y_2, z_2, 12],
[x_6, y_6, z_6, 5],
[x_7, y_7, z_7, 3],
[x_9, y_9, z_9, 3]]
I need to reorder the rows of B so that they match the rows of A with the same point and fill in the gaps with a zero row:
B = [[0.0, 0.0, 0.0, 0],
[x_2, y_2, z_2, 12],
[0.0, 0.0, 0.0, 0],
[x_4, y_4, z_4, 5],
[0.0, 0.0, 0.0, 0],
[x_6, y_6, z_6, 5],
[x_7, y_7, z_7, 3],
[0.0, 0.0, 0.0, 0],
[x_9, y_9, z_9, 3],
[0.0, 0.0, 0.0, 0],
[0.0, 0.0, 0.0, 0],
[0.0, 0.0, 0.0, 0]
...]
In the end B should have the shape (796704, 4).
I tried using the numpy_indexed package, as was proposed in a very similar question, but that doesn't work here because B doesn't contain all the points of A:
import numpy_indexed as npi
B[npi.indices(B[:, :-1], A)]
I'm not familiar with NumPy, and my only solution would be a for-loop, but that would be far too slow for my application. Is there a fast method of solving this problem?
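For reference, the loop-based approach mentioned above could be sketched roughly as follows. It is only meant as a baseline for checking faster methods on small inputs. The function name align_rows_loop and the sample arrays are made up for illustration, and the sketch assumes that the coordinates kept in B are exactly equal to their counterparts in A.

```python
import numpy as np


def align_rows_loop(A, B):
    """Baseline sketch: put each row of B at the index of its matching point in A."""
    # Map every XYZ triple in B to the row index where it occurs.
    row_of = {tuple(p): i for i, p in enumerate(B[:, :3].tolist())}
    # Rows of A without a match in B stay zero.
    out = np.zeros((A.shape[0], B.shape[1]), dtype=B.dtype)
    for i, p in enumerate(A.tolist()):
        j = row_of.get(tuple(p))
        if j is not None:
            out[i] = B[j]
    return out


# Hypothetical sample data, not taken from the real point cloud.
A_demo = np.array([[8, 7, 4], [0, 7, 7], [4, 3, 0], [5, 5, 8], [3, 9, 5]])
B_demo = np.array([[3, 9, 5, 6], [8, 7, 4, 2], [4, 3, 0, 5]])
result_loop = align_rows_loop(A_demo, B_demo)
print(result_loop)
```

The dictionary lookup keeps each match at constant cost, but the Python-level loop over every row of A is what makes this slow for hundreds of thousands of points.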
Solution
I managed to solve this problem by using the numpy_indexed package, which I mentioned in my question.
The solution:
import numpy as np

A = np.array([[8, 7, 4],
[0, 7, 7],
[4, 3, 0],
[5, 5, 8],
[3, 9, 5]])
B = np.array([[3, 9, 5, 6],
[8, 7, 4, 2],
[4, 3, 0, 5]])
# Create a new, zero-filled, array C with length of A
C = np.zeros((A.shape[0], 4))
# Insert B at the beginning of C
C[:B.shape[0], :B.shape[1]] = B
print(C)
Out:
[[3, 9, 5, 6],
[8, 7, 4, 2],
[4, 3, 0, 5],
[0, 0, 0, 0],
[0, 0, 0, 0]]
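# Use the numpy_indexed package to reorder the rows of C so they follow A.
# With missing=-1, a row of A that isn't found in C gets the index -1,
# which selects the last row of C. Because B has fewer rows than A, that
# row is still [0, 0, 0, 0], so the gaps are filled with zeros.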
import numpy_indexed as npi
D = C[npi.indices(C[:, :-1], A, missing=-1)]
print(D)
Out:
[[8, 7, 4, 2],
[0, 0, 0, 0],
[4, 3, 0, 5],
[0, 0, 0, 0],
[3, 9, 5, 6]]
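If installing numpy_indexed is not an option, the same reordering can probably be done with plain NumPy. The following is a minimal sketch of an alternative technique, not the method used above: it labels identical points with np.unique(..., axis=0) and then looks up the matching row of B for each row of A. The function name align_rows is hypothetical, and the sketch assumes exact equality between the coordinates in A and B.

```python
import numpy as np


def align_rows(A, B):
    """Sketch: reorder B to follow A using only NumPy; unmatched rows stay zero."""
    n_a = A.shape[0]
    # Stack all coordinates so identical points in A and B get the same label.
    stacked = np.vstack([A, B[:, :3]])
    _, labels = np.unique(stacked, axis=0, return_inverse=True)
    labels = labels.reshape(-1)  # guard against a 2-D inverse in some NumPy versions
    labels_a, labels_b = labels[:n_a], labels[n_a:]
    # For each label, store the row of B that carries it (-1 means "not in B").
    b_row_for_label = np.full(labels.max() + 1, -1, dtype=np.intp)
    b_row_for_label[labels_b] = np.arange(B.shape[0])
    src = b_row_for_label[labels_a]
    # Copy the matched rows of B into place; everything else remains zero.
    out = np.zeros((n_a, B.shape[1]), dtype=B.dtype)
    found = src >= 0
    out[found] = B[src[found]]
    return out


A = np.array([[8, 7, 4], [0, 7, 7], [4, 3, 0], [5, 5, 8], [3, 9, 5]])
B = np.array([[3, 9, 5, 6], [8, 7, 4, 2], [4, 3, 0, 5]])
aligned = align_rows(A, B)
print(aligned)
```

Duplicate points in A, such as repeated (0, 0, 0) rows, all receive the same label, so they either all pick up the same row of B or all stay zero.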
Answered By – Levaru
Answer Checked By – Pedro (BugsFixing Volunteer)