[SOLVED] Efficient way to find all the pairs in a list without using nested loop

Issue

Suppose I have a list that stores many 2D points. In this list, some positions are stored the same points, consider the index of positions that stored the same point as an index pair. I want to find all the pairs in the list and return all 2 by 2 index pairs. It is possible that the list has some points repeated more than two times, but only the first match needs to be treated as a pair.

For example, in the below list, I have 9 points in total and there are 5 positions containing repeated points. The indices 0, 3, and 7 store the same point ([1, 1]), and the indicies 1 and 6 store the same point ([2, 3]).

[[1, 1], [2, 3], [1, 4], [1, 1], [10, 3], [5, 2], [2, 3], [1, 1], [3, 4]]

So, for this list, I want to return the index pair as (index 0, index 3) and (index 1, index 6). The only solution I can come up with is doing this is through nested loops, which I code up as following

A = np.array([[1, 1], [2, 3], [1, 4], [1, 1], [10, 3], [5, 2], [2, 3], [1, 1], [3, 4]], dtype=int)

# I don't want to modified the original list, looping through a index list insted.
Index = np.arange(0, A.shape[0], 1, dtype=int) 
Pair = [] # for store the index pair
while Index.size != 0:

    current_index = Index[0]
    pi = A[current_index]
    Index = np.delete(Index, 0, 0)

    for j in range(Index.shape[0]):
        pj = A[Index[j]]
        distance = linalg.norm(pi - pj, ord=2, keepdims=True)
        
        if distance == 0:
            Pair.append([current_index, Index[j]])
            Index = np.delete(Index, j, 0)
            break

While this code works for me but the time complexity is O(n^2), where n == len(A), I’m wondering if is there any more efficient way to do this job with a lower time complexity. Thanks for any ideas and help.

Solution

You can use a dictionary to keep track of the indices for each point.

Then, you can iterate over the items in the dictionary, printing out the indices corresponding to points that appear more than once. The runtime of this procedure is linear, rather than quadratic, in the number of points in A:

points = {}

for index, point in enumerate(A):
    point_tuple = tuple(point)
    if point_tuple not in points:
        points[point_tuple] = []
    points[point_tuple].append(index)

for point, indices in points.items():
    if len(indices) > 1:
        print(indices)

This prints out:

[0, 3, 7]
[1, 6]

If you only want the first two indices where a point appears, you can use print(indices[:2]) rather than print(indices).

Answered By – BrokenBenchmark

Answer Checked By – Gilberto Lyons (BugsFixing Admin)

Leave a Reply

Your email address will not be published. Required fields are marked *