[SOLVED] search in list efficiency way (nested lis)


I have a nested list with 514000 list.
I have to find duplicated list and store their position.
I wrote a code, but with bad efficiency way.514000*514000
Do you have a goold solution with good efficiency way ?
my nested list:

and so on


A much faster solution is to use a set to track the items already seen so far and add the location of the duplicates to a list. Because lists are not hashable, they can be converted to a tuple. Here is the resulting code:

seen = set()  # Unique items
result = []   # List of duplicate positions
for pos, item in enumerate(lst):
    tmp = tuple(item)
    if tmp in seen:

If you want to track where is the unique item associated to the duplicates found, you can just change seen to a dict and change seen.add(tmp) to seen[tmp] = pos. The resulting solution runs in linear time (ie. O(n) where n is the size of the input lst). This code takes about 250 ms on my machine with a list containing 514000 sub-lists of 4 short random strings.

Answered By – Jérôme Richard

Answer Checked By – Marie Seifert (BugsFixing Admin)

Leave a Reply

Your email address will not be published. Required fields are marked *