Issue
Here is the problem:
I want to define a function that compares strings from two lists (not the same size) using fuzz.ratio().
It should return the items from list1 that have at least one ratio greater than 60 against any item in list2.
def Matching(list1, list2):
    no_matching = []
    matching = []
    for item1 in list1:
        for item2 in list2:
            m_score = fuzz.ratio(item1, item2)
            if m_score < 60:
                no_matching.append(item1)
            if m_score > 60:
                matching.append(item1)
    return(matching, no_matching)
The output is not what I am aiming for. What am I doing wrong? I want only the items from list1 that have at least one match in list2 with a ratio greater than 60.
For example:
list1 = ["Real Madrid", "Benfica", "Lazio", "FC Milan"]
list2 = ["Madrid", "Barcelona", "Milan"]
for item1 in list1:
    for item2 in list2:
        m_score = fuzz.ratio(item1, item2)
        print(item1, "&", item2, m_score)
Output is:
Real Madrid & Madrid 71 # greater than 60
Real Madrid & Barcelona 20
Real Madrid & Milan 12
Benfica & Madrid 15
Benfica & Barcelona 50
Benfica & Milan 17
Lazio & Madrid 36
Lazio & Barcelona 29
Lazio & Milan 20
FC Milan & Madrid 29
FC Milan & Barcelona 24
FC Milan & Milan 77 # greater than 60
The function output should be:
matching = ["Real Madrid", "FC Milan"] # since they have at least one ratio bigger than 60
no_matching = ["Benfica", "Lazio"]
Solution
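The nested loop tests every pair, so an item from list1 is appended to no_matching once for
each list2 entry that scores below 60, even when another entry matches it. Decide once per item
instead: after item1 has been compared against all of list2, add it to no_matching only if it is
not already in the matching list. The code below gives the expected output.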
from fuzzywuzzy import fuzz

def Matching(list1, list2):
    no_matching = []
    matching = []
    for item1 in list1:
        for item2 in list2:
            m_score = fuzz.ratio(item1, item2)
            if m_score > 60:
                matching.append(item1)
                break  # one match is enough; also avoids duplicate entries
        # Checked once per item1, after the comparisons against list2.
        if item1 not in matching:
            no_matching.append(item1)
    return(matching, no_matching)

list1 = ["Real Madrid", "Benfica", "Lazio", "FC Milan"]
list2 = ["Madrid", "Barcelona", "Milan"]
print(Matching(list1, list2))
Output:
(['Real Madrid', 'FC Milan'], ['Benfica', 'Lazio'])
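The same "at least one match" rule can also be written with any(), which avoids looking the item up in the matching list. This is only a sketch, not part of the original answer: split_by_match and ratio are hypothetical names, and ratio is a standard-library stand-in built on difflib.SequenceMatcher so the example runs without fuzzywuzzy installed. Its scores may differ slightly from fuzz.ratio depending on how fuzzywuzzy is installed; pass scorer=fuzz.ratio to use the real one.

```python
from difflib import SequenceMatcher


def ratio(a, b):
    # Stand-in scorer (assumption, not fuzzywuzzy itself): 0-100 similarity.
    return round(100 * SequenceMatcher(None, a, b).ratio())


def split_by_match(list1, list2, threshold=60, scorer=ratio):
    # Hypothetical helper: partitions list1 by whether any list2 entry
    # scores strictly above the threshold.
    matching, no_matching = [], []
    for item1 in list1:
        # any() stops at the first list2 entry that scores above the threshold.
        if any(scorer(item1, item2) > threshold for item2 in list2):
            matching.append(item1)
        else:
            no_matching.append(item1)
    return matching, no_matching


list1 = ["Real Madrid", "Benfica", "Lazio", "FC Milan"]
list2 = ["Madrid", "Barcelona", "Milan"]
matching, no_matching = split_by_match(list1, list2)
print(matching, no_matching)
```

Each item from list1 lands in exactly one of the two lists, so neither list can contain duplicates unless list1 itself does.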
Answered By – Kabilan Mohanraj
Answer Checked By – Terry (BugsFixing Volunteer)