[SOLVED] Intersect split string with partial words on list (possibly with regex)


I have two lists:

keywords = ['critic', 'argu', 'dog', 'cat']
splitSentences = ['Add', 'critical', 'argument', 'birds']

I need to find how many words in splitSentences begin with one of the words in keywords. In my example, that would be 2 (critical matches "critic" and argument matches "argu").

The problem is that doing set(keywords).intersection(splitSentences) returns 0. I tried prefixing every word in keywords with ^, but it still returns 0.

Apologies, I'm quite new to Python. I'm working in a Jupyter notebook.


With regex:

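import re

for i in keywords:
    count = 0  # reset for each keyword
    # re.match already anchors at the start of the string, so '^' is optional;
    # re.escape guards against keywords containing regex metacharacters
    pref = '^' + re.escape(i)
    for word in splitSentences:
        if re.match(pref, word):
            count += 1
    print(i, count)  # number of words that start with this keyword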
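
If a single total is wanted (the 2 from the question) rather than a count per keyword, one possible regex variant is to join the keywords into one alternation pattern. This is only a sketch and not part of the original answer; the names pattern and total are made up for illustration.

```python
import re

keywords = ['critic', 'argu', 'dog', 'cat']
splitSentences = ['Add', 'critical', 'argument', 'birds']

# Build one pattern such as 'critic|argu|dog|cat'; re.escape keeps any
# special characters in a keyword literal.
pattern = re.compile('|'.join(re.escape(k) for k in keywords))

# pattern.match only succeeds at the start of the word, so each word is
# counted at most once even if several keywords would fit.
total = sum(1 for word in splitSentences if pattern.match(word))
print(total)  # 2
```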

The semi one-liner:

for i in keywords:
    print(sum([1 for word in splitSentences if word.startswith(i)]))

The one-liner:

print({el:sum([1 for word in splitSentences if word.startswith(el)]) for el in keywords})
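
The snippets above count matches per keyword. For the overall total asked about in the question, a minimal sketch without regex (an addition, not from the original answer) can rely on str.startswith accepting a tuple of prefixes. The names prefixes and total are illustrative.

```python
keywords = ['critic', 'argu', 'dog', 'cat']
splitSentences = ['Add', 'critical', 'argument', 'birds']

# str.startswith accepts a tuple of prefixes and returns True if the
# word starts with any of them, so each word is tested exactly once.
prefixes = tuple(keywords)
total = sum(word.startswith(prefixes) for word in splitSentences)
print(total)  # 2
```

Note that the match is case-sensitive; lowercasing both lists first (for example with str.lower) would be needed if 'Critical' should also match 'critic'.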

Answered By – BiRD

Answer Checked By – Gilberto Lyons (BugsFixing Admin)