[SOLVED] What regular expression would allow me to remove numbers up until English characters start? (Python)

Issue

I need to turn strings into a certain format:

#1 - New York to New York

#4 - London to London

etc.

I originally just removed the special characters, but that also removed the spaces, so I got results such as NewYork.

My original way:

''.join(filter(str.isalpha, myString))

So I basically need to remove the #, the number, the spaces (before the city name starts) and the -

Solution

I suggest splitting the string into two chunks on the ' - ' substring and taking the last chunk:

result = myString.split(' - ', 1)[-1]

See a Python demo:

texts = ['#1 - New York', '#4 - London']
for myString in texts:
    print(myString, '=>', myString.split(' - ', 1)[-1])

Output:

#1 - New York => New York
#4 - London => London
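
The `1` (maxsplit) and the `[-1]` index matter in edge cases. Here is a minimal sketch of how the same expression behaves when ' - ' also appears inside the city name, or is missing entirely. The helper name `strip_prefix` and the extra inputs are made up for illustration and are not part of the original question:

```python
def strip_prefix(text):
    # Hypothetical helper wrapping the one-liner from the answer.
    # maxsplit=1 splits only on the first ' - ', so later ones stay in the name.
    # [-1] returns the whole string unchanged when ' - ' is absent.
    return text.split(' - ', 1)[-1]


for sample in ['#1 - New York', '#7 - Stratford - upon - Avon', 'London']:
    print(sample, '=>', strip_prefix(sample))
```

With `[1]` instead of `[-1]`, the last input would raise an IndexError, because the split yields a single-element list.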

As for a regex solution, you can remove any non-letters at the start of the string with re.sub(r'^[\W\d_]+', '', myString) or re.sub(r'^[^a-zA-Z]+', '', myString). Note that ^[\W\d_]+ is fully Unicode-aware (it stops at the first letter of any script), while ^[^a-zA-Z]+ recognizes only ASCII letters, so it would also strip a leading non-ASCII letter such as É.
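
The two patterns can be compared side by side with a short sketch. The function names and the third sample string (a city starting with a non-ASCII letter) are assumptions added for illustration; only the two patterns come from the answer:

```python
import re

# Unicode-aware: strips leading non-word characters, digits, and underscores.
LEADING_NON_LETTERS = re.compile(r'^[\W\d_]+')
# ASCII-only: strips everything up to the first a-z or A-Z character.
LEADING_NON_ASCII_LETTERS = re.compile(r'^[^a-zA-Z]+')


def strip_unicode_aware(text):
    return LEADING_NON_LETTERS.sub('', text)


def strip_ascii_only(text):
    return LEADING_NON_ASCII_LETTERS.sub('', text)


for sample in ['#1 - New York', '#4 - London', '#12 - Łomża']:
    print(sample, '=>', strip_unicode_aware(sample), '|', strip_ascii_only(sample))
```

For the first two inputs both functions give the same result. For the third, the ASCII-only pattern also removes the leading Ł, which is why the Unicode-aware pattern is the safer choice if city names may contain non-English letters.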

Answered By – Wiktor Stribiżew

Answer Checked By – Pedro (BugsFixing Volunteer)
