[SOLVED] Assign dataframe to variable outside for loop or use it directly inside for loop in Python

Issue

option 1:

a = np.unique(df.values)
for i in range(len(df2)):  # assumes df2 has the default 0..n-1 integer index
  if df2.loc[i,'col1'] in a:
    df2.loc[i,'col2'] = 'Ok'
  else:
    df2.loc[i,'col2'] = 'No'

option 2:

for i in range(len(df2)):  # assumes df2 has the default 0..n-1 integer index
  if df2.loc[i,'col1'] in np.unique(df.values):
    df2.loc[i,'col2'] = 'Ok'
  else:
    df2.loc[i,'col2'] = 'No'

Which is better in terms of memory and speed in Python?

Edited for clarity on the operation inside the for loop.

Solution

Both are inefficient. The second is worse because it recomputes the unique values on every iteration, whereas the first computes them once before the loop.

Use vectorized code instead:

df2['col2'] = df2['col1'].isin(np.unique(df.values)).map({True: 'Ok', False: 'No'})
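
As a minimal, self-contained sketch of the one-liner above: the sample contents of df and df2 below are made up for illustration (the question does not show the real data), and the names allowed and alt are hypothetical. The np.where variant is an alternative spelling, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df and df2 are not shown in the question.
df = pd.DataFrame({"x": [1, 2, 3], "y": [3, 4, 5]})
df2 = pd.DataFrame({"col1": [2, 7, 5, 9]})

# Compute the lookup values once, instead of once per row.
allowed = np.unique(df.values)

# Vectorized membership test, then map the booleans to labels.
df2["col2"] = df2["col1"].isin(allowed).map({True: "Ok", False: "No"})

# Alternative spelling with np.where (an assumption, not from the answer).
alt = np.where(df2["col1"].isin(allowed), "Ok", "No")

print(df2)
```

Both forms test the whole column at once, so no Python-level loop over rows is needed and the result does not depend on df2 having a particular index.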

Answered By – mozway

Answer Checked By – Senaida (BugsFixing Volunteer)
