Issue
I have two dataframes and want to check which values of col1 in df1 also occur in col1 of df2. If a value occurs, col2_new should be 1, otherwise 0. Is it best to do this with a list, i.e. convert the column of df1 into a list and then loop over the column of the other dataframe, or is there a more elegant way?
df1 (before):
index | col1 |
---|---|
1 | a |
2 | b |
3 | c |
df2:
index | col1 |
---|---|
1 | a |
2 | e |
3 | b |
df1 (after):
index | col1 | col2_new |
---|---|---|
1 | a | 1 |
2 | b | 1 |
3 | c | 0 |
Solution
Use Series.isin and convert the resulting boolean mask to integers:
df1['col2_new'] = df1['col1'].isin(df2['col1']).astype(int)
Or, with numpy.where (requires import numpy as np):
df1['col2_new'] = np.where(df1['col1'].isin(df2['col1']), 1, 0)
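For reference, here is a minimal self-contained sketch that runs both variants on the sample data from the question. The frame construction and the names mask and alt are illustrative assumptions, not part of the original answer. Note that isin compares values only, so the differing row order in df2 does not matter.

```python
import numpy as np
import pandas as pd

# Sample frames rebuilt from the tables in the question (index 1..3).
df1 = pd.DataFrame({"col1": ["a", "b", "c"]}, index=[1, 2, 3])
df2 = pd.DataFrame({"col1": ["a", "e", "b"]}, index=[1, 2, 3])

# Boolean mask: True where a df1 value appears anywhere in df2['col1'].
mask = df1["col1"].isin(df2["col1"])

# Variant 1: cast the boolean mask to 0/1 integers.
df1["col2_new"] = mask.astype(int)

# Variant 2: the same result as a NumPy array via np.where.
alt = np.where(mask, 1, 0)

print(df1)
```

Both variants avoid an explicit Python loop. astype(int) is the shorter form when only 1 and 0 are needed, while np.where is convenient if the two output values should be something other than 1 and 0.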
Answered By – jezrael
Answer Checked By – Timothy Miller (BugsFixing Admin)