[SOLVED] How to replace 0 in multiple columns and specific row with values from another dataframe

Issue

import pandas as pd

data = {'Region': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Description': ['D1', 'D2', 'D1', 'D2', 'D1', 'D2'],
        'Baseline 1':[1,2,3,4,5,6]
        :
        'Baseline N': [some numbers]
        'Regime 1': [0,0,0,0,2,3]
        :
        'Regime N': [some numbers]

        }

df1 = pd.DataFrame(data)

replace_data = {'Region':['A','B','C'],
                 'Values':[6,7,8]}

df2 = pd.DataFrame(replace_data)

I have a dataframe which has ‘Region’ as index and need to replace all the regions where description is ‘D2’ and **Regime 1 to N having 0 ** (any regime like column having 0) with the ‘replace_data’ dataframe value of the particular region.

So Regime 1 from above code example should have final value as [0,6,0,7,2,3] (Notice 0 replaced by 6 & 7 from the replace_data frame for region A and B). For each regime column, this need to be done. Rest of the columns (like baseline) should remain as it is.

I can run a loop for all regime like columns and replace but I feel that will be very inefficient and time consuming since I have 20+ such columns and need to similar operations multiple time. Any efficient way to do this?

df.loc[((df['Description']=='D2'),regime_all_cols].replace({0:df2['Values'].values[0]})

This used to work when I had only 1 region but with multiple it doesn’t work accurately.

Solution

You can try to slice the portion of the DataFrame to replace and replace it with a filled version of the DataFrame

cols = df1.filter(like='Regime').columns.to_list()
d = dict(zip(replace_data['Region'], replace_data['Values']))

repl_df = df1.mask(df1.eq(0)).T.fillna(df1['Region'].map(d)).T

df1.loc[df1['Description'].eq('D2'), cols] = repl_df

alternative as loop:

mask = df1['Description'].eq('D2')
d = dict(zip(replace_data['Region'], replace_data['Values']))
repl = df1['Region'].map(d)

for c in df1.filter(like='Regime'):
    S = df1.loc[mask, c]
    df1.loc[mask, c] = S.mask(S.eq(0)).fillna(repl)

output:

  Region Description  Baseline 1  Baseline N Regime 1 Regime N
0      A          D1           1           9        0        9
1      A          D2           2           9        6        9
2      B          D1           3           9        0        9
3      B          D2           4           9        7        9
4      C          D1           5           9        2        9
5      C          D2           6           9      3.0        9

Answered By – mozway

Answer Checked By – Dawn Plyler (BugsFixing Volunteer)

Leave a Reply

Your email address will not be published. Required fields are marked *