[SOLVED] replace() method not working on Pandas DataFrame

Issue

I have looked up this issue and most questions are for more complex replacements. However in my case I have a very simple dataframe as a test dummy.

The aim is to replace a string anywhere in the dataframe with an nan, however this does not seem to work (i.e. does not replace; no errors whatsoever). I’ve tried replacing with another string and it does not work either. E.g.

d = {'color' : pd.Series(['white', 'blue', 'orange']),
   'second_color': pd.Series(['white', 'black', 'blue']),
   'value' : pd.Series([1., 2., 3.])}
df = pd.DataFrame(d)
df.replace('white', np.nan)

The output is still:

      color second_color  value
  0   white        white      1
  1    blue        black      2
  2  orange         blue      3

Solution

You need to assign back

df = df.replace('white', np.nan)

or pass param inplace=True:

In [50]:
d = {'color' : pd.Series(['white', 'blue', 'orange']),
   'second_color': pd.Series(['white', 'black', 'blue']),
   'value' : pd.Series([1., 2., 3.])}
df = pd.DataFrame(d)
df.replace('white', np.nan, inplace=True)
df

Out[50]:
    color second_color  value
0     NaN          NaN    1.0
1    blue        black    2.0
2  orange         blue    3.0

Most pandas ops return a copy and most have param inplace which is usually defaulted to False

Answered By – EdChum

Answer Checked By – Mary Flores (BugsFixing Volunteer)

Leave a Reply

Your email address will not be published. Required fields are marked *