[SOLVED] Passing parameters to function in pandas.DataFrame.transform

Issue

I wrote an example where I want to pass mask_size as a parameter to mask_first, the function I hand to pandas.DataFrame.transform.

import numpy as np
import pandas as pd
import random

a = [a for a in [1,2,3,4,5] for i in range(25)]
b = [random.randrange(1, 10) for _ in range(0, 125)]
d = {'col1': a, 'col2': b}
df = pd.DataFrame(d)

def mask_first(df, mask_size):
    result = np.ones_like(df)
    result[0:mask_size] = 0
    return result

def apply_mask(df, mask_size = 10):
    # mask_size is not forwarded to mask_first here, which triggers the error below
    mask = df.groupby(['col1'])['col1'].transform(mask_first).astype(bool)
    df = df.loc[mask]
    return df

df = apply_mask(df, mask_size = 10)

It raises this error:

TypeError: mask_first() missing 1 required positional argument: 'mask_size'

Solution

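According to the documentation for pandas.DataFrame.transform, any extra arguments you give to transform are forwarded to the function. The transform method on a groupby object, which is what your code actually calls, forwards *args and **kwargs the same way. The DataFrame signature is: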

DataFrame.transform(func, axis=0, *args, **kwargs)

Where (extract from pandas docs):

*args: Positional arguments to pass to func.

**kwargs: Keyword arguments to pass to func.

To pass the mask size to your mask_first function, add it after the function in the transform call:

# Important part: .transform(mask_first, mask_size)
mask = df.groupby(['col1'])['col1'].transform(mask_first, mask_size).astype(bool)
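For reference, here is a minimal end-to-end sketch of the same idea, passing mask_size by keyword instead of by position. It is an illustration rather than the asker's exact code: the data is a deterministic stand-in (np.arange instead of random values), and mask_first builds its result with np.ones(len(group)) rather than np.ones_like, both of which are assumptions made to keep the example reproducible.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the question's data: 5 groups of 25 rows each.
df = pd.DataFrame({
    "col1": np.repeat([1, 2, 3, 4, 5], 25),
    "col2": np.arange(125),
})

def mask_first(group, mask_size):
    # 1 marks rows to keep, 0 marks the first `mask_size` rows of the group.
    result = np.ones(len(group), dtype=int)
    result[:mask_size] = 0
    return result

def apply_mask(df, mask_size=10):
    # Extra arguments given to transform are forwarded to mask_first.
    mask = (
        df.groupby("col1")["col1"]
        .transform(mask_first, mask_size=mask_size)
        .astype(bool)
    )
    return df.loc[mask]

kept = apply_mask(df, mask_size=10)
print(len(kept))  # 75: each of the 5 groups loses its first 10 of 25 rows
```

If you prefer not to rely on argument forwarding, wrapping the call in a lambda, as in transform(lambda g: mask_first(g, mask_size)), should give the same result.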

Answered By – aaossa

Answer Checked By – Marilyn (BugsFixing Volunteer)
