[SOLVED] python – Create a column combining the first and last name columns with a condition

Issue

I am trying to create a new column (full_name) from the first name and family name columns, based on a condition: if the name column is not empty, its value should be used as is; if the name column is empty, the firstname and familyname values should be joined and used instead.

This is what the sample data looks like:

   name         |   firstname      | familyname 
kim humphrey    |    NaN           |    NaN
 NaN            |  moustafa        |   elkashlan
 NaN            |   Joey           |    Lamp

I tried the Python code below:
df_total['Full_Name'] = np.where(df_total[['FAMILYNAME', 'FIRSTNAME']].eq('').any(axis='index'), df_total['NAME'], df_total[['FAMILYNAME', 'FIRSTNAME']].apply(' '.join, axis='index')).all(axis=1)

However, the following error was returned:
`TypeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_24080/1087141216.py in <module>
1 df_total['Full_Name'] = np.where(df_total[['FAMILYNAME', 'FIRSTNAME']].eq('NaN').any(axis='index'),
2 df_total['NAME'],
----> 3 df_total[['FAMILYNAME', 'FIRSTNAME']].apply(' '.join, axis='index')).all(axis=1)

~\anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, raw, result_type, args, **kwargs)
8738 kwargs=kwargs,
8739 )
-> 8740 return op.apply()
8741
8742 def applymap(

~\anaconda3\lib\site-packages\pandas\core\apply.py in apply(self)
686 return self.apply_raw()
687
--> 688 return self.apply_standard()
689
690 def agg(self):

~\anaconda3\lib\site-packages\pandas\core\apply.py in apply_standard(self)
810
811 def apply_standard(self):
--> 812 results, res_index = self.apply_series_generator()
813
814 # wrap results

~\anaconda3\lib\site-packages\pandas\core\apply.py in apply_series_generator(self)
826 for i, v in enumerate(series_gen):
827 # ignore SettingWithCopy here in case the user mutates
--> 828 results[i] = self.f(v)
829 if isinstance(results[i], ABCSeries):
830 # If we have a view on v, we need to make a copy because

TypeError: sequence item 0: expected str instance, float found`

My desired output is shown below.

full_name         |  name         |   firstname      | familyname 
kim humphrey      |kim humphrey   |    NaN           |    NaN
moustafa elkashlan|NaN            |  moustafa        |   elkashlan
Joey Lamp         |NaN            |   Joey           |    Lamp

Solution

The code below splits the work into two assignment steps: one fills the new column for rows where the name column is not null, and the other fills it for rows where it is null.

mask = df_total['name'].notnull()
df_total.loc[mask, 'full_name'] = df_total.loc[mask, 'name']
df_total.loc[~mask, 'full_name'] = df_total.loc[~mask, 'firstname'].astype(str) + " " + df_total.loc[~mask, 'familyname'].astype(str)
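
As a quick check, here is a minimal self-contained sketch of the two-step approach run against the sample data from the question. The DataFrame is rebuilt by hand from the table above, so the lowercase column names and the `familyname` column are assumptions taken from that table rather than from the real `df_total`.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question (assumed column names).
df_total = pd.DataFrame({
    "name": ["kim humphrey", np.nan, np.nan],
    "firstname": [np.nan, "moustafa", "Joey"],
    "familyname": [np.nan, "elkashlan", "Lamp"],
})

mask = df_total["name"].notnull()

# Step 1: rows that already have a name keep it.
df_total.loc[mask, "full_name"] = df_total.loc[mask, "name"]

# Step 2: the remaining rows get "firstname familyname".
df_total.loc[~mask, "full_name"] = (
    df_total.loc[~mask, "firstname"].astype(str)
    + " "
    + df_total.loc[~mask, "familyname"].astype(str)
)

print(df_total[["full_name", "name", "firstname", "familyname"]])
```

Note that `.astype(str)` turns a missing value into the literal text "nan", so this assumes that rows without a name always have both firstname and familyname filled in.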

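Your approach is sound, but the issue is the .eq(''). NaN does not equal an empty string, and it is a float rather than a string, which is why " ".join raises the error you're receiving. .isnull() is probably the condition you're looking for. Note also that np.where evaluates both branches for every row, so the joined column has to be built in a way that tolerates NaN; .str.cat does this, returning NaN where either part is missing:

df_total['full_name'] = np.where(
    df_total[['firstname', 'familyname']].isnull().any(axis=1),
    df_total['name'],
    df_total['firstname'].str.cat(df_total['familyname'], sep=' ')
)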
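
To see why the original attempt failed and how an np.where version behaves on the sample data, here is a small self-contained sketch. The DataFrame is rebuilt from the table in the question, `str.cat` is one NaN-tolerant way to build the joined column (chosen here for illustration), and the names `joined` and `join_failed` are hypothetical helpers.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question (assumed column names).
df_total = pd.DataFrame({
    "name": ["kim humphrey", np.nan, np.nan],
    "firstname": [np.nan, "moustafa", "Joey"],
    "familyname": [np.nan, "elkashlan", "Lamp"],
})

# NaN is a float and never compares equal to the empty string,
# so .eq('') does not detect the missing values.
found_empty = bool(df_total["firstname"].eq("").any())

# str.join rejects non-string items; this is the TypeError in the traceback.
try:
    " ".join([np.nan, np.nan])
    join_failed = False
except TypeError:
    join_failed = True

# np.where computes both branches for all rows, so the joined column is
# built with str.cat, which gives NaN instead of raising when a part is missing.
joined = df_total["firstname"].str.cat(df_total["familyname"], sep=" ")

df_total["full_name"] = np.where(
    df_total[["firstname", "familyname"]].isnull().any(axis=1),
    df_total["name"],
    joined,
)

print(df_total["full_name"].tolist())
```

With this condition, a row where only one of firstname or familyname is missing falls back to the name column, which may itself be NaN.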

Answered By – Charles Bushrow

Answer Checked By – Candace Johnson (BugsFixing Volunteer)
