Issue
I have this code:
import numpy as np
import pandas as pd
df = pd.DataFrame({'r': {0: '01', 1: '02', 2: '03', 3: '04', 4: ''},
'an': {0: 'a', 1: 'b,c', 2: '', 3: 'c,a,b', 4: ''}})
yielding the following dataframe:
    r  an
0  01  a
1  02  b,c
2  03
3  04  c,a,b
4
Using np.select, the desired output is as follows:
    r  an     s
0  01  a      13
1  02  b,c    [88,753]
2  03
3  04  c,a,b  [789,48,89]
4
I tried using the following code:
conditions=[
(df['an']=='a')&(df['r']=='01'),
(df['an']=='b')&(df['r']=='01'),
(df['an']=='c')&(df['r']=='01'),
(df['an']=='d')&(df['r']=='01'),
(df['an']=='')&(df['r']=='01'),
(df['an']=='a')&(df['r']=='02'),
(df['an']=='b')&(df['r']=='02'),
(df['an']=='c')&(df['r']=='02'),
(df['an']=='d')&(df['r']=='02'),
(df['an']=='')&(df['r']=='02'),
(df['an']=='a')&(df['r']=='03'),
(df['an']=='b')&(df['r']=='03'),
(df['an']=='c')&(df['r']=='03'),
(df['an']=='d')&(df['r']=='03'),
(df['an']=='')&(df['r']=='03'),
(df['an']=='a')&(df['r']=='04'),
(df['an']=='b')&(df['r']=='04'),
(df['an']=='c')&(df['r']=='04'),
(df['an']=='d')&(df['r']=='04'),
(df['an']=='')&(df['r']=='04')
]
choices=[
13,
75,
6,
89,
'-',
45,
88,
753,
75,
'-',
0.2,
15,
79,
63,
'-',
48,
89,
789,
15,
'-',
]
df['s']=np.select(conditions, choices)
Unfortunately, the code above only returned the desired output for row 0 (a single value); for the other rows it returned 0.
Is it possible to use np.select with a range of values?
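For context on the 0 results: each condition compares the whole string in 'an' with a single letter, so a value like 'b,c' matches none of them and np.select falls back to its default, which is 0. A minimal sketch of that behavior, cut down to two rows and two conditions (the reduction is an assumption made for brevity, not the full setup above):

```python
import numpy as np
import pandas as pd

# reduced version of the question's data: one single value, one comma-separated value
df = pd.DataFrame({'r': ['01', '02'], 'an': ['a', 'b,c']})

conditions = [
    (df['an'] == 'a') & (df['r'] == '01'),  # matches row 0
    (df['an'] == 'b') & (df['r'] == '02'),  # 'b,c' == 'b' is False, so row 1 never matches
]
choices = [13, 88]

# rows where no condition is True receive the default (0 unless default= is passed)
result = np.select(conditions, choices)
print(result)
```

Row 1 gets the default because equality is tested against the complete string 'b,c', not against its comma-separated parts.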
Solution
IIUC, use a container (Series/DataFrame/dictionary) to contain the matches, then reference them using a loop:
# mapping the references, can be any value
df_map = pd.DataFrame({'a': ['sa01', 'sa02', 'sa03', 'sa04'],
'b': ['sb01', 'sb02', 'sb03', 'sb04'],
'c': ['sc01', 'sc02', 'sc03', 'sc04'],
'd': ['sd01', 'sd02', 'sd03', 'sd04'],
'': ['s01', 's02', 's03', 's04'], # optional
}, index=['01', '02', '03', '04']
)
# derive a dictionary
# (you could also manually define the dictionary
# if not all combinations are needed)
d = df_map.stack().to_dict()
# {('01', 'a'): 'sa01',
# ('01', 'b'): 'sb01',
# ('01', 'c'): 'sc01',
# ('01', 'd'): 'sd01',
# ('01', ''): 's01',
# ('02', 'a'): 'sa02',
# ...}
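# map the values: a list when 'an' holds several items, a single value otherwise
# (the := operator requires Python 3.8+)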
df['s'] = [l if len(l:=[d.get((r, e)) for e in s.split(',')])>1 else l[0]
for r,s in zip(df['r'], df['an'])]
output:
    r  an     s
0  01  a      sa01
1  02  b,c    [sb02, sc02]
2  03         s03
3  04  c,a,b  [sc04, sa04, sb04]
4             None
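If you want the numeric values from the question instead of the placeholder strings, the same loop should work with a hand-written dictionary. A sketch, assuming only the (r, an) pairs that appear in the desired output are needed; `lookup` and `map_row` are names made up for this illustration, and pairs without an entry come back as None rather than blank:

```python
import pandas as pd

df = pd.DataFrame({'r': ['01', '02', '03', '04', ''],
                   'an': ['a', 'b,c', '', 'c,a,b', '']})

# (r, an) -> value, copied from the question's choices list;
# only the pairs used in the desired output are listed (assumption)
lookup = {
    ('01', 'a'): 13,
    ('02', 'b'): 88, ('02', 'c'): 753,
    ('04', 'a'): 48, ('04', 'b'): 89, ('04', 'c'): 789,
}

def map_row(r, an):
    """Hypothetical helper: look up every comma-separated item of `an` for row key `r`."""
    values = [lookup.get((r, e)) for e in an.split(',')]
    # keep a list only when there were several items, as in the answer's loop
    return values if len(values) > 1 else values[0]

df['s'] = [map_row(r, an) for r, an in zip(df['r'], df['an'])]
print(df)
```

Splitting the walrus one-liner into a small function is only a readability choice; the logic is the same as in the loop above.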
Answered By – mozway
Answer Checked By – Cary Denson (BugsFixing Admin)