I’m looking for a Python-based solution to a common task I often run into. (I know, there must be cool SQL-magic for this, but… not now.)
Given a table, we want to change the field
oid is the same.
f2 always contains
bar or is empty if
I’m currently using
itertools.combinations to avoid repeated comparison, which seems more efficient than a double
for-loop. But for very large datasets, it still becomes unhandy… considering the Big-O-Notation.
Can the problem be solved more elegantly via pandas or numpy?
Assumed that f2 always contains bar or is empty if oid is 42
Maybe we could use
groupby + transform
first as well:
df['f2'] = df.replace('', np.nan).groupby('oid')['f2'].transform('first').fillna('')
id f1 f2 oid 0 1 foo bar 42 1 2 foo 49 2 3 baz bar 42
Answered By – enke
Answer Checked By – Willingham (BugsFixing Volunteer)