Issue
I have a DataFrame with 4000 rows and 5 columns.
It holds data from multiple Excel workbooks that I read into one single sheet. Now I want to rearrange it horizontally: every time the header of the original Excel sheet appears in the data, I want the block of rows that follows it to be placed next to the previous block, side by side.
a b
. .
. .
. .
. .
a b
. .
. .
a b
. .
. .
. .
. .
a b
. .
. .
. .
. .
. .
. .
a b
. .
. .
and I want to have something like
a b a b a b a b a b
. . . . . . . . . .
. . . . . . . . . .
. .     . . . .
. .     . . . .
            . .
            . .
Amendment:
   symbol    weight     lqdty      date
0    1712  0.007871  7.023737  20210104
1    1726  0.007650  3.221021  20210104
2    1824  0.032955  3.475508  20210104
0    1871  0.006443  4.615002  20210105
1    1887  0.007840  6.678486  20210105
2    1871  0.006443  4.615002  20210105
3    1887  0.007840  6.678486  20210105
0    1871  0.006443  4.615002  20210106
1    1887  0.007840  6.678486  20210106
Solution
I assumed "a" and "b" are the column names here.
Create group labels from the positions where the column names reappear in the data and set_index with them. Then filter out the rows that hold the column names and stack the DataFrame. This gives a MultiIndex Series whose outer index level is the group and whose inner level is the column name. Finally, groupby + agg(list) + DataFrame + transpose produces the desired DataFrame.
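The steps above could be sketched as follows for the first layout in the question, where the header row reappears inside the data. This is a minimal sketch under stated assumptions, not the answer's original code: the sample values and the names is_header, block, stacked and lists are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: columns "a" and "b", with the header repeated as data rows.
df = pd.DataFrame({
    "a": [1, 2, 3, "a", 4, 5, "a", 6],
    "b": [10, 20, 30, "b", 40, 50, "b", 60],
})
cols = df.columns.tolist()

# Rows that repeat the header; their running count labels each block 0, 1, 2, ...
is_header = df.eq(cols).all(axis=1)
block = is_header.cumsum()

# Drop the header rows, index the rest by block label, and stack into a
# Series indexed by (block, column name).
stacked = df[~is_header].set_index(block[~is_header]).stack()

# Collect the values of every (block, column name) pair into a list.
lists = stacked.groupby(level=[0, 1]).agg(list)

# One row per list, padded where a block is shorter, then transposed so that
# the blocks sit side by side.
out = pd.DataFrame(lists.tolist(), index=lists.index.get_level_values(1)).fillna("").T
print(out)
```

Because each list holds the rows of one block in their original order, row k of out is the k-th row of every block that is long enough, and shorter blocks are padded with empty strings.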
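import pandas as pd

# In the amended data the index restarts at 0 in every block, so group on (index, column name).
s = df.stack().groupby(level=[0, 1]).agg(list)
# One row per (index, column name) pair, padded with '' where the list is shorter, then transposed.
out = pd.DataFrame(s.tolist(), index=s.index.get_level_values(1)).fillna('').T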
Output:
symbol weight lqdty date symbol weight lqdty date symbol weight lqdty date symbol weight lqdty date
0 1712.0 0.007871 7.023737 20210104.0 1726.0 0.00765 3.221021 20210104.0 1824.0 0.032955 3.475508 20210104.0 1887.0 0.00784 6.678486 20210105.0
1 1871.0 0.006443 4.615002 20210105.0 1887.0 0.00784 6.678486 20210105.0 1871.0 0.006443 4.615002 20210105.0
2 1871.0 0.006443 4.615002 20210106.0 1887.0 0.00784 6.678486 20210106.0
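To reproduce this output, the amended sample can be rebuilt by hand. The following is a minimal sketch: the way df is constructed here is an assumption about how the data was loaded, with the index restarting at 0 in each date block as shown in the question.

```python
import pandas as pd

# Assumed reconstruction of the amended sample; the index restarts at 0 for each date.
df = pd.DataFrame(
    {
        "symbol": [1712, 1726, 1824, 1871, 1887, 1871, 1887, 1871, 1887],
        "weight": [0.007871, 0.007650, 0.032955, 0.006443, 0.007840,
                   0.006443, 0.007840, 0.006443, 0.007840],
        "lqdty": [7.023737, 3.221021, 3.475508, 4.615002, 6.678486,
                  4.615002, 6.678486, 4.615002, 6.678486],
        "date": [20210104] * 3 + [20210105] * 4 + [20210106] * 2,
    },
    index=[0, 1, 2, 0, 1, 2, 3, 0, 1],
)

# Group on (position within block, column name) and collect one value per date block.
s = df.stack().groupby(level=[0, 1]).agg(list)
out = pd.DataFrame(s.tolist(), index=s.index.get_level_values(1)).fillna("").T

print(out.shape)  # (3, 16)
```

Each list is filled from the left, so a row of out lines up with a single date only for index positions that occur in every block. In the output above, the last group of row 0 carries the date 20210105, because index 3 exists only in that block.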
Answered By – enke
Answer Checked By – Senaida (BugsFixing Volunteer)