[SOLVED] How to groupby column of release year by decade

Issue

I’m working with a dataframe of Netflix movies. One column holds the release year of each title, and I would like to group the rows by decade.

The column data type is int64 and when I group my df by release year it looks like this:

dates = df.groupby("release_year", as_index=False).count()
dates.sort_values("listed_in", ascending=False).head(10)

    release_year    listed_in
70  2018            1147
69  2017            1032
71  2019            1030
72  2020            953
68  2016            902
73  2021            592
67  2015            560
66  2014            352
65  2013            288
64  2012            237

Now I want to group them by decade. I’ve tried this:

dates.apply(lambda x: (x//10)*10).count()

But it doesn’t work.

What should I do instead?

Thanks in advance!

Solution

Try:

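# Integer-divide the year by 10 so all years in a decade share one key (2018 -> 201).
out = df.groupby(df["release_year"] // 10).count()
out.index.name = "decade"
# Move the key into a column and scale it back to a year (201 -> 2010).
out = out.reset_index().assign(decade=out.index * 10)
print(out)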

With the ten rows shown in the question used as df, this prints:

   decade  release_year  listed_in
0    2010             8          8
1    2020             2          2
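
Two things may be worth adding. First, the attempt in the question floors every column of dates (the counts included) and then counts non-null values per column, so nothing is ever grouped. Second, if dates already holds one row per year, grouping it and calling count() gives the number of years in each decade, not the number of titles; summing the per-year counts is probably what is wanted there. Below is a minimal sketch of both cases. The sample data and the names decade, per_decade, titles and by_decade are made up for illustration and are not taken from the original dataframe.

```python
import pandas as pd

# Hypothetical sample standing in for the Netflix dataframe.
df = pd.DataFrame({
    "release_year": [2012, 2015, 2018, 2019, 2020, 2021, 1999],
    "listed_in": ["Dramas", "Comedies", "Dramas", "Action",
                  "Dramas", "Comedies", "Classics"],
})

# Case 1: raw rows. Floor each year to the start of its decade (2018 -> 2010)
# and count the rows that fall into each decade.
decade = (df["release_year"] // 10 * 10).rename("decade")
per_decade = df.groupby(decade).size().reset_index(name="titles")
print(per_decade)

# Case 2: a table that is already aggregated per year, like dates in the question.
# Here the per-year counts have to be summed, not counted again.
dates = df.groupby("release_year", as_index=False)["listed_in"].count()
by_decade = (
    dates.assign(decade=dates["release_year"] // 10 * 10)
         .groupby("decade", as_index=False)["listed_in"]
         .sum()
)
print(by_decade)
```

Both results should agree on the number of titles per decade as long as listed_in has no missing values, since count() skips NaN while size() does not.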

Answered By – Andrej Kesely

Answer Checked By – Clifford M. (BugsFixing Volunteer)
