[SOLVED] Get Groupname after using groupby.size()

Issue

I am using a Billboard-Charts dataset which looks like this:

[Image: Billboard-Charts DataFrame]

I want to write a function that receives an arbitrary number of artists as parameters. From these artists, I want to determine the one whose songs have been in the charts the longest. I already managed to write the function I wanted, but there is one thing I can't figure out:

How can I get the name of the song that was in the charts the longest? I just can't figure out how to access the group name after using the .size() function.

def determine_most_popular_performer(*performers):
    results = []
    for performer in performers:
        results.append((performer, max(df.loc[df["performer"]==performer].groupby("song").size())))
    return max(results)
    
print(determine_most_popular_performer("Queen", "Prince", "Michael Jackson"))
>> ('Queen', 44)

As output, I would want ('Queen', 'Bohemian Rhapsody', 44).

Solution

You can get the index label of the row with the largest value with .idxmax().

You can then select that row and read its values, as in the changes below. Note that I used .reset_index() to turn the groupby index (the song name) into a regular column.

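def determine_most_popular_performer(*performers):
    results = []
    for performer in performers:
        # Count chart entries per song; reset_index() turns the song name into a column
        df2 = df.loc[df["performer"]==performer].groupby("song").size().reset_index(name="value")
        max_id = df2["value"].idxmax()  # row label of the song with the most entries
        results.append((performer, df2.loc[max_id, "song"], df2.loc[max_id, "value"]))
    # Return after the loop, comparing by count rather than by performer name
    return max(results, key=lambda result: result[2])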
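
As an alternative sketch: .size() returns a Series whose index is the group name, so calling .idxmax() on that Series gives the song title directly, without .reset_index(). The sample DataFrame and the function name longest_charting_song below are made up for illustration, and the sketch assumes each row of the data is one week on the chart.

```python
import pandas as pd

# Hypothetical sample data standing in for the Billboard dataset.
# Assumption: each row represents one week a song spent on the chart.
df = pd.DataFrame({
    "performer": ["Queen", "Queen", "Queen", "Prince", "Prince", "Prince", "Prince"],
    "song": ["Bohemian Rhapsody", "Bohemian Rhapsody", "Under Pressure",
             "Kiss", "Kiss", "Kiss", "1999"],
})

def longest_charting_song(df, *performers):
    """Return (performer, song, weeks) for the song with the most chart rows."""
    results = []
    for performer in performers:
        # Series indexed by song name, with the number of rows per song as values
        weeks = df.loc[df["performer"] == performer].groupby("song").size()
        if weeks.empty:
            continue  # performer does not appear in the data
        song = weeks.idxmax()  # the index label is the group name
        results.append((performer, song, int(weeks.max())))
    # Compare by week count, not alphabetically by performer name
    return max(results, key=lambda result: result[2])

print(longest_charting_song(df, "Queen", "Prince"))
```

With the sample data above, "Kiss" has three rows, so the call returns the Prince entry. If none of the given performers appear in the data, max() raises a ValueError on the empty list, which you may want to handle explicitly.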

Answered By – BrendanA

Answer Checked By – Robin (BugsFixing Admin)
