[SOLVED] How to rank each item at the same location across different lists in python?

Issue

I am an R programmer, but I need to compute rankings in a table using Python. Let's say I have a column "test" containing lists of numbers:

df = pd.DataFrame({"test":[[1,4,7], [4,2,6], [3,8,1]]})

I want to rank each item against the items at the same position in the other rows (lists), then average the ranks in each row to get a final score:

expected:

       test      rank_list    final_score
0   [1, 4, 7]    [1, 2, 3]       2
1   [4, 2, 6]    [3, 1, 2]       2
2   [3, 8, 1]    [2, 3, 1]       2

I know this is not a good example because all the final scores are the same, but with hundreds of rows the results will vary. I hope the question is clear; if not, please feel free to ask.

I don't know whether this can be done in pandas. I tried zip + scipy, but scipy.stats.rankdata did not give me the rank of each item at the same index:

l = list(df["test"])
ranks_list = [scipy.stats.rankdata(x) for x in zip(*l)] # not right
estimated_rank = [sum(x) / len(x) for x in ranks_list]

I am open to using any package, whichever is most convenient. Thank you!

Solution

import numpy as np

# Create a numpy array
a = np.array([[1,4,7], [4,2,6], [3,8,1]])

# rank the values in each column, i.e. at the same position across the lists
# argsort returns the sort order, not the ranks; applying it twice gives
# zero-based ranks, so we add 1
rank_list = np.argsort(np.argsort(a, axis=0), axis=0) + 1

# calculate the average rank of each row
final_score = np.mean(rank_list, axis=1)
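
To get the result back into the DataFrame from the question, one option is scipy.stats.rankdata with axis=0, which returns ranks directly and gives tied values the average of the ranks they span. The following is only a minimal sketch, assuming every list in the "test" column has the same length; the column names rank_list and final_score are taken from the expected output in the question.

```python
import numpy as np
import pandas as pd
from scipy.stats import rankdata

df = pd.DataFrame({"test": [[1, 4, 7], [4, 2, 6], [3, 8, 1]]})

# Stack the lists into a 2-D array: one row per list, one column per position.
# Assumption: all lists have the same length.
a = np.array(df["test"].tolist())

# Rank down each column (axis=0), i.e. compare the items that sit at the
# same position across the lists. Ranks start at 1 and are returned as floats.
ranks = rankdata(a, axis=0)

# Store each row's ranks and their mean back in the DataFrame.
df["rank_list"] = pd.Series(ranks.tolist(), index=df.index)
df["final_score"] = ranks.mean(axis=1)

print(df)
```

Because rankdata averages ties, the ranks are floats (for example 1.0 rather than 1). Cast with ranks.astype(int) only if the data is known to contain no ties.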

Answered By – kwsp

Answer Checked By – David Goodson (BugsFixing Volunteer)
