Issue
I have a query from a mysql table that results in a dataset as follows
car | colour | sold |
---|---|---|
Benz | Black | 2 |
Benz | White | 1 |
BMW | Black | 3 |
BMW | Green | 2 |
BMW | Blue | 1 |
Toyota | Yellow | 7 |
Toyota | Red | 5 |
Toyota | Blue | 3 |
Toyota | White | 1 |
I’m trying to select the top sold row from each subrow, so my final dataset would be:
car | colour | sold |
---|---|---|
Benz | Black | 2 |
BMW | Black | 3 |
Toyota | Yellow | 7 |
Order doesn’t matter i just need to make sure its the top colour sold for that particular car. I tried using distinct, but that applies to the whole row. I tried using group by that selects a random sold amount not the top one. Any idea what query I should be running for this? Is using subqueries inevitable? its a rather small dataset of about ~100 entries. But id rather not use subqueries for future scaling.
Solution
We can use the function RANK()
in a CTE.
See the dbFiddle link at the bottom for the schema and further testing.
NB: This will not function with older version of mySQL.
with rankedSales as ( select car, colour, sold, rank() over (partition by car order by sold desc) rn from carSales) select car, colour, sold from rankedSales where rn = 1;
car | colour | sold :----- | :----- | ---: Benz | Black | 2 BMW | Black | 3 Toyota | Yellow | 7
db<>fiddle here
Answered By – Kendle
Answer Checked By – Willingham (BugsFixing Volunteer)