[SOLVED] MySQL / BigQuery – Weighted Average & Group By

Issue

I am trying to calculate a weighted average of a dataset and return the maximum value, monthly over a period of 12 months along with its’ corrosponding ticket description.

I’m aware that there are tons of questions out there addressing similar problems, but I have yet to find a solution that combines the syntaxes I believe are required.

Here’s some sample table data:

Month_Begin_Date Priority ticket_about_tag Phone_Time Occurances
2019-02-01 Urgent Power Bill 22.42 36
2019-02-01 Normal Power Bill 3.41 89
2019-05-01 Normal Wifi Issue 45.32 12

Here’s my current query for determining the weighted average:

SELECT (Month_Begin_Date, 
(sum(phone_time * occurances))/sum(occurances)) AS Weighted_Average_Phone_Time
FROM database
GROUP BY month_begin_date

This returns the weighted average total for all ticket_about_tags, monthly.

But I still need to get this so that it displays the maximum weighted average grouped by ticket description. I.e. something that looks like this:

Month_Begin_Date ticket_about_tag Weighted_average_phone_time
2019-01-01 Power Bill 22.42
2019-02-01 Power Bill 3.41
2019-03-01 Wifi Issue 45.32

I’ve tried adding this as a subquery into another query in order to return the data I’m after, like so:

SELECT Month_Begin_date, Ticket_About_Tag, Phone_Average_Handle_Time 
FROM database WHERE CONCAT(month_begin_date,phone_time) IN 
(SELECT CONCAT (Month_Begin_Date, 
(sum(phone_time * occurances))/sum(occurances)) AS Weighted_Average_Phone_Time
FROM database
GROUP BY month_begin_date
)
ORDER BY month_begin_date ASC  

Thanks very much for any assistance

Solution

Not sure I got your question right, but using the following data:

Month_Begin_Date Priority Ticket_About_Tag Phone_Time Occurences
2019-02-01 Urgent Power Bill 22.42 36
2019-02-01 Normal Power Bill 3.41 89
2019-05-01 Normal Wifi Issue 45.32 12
2019-02-01 Urgent Wifi Issue 14.2 7
2019-02-01 Normal Wifi Issue 30.7 5

Is this the query you’re after?

SELECT 
Month_Begin_Date, Ticket_About_Tag,
SUM(Phone_Time * Occurences) / SUM(Occurences) AS Weighted_Average_Phone_Time
FROM `database`
GROUP BY Month_Begin_Date, Ticket_About_Tag
ORDER BY Month_Begin_Date ASC, Ticket_About_Tag ASC;

That gives you a result like the one you posted:

Month_Begin_Date Ticket_About_Tag Weighted_Average_Phone_Time
2019-02-01 Power Bill 8.884880083084106
2019-02-01 Wifi Issue 21.075000206629436
2019-05-01 Wifi Issue 45.31999969482422

Response to your comment

To answer your comment you could:

SELECT
a.Month_Begin_Date, 
a.Ticket_About_Tag, 
b.Max_Weighted_Average_Phone_Time
FROM (

SELECT 
Month_Begin_Date,
Ticket_About_Tag,
SUM(Phone_Time * Occurences) / SUM(Occurences) AS Weighted_Average_Phone_Time
FROM `database`
GROUP BY Month_Begin_Date, Ticket_About_Tag

) a

LEFT JOIN (

SELECT
b1.Month_Begin_Date, 
MAX(b1.Weighted_Average_Phone_Time) AS Max_Weighted_Average_Phone_Time
FROM (

SELECT 
Month_Begin_Date,
Ticket_About_Tag,
SUM(Phone_Time * Occurences) / SUM(Occurences) AS Weighted_Average_Phone_Time
FROM `database`
GROUP BY Month_Begin_Date, Ticket_About_Tag

) b1

GROUP BY b1.Month_Begin_Date

) b ON a.Month_Begin_Date = b.Month_Begin_Date

WHERE a.Weighted_Average_Phone_Time = b.Max_Weighted_Average_Phone_Time

That gives you the following output:

Month_Begin_Date Ticket_About_Tag Max_Weighted_Average_Phone_Time
2019-02-01 Wifi Issue 21.075000206629436
2019-05-01 Wifi Issue 45.31999969482422

There are other ways of doing this, but I think this is by far the easiest way to understand without using other SQL constructs. It reflects your need of going through the same data twice, first to aggregate by month and ticket tag, then to find the maximum of the aggregate data by month.

Answered By – ggioffreda

Answer Checked By – Katrina (BugsFixing Volunteer)

Leave a Reply

Your email address will not be published.