Issue
The problem is, I need to calculate average number of pages/hits after reaching the pax
page (including pax hit).
The database is:
CREATE TABLE search (
SESSION_ID INTEGER,
HIT_NUMBER INTEGER,
PAGE VARCHAR(24),
MEDIUM_T VARCHAR(24)
);
INSERT INTO search
(SESSION_ID, HIT_NUMBER, PAGE, MEDIUM_T)
VALUES
('123', '1', 'home', 'direct'),
('123', '2', 'flights_home', 'direct'),
('123', '3', 'results', 'direct'),
('456', '1', 'pax', 'metasearch'),
('789', '1', 'home', 'partners'),
('789', '2', 'flights_home', 'partners'),
('789', '3', 'results', 'partners'),
('789', '4', 'home', 'partners'),
('146', '1', 'results', 'SEM'),
('146', '2', 'pax', 'SEM'),
('146', '3', 'payment', 'SEM'),
('146', '4', 'confirmation', 'SEM');
And my approach is:
SELECT s1.SESSION_ID, COUNT(*) as sCOUNT
FROM search s1
WHERE PAGE = 'pax'
GROUP BY s1.SESSION_ID
UNION ALL
SELECT 'Total AVG', AVG(a.sCOUNT)
FROM (
SELECT COUNT(*) as sCOUNT
FROM search s2
GROUP BY s2.SESSION_ID
) a
Obviously the 3r line is wrong, my code misses the part in which after ‘pax’ is shown starts counting and I don’t have any clue for that.
Thank you in advanced 🙂
Solution
Finding all pax
pages and the ones after it could be done with exists
. Rest is straight forward:
SELECT AVG(hits)
FROM (
SELECT session_id, COUNT(*) AS hits
FROM search AS s1
WHERE page = 'pax' OR EXISTS (
SELECT *
FROM search AS s2
WHERE s2.session_id = s1.session_id
AND s2.hit_number < s1.hit_number
AND s2.page = 'pax'
)
GROUP BY session_id
) AS x
If using MySQL 8 then window functions provide a simpler solution:
WITH cte1 AS (
SELECT session_id, MAX(CASE WHEN page = 'pax' THEN 1 END) OVER (
PARTITION BY session_id
ORDER BY hit_number
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS countme
FROM search
), cte2 as (
SELECT COUNT(*) AS hits
FROM cte1
WHERE countme IS NOT NULL
GROUP BY session_id
)
SELECT AVG(hits)
FROM cte2
Answered By – Salman A
Answer Checked By – Terry (BugsFixing Volunteer)