Issue
I have a database with the following relationship:
relationship
These tables link a customer to an address:
Cliente = client
Clientenendereco = Many to many table between Endereco (table) and Cliente (table)
Endereco = address
Endereco_rua = address_street
Endereco_cidade = address_city
Endereco_estado = address_state
The cardinality is:
Cliente NxN Endereco
Endereco 1×1 Endereco_rua
Endereco_rua 1×1 Endereco_cidade
Endereco_cidade 1×1 Endereco_state
And the query I use is:
SELECT cli.cliente_id, cli.cliente_cpf, cli.cliente_nome, TIMESTAMPDIFF(YEAR,
cli.cliente_datanasc, CURDATE()) AS idade, ende.endereco_cep, ende.endereco_numero,
ben.beneficio_numero,
ben.beneficio_codigo, ben.beneficio_valor, ban.banco_cod, ab.banco_agencia_numero,
cbb.conta_banco_numero, ben.beneficio_margem_emprestimo, ben.beneficio_margem_cartao
FROM cliente AS cli
RIGHT JOIN telefonencliente AS tnc FORCE INDEX(`fk_telefone_has_cliente_cliente1_idx`)
ON tnc.cliente_id = cli.cliente_id
RIGHT JOIN telefone AS tel
ON tel.telefone_id = tnc.telefone_id
INNER JOIN beneficio AS ben
ON ben.cliente_id = cli.cliente_id
LEFT JOIN clientenendereco AS ce
ON ce.cliente_id = cli.cliente_id
LEFT JOIN endereco AS ende
ON ende.endereco_id = ce.endereco_id
LEFT JOIN endereco_rua AS er
ON er.endereco_rua_id = ende.endereco_rua_id
LEFT JOIN endereco_cidade AS ec
ON ec.endereco_cidade_id = er.cidade_id
LEFT JOIN endereco_estado AS ee
ON ee.endereco_estado_id = ec.estado_id
LEFT JOIN banco_agencia AS ab
ON ab.banco_agencia_id = ben.fk_banco_agencia_id
LEFT JOIN conta_banco AS cbb
ON cbb.conta_banco_id = ben.fk_conta_banco_id
LEFT JOIN banco AS ban
ON ban.banco_id = ben.fk_banco_id
WHERE ee.endereco_estado_id = 1 AND cli.cliente_id IS NOT NULL AND cli.cliente_id NOT
IN(SELECT ci.cliente_id FROM cliente_inativo AS ci WHERE ci.cliente_id = cli.cliente_id)
AND
cli.cliente_id NOT IN (SELECT ma.cliente_id FROM mailing_alocado AS ma WHERE
ma.cliente_id = cli.cliente_id) AND TIMESTAMPDIFF(YEAR, cli.cliente_datanasc,
CURDATE()) >=18 AND TIMESTAMPDIFF(YEAR, cli.cliente_datanasc, CURDATE()) <=68 AND
(ben.beneficio_datainicio BETWEEN '1990-01-01' AND '2007-12-31')
GROUP BY cli.cliente_id
LIMIT 20000
When I use "ee.endereco_estado_id = 1" the query takes 2 minutes to return the records
On the other hand if I remove "ee.endereco_estado_id = 1" the query takes 0.068 seconds
Here with Explain:
With "ee.endereco_estado_id = 1"
Without "ee.endereco_estado_id = 1"
I noticed that without the "ee.endereco_estado_id = 1" It starts with the table "ben", which is good, however, with the "ee.endereco_estado_id = 1" It starts with the table "ee" which takes a long time.
I don’t know what to do anymore, because I need the records that are in state number 1, please help me I don’t know what I can do or where is my error.
Tables:
endereco ~ 3milions rows
endereco_rua ~ 4milions rows
endereco_cidade ~ 21thousand rows
endereco_estado ~ 26 rows
Solution
Let’s try turning the query inside out to avoid the GROUP BY
, which is probably causing some of the performance hangup.
SELECT ... ((lots of columns))
FROM (
SELECT cliente_id
FROM client
((with the minimal JOINs and WHEREs to get the ids))
ORDER BY .. LIMIT .. -- if needed
) AS ids
JOIN client AS cli ON cli.cliente_id = ids.cliente_id
JOIN (( all the other tables and WHEREs ))
ORDER BY .. -- if needed; note: the inner sort will be lost
I assume you indexes on the various columns in the WHERE
clauses? If not, please provide SHOW CREATE TABLE
and EXPLAIN SELECT...
Please do not use LEFT
when the ‘right’ table is not optional. Example: ee.endereco_estado_id = 1
Please say which JOINs are 1:1 vs 1:many vs many:1. I need to understand whether the joining "explodes" the number of rows (only to have the GROUP BY
implode them). My suggested rewrite assumes that is happening.
Answered By – Rick James
Answer Checked By – Senaida (BugsFixing Volunteer)