Being pretty new to database optimization in systems with high load I have the following question – lets assume we have the following query (query is with sample data):
SELECT * FROM ticket WHERE ticket_status='draft' AND user_id='789437879' ORDER BY ticket_id DESC LIMIT 0, 15
We already have the following indizes:
CREATE INDEX ticket_status on ticket(ticket_status); CREATE INDEX user_id on ticket(user_id); CREATE INDEX ticket_id on ticket(ticket_id);
Would there be a significant performance benefit for optimizing this query if we do the following:
CREATE INDEX make_that_query_more_efficient on ticket(user_id,ticket_status);
Or does it make barely any difference as all the columns are indexed anyways?
It depends on the query. But definitely in your example.
In many queries, it makes a huge difference.
Here is one discussion on such: http://mysql.rjweb.org/doc.php/index1 . It discusses various indexing strategies on a simple query. The conclusion is that a composite index is clearly optimal for the query.
Meanwhile, there are dozens, maybe hundreds, of examples in this forum where my solution to a performance problem (high CPU, slow query, etc) is primarily to replace a single-column index with a multi-column index. The performance gain is sometimes spectacular when
ORDER BY, and
LIMIT can all be handled by the
A many-to-many mapping table (‘junction’) is a very common schema pattern that novices fail to index properly. More on it: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
For your query, this would make a world of difference:
INDEX(user_id, ticket_status, -- these two can be in either order ticket_id) -- this needs to be last
The execution will quickly drill into the Index’s BTree to the row(s) with
ticket_status='draft' AND user_id='789437879'. It will start at the end of such items and scan backward (
DESC), picking up 15 (or fewer) items. Then it will look up the other columns (
*) and deliver them.
Almost any other index would require scanning more than just 15 items.
As for your indexes.
ticket_idis already the
PRIMARY KEY, do not add
INDEX(ticket_id); it will be useless.
If any of your indexes is a prefix of the index I recommend, DROP it; it will be redundant and in the way. (Use
EXPLAIN SELECTto see this.)
If the Optimizer picked your
(ticket_status), it would look at all the entries with desired status, filter based on
user_id, sort the results, then peel off 15 rows.
If the Optimizer were to use
INDEX(ticket_id), it would start at the end of the ids and work backward. If there were not 15 relevant rows, it would not stop until it scanned the entire table.
Note that my composite index even avoids the sort.
The rest of the indexes may or may not be useful; it depends on whether other queries can use them.
A suitable index is likely to be much more beneficial to a
SELECTthan a burden on
INSERTs; so don’t worry about this tradeoff.
user_idmay be useful to other queries; starting with
statusseems less likely. The single-column
INDEX(ticket_status)is unlikely to be used by any query due to "cardinality".
My 3-column index is likely to be much better than your similar 2-column index. Mine takes care of the
ORDER BYand the
LIMIT; yours needs to gather lots of rows and sort them.
If there are big
TEXTcolumns in the
*, the performance difference may be more dramatic.
Answered By – Rick James
Answer Checked By – Gilberto Lyons (BugsFixing Admin)