Issue
I have a collection, collection1, with documents like this:
{
  _id: 123,
  field1: "test",
  array1: [
    {
      array2: [
        {
          field2: 1,
          object1: {
            field3: "test"
          }
        }
      ]
    }
  ]
}
I am trying to get all the documents from the collection filtered by the fields field1, field2, and field3. The query I have looks like:
db.collection1.find({
  field1: "test",
  array1: {
    $elemMatch: {
      array2: {
        $elemMatch: {
          field2: { $gte: 1 },
          "object1.field3": "test"
        }
      }
    }
  }
})
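To make the filter's intent explicit: the nested $elemMatch predicate only matches a document when a single array2 element satisfies both conditions at once. Here is a minimal plain-JavaScript sketch of that semantics (the helper name matchesElemMatch is mine, not a MongoDB API):

```javascript
// Sketch of the nested $elemMatch semantics: the document matches only if
// some element of array1 contains some element of array2 that satisfies
// BOTH conditions simultaneously.
function matchesElemMatch(doc) {
  return (
    doc.field1 === "test" &&
    (doc.array1 || []).some((a1) =>
      (a1.array2 || []).some(
        (a2) => a2.field2 >= 1 && a2.object1 && a2.object1.field3 === "test"
      )
    )
  );
}

// The sample document from above satisfies the predicate.
const doc = {
  _id: 123,
  field1: "test",
  array1: [{ array2: [{ field2: 1, object1: { field3: "test" } }] }],
};
console.log(matchesElemMatch(doc)); // true
```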
This collection has ~125,000 documents. Given that the query has to scan through two nested arrays to filter, one would expect it to be slow, and it is: it takes around 30-40 seconds.
So, to improve its performance, I created an index on all three fields:
db.collection1.createIndex({
  "array1.array2.object1.field3": 1,
  "array1.array2.field2": 1,
  field1: 1
});
Using the index, the query is twice as fast, taking ~15 seconds. However, that is still too slow; I want to get the query under 5 seconds. Any ideas on how I can improve the speed? In case it helps, I can add the query planner output for both queries (with and without the index).
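For anyone wanting to reproduce the comparison, the plans shown below can be obtained in the mongo shell with cursor.explain; a typical invocation with the same filter as above would be:

db.collection1.find({
  field1: "test",
  array1: { $elemMatch: { array2: { $elemMatch: {
    field2: { $gte: 1 },
    "object1.field3": "test"
  } } } }
}).explain("executionStats")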
EDIT: I tried all six possible orderings of the fields in the index, and they all gave the same results. So I paid closer attention to the query planner and execution stats, and I noticed something:
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "db.collection1",
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"indexName" : "fields_index"
}
}
},
"executionStats" : {
"executionSuccess" : true,
"executionTimeMillis" : "15602.784",
"planningTimeMillis" : "0.248",
"executionStages" : {
"stage" : "FETCH",
"nReturned" : "0",
"executionTimeMillisEstimate" : "15602.130",
"inputStages" : [
{
"stage" : "IXSCAN",
"nReturned" : "300220",
"executionTimeMillisEstimate" : "87.616",
"indexName" : "fields_index"
},
{
"nReturned" : "0",
"executionTimeMillisEstimate" : "0.018"
}
]
}
},
"serverInfo" : {
"host" : "mongo-instance",
"port" : 27017,
"version" : "3.6.0"
},
"ok" : 1
It seems that the FETCH stage is the one taking extremely long, not the index scan. Why is that? Also, with the parameters I am using, the query should return no results. The FETCH stage does return 0 results, but the index scan returns 300220 entries. Why?
Solution
I found out the issue. Something I didn't mention in my original question is that I am using AWS' DocumentDB service, which has MongoDB compatibility. According to this, under the section "$ne, $nin, $nor, $not, $exists, and $elemMatch Indexing", DocumentDB does not support using indexes with $elemMatch. The only reason my query was using the index at all is that it used it for field1, which is not under $elemMatch. The index did not help with the other two fields, so the server still had to fetch and scan hundreds of thousands of entries and filter them by field2 and field3.
The way I fixed it was by rewriting my query. According to the MongoDB documentation, I don't actually need $elemMatch for this query. My query now looks like:
db.collection1.find({
  field1: "test",
  "array1.array2.field2": { $gte: 1 },
  "array1.array2.object1.field3": "test"
})
The query is functionally equivalent for my use case (note that without $elemMatch the two conditions no longer have to be satisfied by the same array2 element, so this rewrite is only safe when that distinction doesn't matter), but this way it actually uses the index. The query now takes under 1 second to run. Thanks for all the help and great suggestions from everyone!
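A caveat worth spelling out (my observation, not part of the original answer): with dotted paths, each condition only needs to be satisfied by some array2 element, not necessarily the same one, whereas $elemMatch requires a single element to satisfy both. A plain-JavaScript sketch of the two semantics (helper names are mine):

```javascript
// Dotted-path semantics: each condition only needs SOME matching array
// element, not necessarily the same one.
function matchesDottedPath(doc) {
  const elems = (doc.array1 || []).flatMap((a1) => a1.array2 || []);
  return (
    doc.field1 === "test" &&
    elems.some((a2) => a2.field2 >= 1) &&
    elems.some((a2) => a2.object1 && a2.object1.field3 === "test")
  );
}

// $elemMatch semantics: one array2 element must satisfy both conditions.
function matchesElemMatch(doc) {
  return (
    doc.field1 === "test" &&
    (doc.array1 || []).some((a1) =>
      (a1.array2 || []).some(
        (a2) => a2.field2 >= 1 && a2.object1 && a2.object1.field3 === "test"
      )
    )
  );
}

// A document where the conditions hold in DIFFERENT array2 elements:
// the dotted-path query matches it, the $elemMatch query does not.
const split = {
  field1: "test",
  array1: [
    { array2: [{ field2: 5 }, { field2: 0, object1: { field3: "test" } }] },
  ],
};
console.log(matchesDottedPath(split)); // true
console.log(matchesElemMatch(split)); // false
```

If that difference matters for your data, the rewrite trades exactness for index usage; for the data in this question it returns the same results.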
Answered By – Tiago Silva
Answer Checked By – Timothy Miller (BugsFixing Admin)