Optimizing MongoDB Compound Indexes - The "Equality - Sort - Range" (ESR) Rule

Working in Technical Services at MongoDB, I find that time and again customers need assistance understanding why the operations they’ve created indexes for may not be performing optimally. When providing supplementary documentation, the go-to article is “Optimizing MongoDB Compound Indexes” by MongoDB’s A. Jesse Jiryu Davis; however, we do not have a formal public reference yet (though DOCS-11790 exists to track this effort).

I’ve presented this topic at MongoDB.local Toronto 2019 (in “Tips and Tricks for Effective Indexing”) and at MongoDB World 2019 (in “The Sights (and Smells) of a Bad Query”). My colleague Chris Harris has also covered this topic at MongoDB World 2019 (in “Tips and Tricks++ for Querying and Indexing MongoDB”) and again at MongoDB.local Houston 2019, for which a video is available.

Though we have Jesse’s excellent (and still applicable and valid) article from 2012, I wanted to take this opportunity to collect some thoughts on this topic based on his work and previous presentations.

The ESR “Rule”

The ordering of index keys in a compound index is critically important, and the ESR “Rule” can be used as a rule of thumb to identify the optimal order in most cases.

The reason “Rule” is in quotation marks is that, though the guidance is applicable in most cases, there are exceptions to be aware of. These exceptions are covered in greater detail in my “Tips and Tricks for Effective Indexing” presentation.
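
As a quick illustration (using a hypothetical orders collection, not an example from the original article): if a query matches on status exactly, sorts on order_date, and applies a range condition on qty, the ESR ordering suggests placing the equality field first, the sort field second, and the range field last:

// Hypothetical compound index following the ESR ordering:
// Equality (status), then Sort (order_date), then Range (qty)
db.orders.createIndex({ status: 1, order_date: -1, qty: 1 });

// A query this index can service efficiently
db.orders.find({ status: "shipped", qty: { $gte: 5 } }).sort({ order_date: -1 });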

The “Rules”

Read on →

Working around MongoDB Stitch's "max async work queue" limit

MongoDB Stitch is a great way to build apps quickly with your data that’s already managed by MongoDB Atlas. Though these services empower you to focus on development without having to worry about infrastructure, as with any managed service there are occasionally limitations imposed by the vendor.

This article summarizes why this limit exists, as well as how to adapt your MongoDB Stitch Functions to work around it.

Read on →

Identifying and Reclaiming Disk Space in MongoDB

A common question when it comes to MongoDB and its default storage engine (WiredTiger) is: “Why didn’t my free space increase after I removed a bunch of documents?”

The WiredTiger storage engine maintains lists of empty records in its data files as it deletes documents. This space can be reused by WiredTiger, but it is not returned to the operating system except under very specific circumstances.

The amount of empty space available for reuse by WiredTiger is reflected in the output of db.collection.stats() under the heading wiredTiger.block-manager.file bytes available for reuse.

To allow the WiredTiger storage engine to release this empty space to the operating system, you can defragment your data files. This can be achieved using the compact command.
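
For example (a minimal sketch against a hypothetical collection named foo), you can check how much space is reusable and then compact that collection:

// Check how many bytes WiredTiger can reuse for a single collection
var stats = db.foo.stats();
print(stats.wiredTiger["block-manager"]["file bytes available for reuse"]);

// Defragment the collection's data file (compact may block operations;
// review the documentation for your MongoDB version before running it in production)
db.runCommand({ compact: "foo" });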

As the db.collection.stats() command must be run one collection at a time, I’ve written a script that enhances this functionality as follows (a simplified sketch of the core loop appears after the list):

  • scan all namespaces (databases + collections)
  • include index space details
  • support for sharded collections
  • output to CSV
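
For reference, a minimal sketch of that core loop (not the full script, which adds index details, sharded collection support, and CSV output; error handling for views and special collections is omitted):

// Minimal sketch: iterate every database and collection, printing the
// number of bytes WiredTiger can reuse for each namespace
db.getMongo().getDBNames().forEach(function(dbName) {
  var database = db.getSiblingDB(dbName);
  database.getCollectionNames().forEach(function(collName) {
    var stats = database.getCollection(collName).stats();
    if (stats.wiredTiger) {
      var reusable = stats.wiredTiger["block-manager"]["file bytes available for reuse"];
      print(dbName + "." + collName + ": " + reusable + " bytes reusable");
    }
  });
});
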
Read on →

MongoDB Initial Sync Progress Monitoring

Sometimes a replica set member falls off the oplog and needs to be resynced. When this happens, an Initial Sync is required, which does the following:

  1. Clones all databases except the local database. To clone, the mongod scans every collection in each source database and inserts all data into its own copies of these collections.
  2. Applies all changes to the data set. Using the oplog from the source, the mongod updates its data set to reflect the current state of the replica set.

When the initial sync finishes, the member transitions from STARTUP2 to SECONDARY.

Some common questions when performing an initial sync of a Replica Set Member are:

  • How do I know if the sync is progressing? (one approach is sketched after this list)
  • How long will this take to complete?
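
One rough way to answer the first question (not necessarily the approach taken in the full post) is to ask the syncing member for its initial sync status. Depending on your MongoDB version, replSetGetStatus can report an initialSyncStatus section; older releases require an explicit initialSync option, while newer releases include it automatically during a sync, so check the documentation for your version:

// Report replica set status, including initial sync progress where available
// (on some releases you may need to pass initialSync: 1 to replSetGetStatus
// to populate this field; consult the documentation for your version)
db.adminCommand({ replSetGetStatus: 1 }).initialSyncStatus
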
Read on →

What is MongoDB FTDC (aka. diagnostic.data)

Full Time Diagnostic Data Capture (FTDC) was introduced in MongoDB 3.2 (via SERVER-19585) to incrementally collect the results of certain diagnostic commands in order to assist MongoDB support with troubleshooting issues.

On log rotation or startup, a mongod or mongos will collect and log:

As configured by diagnosticDataCollectionPeriodMillis (defaulting to every 1 second), FTDC will collect the output of the following commands:

When FTDC is enabled (per diagnosticDataCollectionEnabled), the metrics.xxxxxxx files will be stored in diagnosticDataCollectionDirectoryPath, which by default is the diagnostic.data directory within the storage.dbPath for a mongod (for a mongos, the directory is derived from the systemLog.path).
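
As a quick sketch (these are server parameters, and on the versions I’ve used they can be read back at runtime via getParameter, though you should verify against your release), you can inspect the current FTDC settings on a running process:

// Inspect the current FTDC configuration on a running mongod/mongos
db.adminCommand({
  getParameter: 1,
  diagnosticDataCollectionEnabled: 1,
  diagnosticDataCollectionPeriodMillis: 1,
  diagnosticDataCollectionDirectoryPath: 1
})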

With SERVER-21818 (introduced in MongoDB 3.2.13) and SERVER-31400 (introduced in MongoDB 3.4.16) the diagnostic data capture scope was broadened to include not only internal diagnostic commands but also system metrics. Depending on the host operating system, the diagnostic data may include one or more of the following statistics:

The metrics.xxxxxxx files in the diagnostic.data directory contain only statistics about the performance of the system and the database. They are stored in a compressed format, and are not human-readable.

Just a quick note regarding privacy: regardless of the version, the data in diagnostic.data never contains:

  • Samples of queries, query predicates, or query results
  • Data sampled from any end-user collection or index
  • System or MongoDB user credentials or security certificates

FTDC data contains certain host machine information such as hostnames, operating system information, and the options or settings used to start the mongod or mongos. This information may be considered protected or confidential by some organizations or regulatory bodies, but is not typically considered to be Personally Identifiable Information (PII).

If you want to have a closer look at the diagnostic data collection process, you can inspect the FTDC code.

FTDC Structure

Read on →

Troubleshooting and Fixing Invariant Failure !_featureTracker on MongoDB Startup

I recently found myself troubleshooting another MongoDB startup issue due to potential corruption within a WiredTiger file. As I have previously covered this topic (see “Recovering a WiredTiger collection from a corrupt MongoDB installation”), I wanted to share the diagnostic and troubleshooting journey in case it helps anyone who experiences this issue in the future.

To ensure I could troubleshoot this issue in isolation, I first collected a backup of the necessary files from the affected installation as follows:

tar -czvf metadata.tar.gz --exclude=WiredTigerStat* WiredTiger* _mdb_catalog.wt sizeStorer.wt

Once I had this backup, I extracted it to a new location and then, using m to select the version of MongoDB to run, tried to start up a standalone instance to see if I could reproduce the issue:

mkdir -p /tmp/repro
cd /tmp/repro
# move archive from earlier to the new directory first
tar xvf metadata.tar.gz
# This is the version of MongoDB reported to be crashing
m 3.4.18
mongod --dbpath .

Once the mongod started, we were able to see the failure and the process abort (clipped log sample below).

2020-01-23T03:58:19.828-0500 I CONTROL  [initandlisten] db version v3.4.18
2020-01-23T03:58:19.828-0500 I CONTROL  [initandlisten] git version: 4410706bef6463369ea2f42399e9843903b31923
...
2020-01-23T03:58:20.187-0500 I -        [initandlisten] Invariant failure !_featureTracker src/mongo/db/storage/kv/kv_catalog.cpp 305
2020-01-23T03:58:20.187-0500 I -        [initandlisten]

***aborting after invariant() failure

2020-01-23T03:58:20.198-0500 F -        [initandlisten] Got signal: 6 (Aborted).
...
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x55bb45c92111]
 mongod(+0x153F329) [0x55bb45c91329]
 mongod(+0x153F80D) [0x55bb45c9180d]
 libpthread.so.0(+0x12890) [0x7f5b7bee5890]
 libc.so.6(gsignal+0xC7) [0x7f5b7bb20e97]
 libc.so.6(abort+0x141) [0x7f5b7bb22801]
 mongod(_ZN5mongo17invariantOKFailedEPKcRKNS_6StatusES1_j+0x0) [0x55bb44f5b234]
 mongod(_ZN5mongo9KVCatalog4initEPNS_16OperationContextE+0x568) [0x55bb458db5e8]
 mongod(_ZN5mongo15KVStorageEngineC1EPNS_8KVEngineERKNS_22KVStorageEngineOptionsE+0x807) [0x55bb458e79f7]
 mongod(+0x124DFFA) [0x55bb4599fffa]
 mongod(_ZN5mongo20ServiceContextMongoD29initializeGlobalStorageEngineEv+0x697) [0x55bb45891627]
 mongod(+0x7F62AC) [0x55bb44f482ac]
 mongod(main+0x96B) [0x55bb44f66a6b]
 libc.so.6(__libc_start_main+0xE7) [0x7f5b7bb03b97]
 mongod(+0x86FFB1) [0x55bb44fc1fb1]
-----  END BACKTRACE  -----
Aborted (core dumped)
Read on →

Current Date Math in MongoDB Aggregations


A challenge that I’ve had in the past while working with my data in MongoDB has been how to incorporate date math into my aggregations.

db.foo.insertMany([
{ lastUpdated: new Date(new Date().setDate(new Date().getDate() - 1)) },
{ lastUpdated: new Date(new Date().setDate(new Date().getDate() - 5)) },
{ lastUpdated: new Date(new Date().setDate(new Date().getDate() - 9)) }
]);
db.foo.find();
/*
{ "_id" : ObjectId("5e219c6ecc99b35bb2975d9e"), "lastUpdated" : ISODate("2020-01-16T11:37:18.522Z") }
{ "_id" : ObjectId("5e219c6ecc99b35bb2975d9f"), "lastUpdated" : ISODate("2020-01-12T11:37:18.522Z") }
{ "_id" : ObjectId("5e219c6ecc99b35bb2975da0"), "lastUpdated" : ISODate("2020-01-08T11:37:18.522Z") }
*/

Given the 3 documents we’ve set up above, if I wanted to filter a pipeline to only $match documents that are newer than 1 week old, I would have to resort to using JavaScript:

// compare lastUpdated to a new Javascript Date object set to
// 7 days from the current date
db.foo.aggregate(
{ $match:
  { lastUpdated: { $gte: new Date(new Date().setDate(new Date().getDate() - 7)) } }
});
/*
{ "_id" : ObjectId("5e219c6ecc99b35bb2975d9e"), "lastUpdated" : ISODate("2020-01-16T11:37:18.522Z") }
{ "_id" : ObjectId("5e219c6ecc99b35bb2975d9f"), "lastUpdated" : ISODate("2020-01-12T11:37:18.522Z") }
*/

Now, if your pipeline is running in a non-JavaScript environment, the new Date() call within the pipeline would likely throw an exception.

If you’re working with MongoDB 4.2 or newer though, a new $$NOW aggregation variable is available that can be combined with existing pipeline operators to $subtract the number of milliseconds in the desired number of days from the current date:

// compare lastUpdated to the number of milliseconds in
// 7 days subtracted from the current date
db.foo.aggregate(
{ $match:
  { $expr:
    { $let:
      { vars:
        { start:
          { $subtract: ["$$NOW", (7 * 86400000)] }
        },
        in: { $gte: ["$lastUpdated", "$$start"] }
      }
    }
  }
});
/*
{ "_id" : ObjectId("5e219c6ecc99b35bb2975d9e"), "lastUpdated" : ISODate("2020-01-16T11:37:18.522Z") }
{ "_id" : ObjectId("5e219c6ecc99b35bb2975d9f"), "lastUpdated" : ISODate("2020-01-12T11:37:18.522Z") }
*/
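
As an aside, the $let here is mainly for readability; the same filter can be expressed more compactly (again assuming MongoDB 4.2+ for $$NOW):

// Equivalent filter without $let: documents updated within the last 7 days
db.foo.aggregate([
  { $match:
    { $expr:
      { $gte: ["$lastUpdated", { $subtract: ["$$NOW", 7 * 86400000] }] }
    }
  }
]);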

I hope you find this as useful as I did. With each major release of MongoDB, new features and functionality are introduced that reduce the “hacks” and “workarounds” we’ve had to rely on in the past.

If you’re looking for more MongoDB tips and tricks, head on over to Asya’s Stupid Tricks With MongoDB.

Let me know in the comments below if you have any questions, or if you found this useful.


Technical Services Engineering at MongoDB

The goal of this post is to provide a first-hand account of what it means to be a Technical Services Engineer at MongoDB, as well as what the journey getting to this point has looked like for me.

WHO AM I?

I have been working in Application Development and Software Engineering for nearly two decades. I started off writing desktop applications in QuickBASIC and Turbo Pascal, then eventually in VB6, VB.NET, C++ and C#. When it was time to shift focus to web development, I started with HTML/JS/CSS (as we all do :P), then moved on to Flash/AS3, Flex, Python, Ruby/Rails and Node.js.

I have been writing software since I was a kid, starting with some automation tools for my mom’s business. I then moved on to building tools to help me cheat at various games I was playing at the time, and eventually got more into emulator programming and reverse engineering. I guess you could say I’ve always loved solving problems programmatically, and especially enjoyed identifying opportunities for automation and custom tooling.

This led me down an informal DevOps track, as I was finding there was a need for optimization in the infrastructure layers my applications were deployed to. That work pulled me deeper into Linux internals, system administration and network operations.

While I was gaining these new skill-sets my primary focus was always on application development and delivery. Before coming to MongoDB I was working as a Development Lead / System Architect, but I found that my focus was always being drawn back to solving performance challenges at the infrastructure level.

Read on →

Hello MongoDB


As of August 13th, I am no longer a System Architect at DAC Group. I have a public post on LinkedIn that got some good traction, but to summarize: it was time to move on.

I’ve been a software engineer in some capacity or another for nearly 20 years now. The position I’ve taken is as a Technical Services Engineer, which is more of a support role than an active development role.

The decision to make this move wasn’t made lightly. I’ve been working hands-on with code or overseeing a team of developers on a day-to-day basis for most of my professional career, and this was no different in my role as a System Architect.

In that role, I was still committing code on a nearly daily basis. If not, I was performing code review or working on the design for a new system or solution. I would consider all of this to still be “hands-on”, though I had found myself mired in DevOps work a lot more than I would have liked (there were not enough Linux sysadmins available to assist with the type of server operations oversight that was required).

The role at MongoDB isn’t a traditional “Tech Support” type of role, as it requires a strong knowledge of networking, databases, system design, programming and client services. I’ve been a fan of the MongoDB server for over 8 years now, and have brought it along with me to several new consulting opportunities as well as the full-time jobs I’ve held. I believe very strongly in the quality of this product, as well as the peripheral products that they’ve developed.

I think the time has come for a new adventure. This is the first step towards a new career journey with a new company, as opposed to an incremental move upwards within the same professional space.


Troubleshooting a MongoDB Performance Issue

UPDATE (2018-06-28): I actually sent a link to this article to the author of the previous blog post and in her reply she indicates that the improvements to cache management and checkpoint areas were more likely to have improved my situation. Just wanted to call out how approachable the MongoDB team is even with these one-off type issues :). Thanks Sue!

UPDATE (2018-06-21): As we were running MongoDB 3.0.15 while all these issues were going on, it’s entirely possible that the optimizations made to the write-ahead log of WiredTiger may have also contributed to this improvement in performance :)

The following is an edited excerpt from an email I sent out internally about an intermittent performance issue we had been experiencing for several years. The daily processing challenges revolved around running server-side JavaScript in order to produce daily reports. As our data ingestion rates rose and our data processing needs climbed, our server performance continued to degrade. This would occur regardless of the size of the VMs we would spin up.

Postmortem

Our MongoDB cluster is configured with three (3) servers: 1x primary (write-enabled) and 2x secondaries (read-only). These are running on Azure DS14v2 VMs with 8TB of storage (8x 1TB striped via LVM as these were the largest premium SSD-based data disks available at the time).

Aside from the servers being scaled up periodically, this configuration has been constant since the inception of the product.

The only major upgrade came in the form of a migration from 2.6 to 3.0 in 2015. At the time this was a major shift as it required rewriting a number of the underlying system scripts as well as introducing LRS-based storage to try and squeeze some additional performance out of the disks. Why optimize for IOPS? Because the reporting platform was designed to copy a lot of data back and forth in order to generate reports segmented by dimension (“Group”, “Company”, “Country”, “State”, “City”).

This chart (48 hours sampled from 1 week ago) shows both Cache Usage and Replication Lag spiking. The cache spikes occur as new writes trigger index activity, which invalidates (dirties) cached memory and causes cache eviction.

Read on →