Last Updated: Feb 14, 2020
Full Time Diagnostic Data Capture (FTDC) was introduced in MongoDB 3.2 (via SERVER-19585) to incrementally collect the results of certain diagnostic commands and so assist MongoDB support with troubleshooting.
On log rotation or startup, a
mongos will collect and log:
As configured by diagnosticDataCollectionPeriodMillis, which defaults to every 1 second, FTDC will collect the output of the following commands:
local.oplog.rs collection (mongod only)
When FTDC is enabled (per
metrics.xxxxxxx files will be stored in
diagnosticDataCollectionDirectoryPath which by default is the diagnostic.data directory within the
With SERVER-21818 (introduced in MongoDB 3.2.13) and SERVER-31400 (introduced in MongoDB 3.4.16) the diagnostic data capture scope was broadened to not only include internal diagnostic commands but system metrics as well. Depending on the host operating system, the diagnostic data may include one or more of the following statistics:
- CPU utilization (ex:
- Memory utilization (ex:
- Disk utilization related to performance (ex:
- Network performance statistics (
metrics.xxxxxxx files in the diagnostic.data directory contain only statistics about the performance of the system and the database. They are stored in a compressed format and are not human-readable.
A quick note on privacy. Regardless of the version, the data in diagnostic.data never contains:
- Samples of queries, query predicates, or query results
- Data sampled from any end-user collection or index
- System or MongoDB user credentials or security certificates
FTDC data contains certain host machine information such as hostnames, operating system information, and the options or settings used to start the
mongos. This information may be considered protected or confidential by some organizations or regulatory bodies, but is not typically considered to be Personally Identifiable Information (PII).
If you want to have a closer look at the diagnostic data collection process, you can inspect the FTDC code.
Each document is made up of an
type and either a
data field. The type field is used to identify the document type:
- 0: Metadata Document
- 1: Metric Chunk
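To make the two document types concrete, here is a minimal Ruby sketch that classifies an already-decoded document by its type field and picks the field that carries its payload (doc for metadata, data for metric chunks). The hashes are hypothetical stand-ins rather than real FTDC output, and the helper names `ftdc_kind` and `ftdc_payload` are invented for illustration.

```ruby
# Names for the two document types listed above.
FTDC_KINDS = { 0 => :metadata, 1 => :metric_chunk }.freeze

# Field that carries the payload for each kind:
# metadata documents use "doc", metric chunks use "data".
PAYLOAD_FIELD = { metadata: 'doc', metric_chunk: 'data' }.freeze

# Map a decoded document to a symbolic kind (:unknown for any other type).
def ftdc_kind(doc)
  FTDC_KINDS.fetch(doc['type'], :unknown)
end

# Return the payload for a known kind, or nil for an unknown one.
def ftdc_payload(doc)
  field = PAYLOAD_FIELD[ftdc_kind(doc)]
  field && doc[field]
end

# Hypothetical, already-decoded documents (not real FTDC output).
docs = [
  { 'type' => 0, 'doc'  => { 'example' => true } },
  { 'type' => 1, 'data' => 'compressed-bytes-go-here' }
]

docs.each { |d| puts "#{ftdc_kind(d)} -> #{ftdc_payload(d).class}" }
```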
data fields will contain “samples” in the form of:
Samples are collected by
This sample will be stored in the doc field of the metadata document.
During each collection interval (as configured by diagnosticDataCollectionPeriodMillis), a metric chunk will be created, and a sample will be collected, compressed, and stored in the data field as Binary Data.
bsondump will default to emitting JSON, so we can interact with this using the jq utility. For example, if we only want to review the Metadata Document this could be done as follows:
Working with Metric Chunks is a little more complicated, as they are actually zlib-compressed BSON documents. We’ll use the jq utility to select only the first chunk and the Ruby interpreter to decompress the zlib data. Note that the following command can be altered to navigate to other chunks (not only the first) as needed:
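The chunk-selection step can also be sketched in plain Ruby instead of jq. This is only an illustration under stated assumptions: it expects one JSON document per line, and it accepts the type field either as a plain integer or wrapped as `{"$numberInt": "1"}`, since the exact shape may depend on the bsondump version. The sample input and the helper names `type_of` and `nth_metric_chunk` are hypothetical.

```ruby
require 'json'

# Normalise the type field. Assumed shapes: a plain integer, or an
# Extended JSON wrapper such as {"$numberInt": "1"}.
def type_of(doc)
  t = doc['type']
  t.is_a?(Hash) ? Integer(t.fetch('$numberInt')) : t
end

# Return the n-th (0-based) metric chunk document from JSON lines,
# or nil when there are fewer than n + 1 chunks.
def nth_metric_chunk(json_lines, n = 0)
  docs = json_lines.each_line
                   .reject { |line| line.strip.empty? }
                   .map { |line| JSON.parse(line) }
  chunks = docs.select { |doc| type_of(doc) == 1 }
  chunks[n]
end

# Hypothetical input standing in for decoded FTDC documents.
sample = <<~'JSONL'
  {"type": 0, "doc": {"note": "hypothetical metadata"}}
  {"type": {"$numberInt": "1"}, "data": "first-chunk"}
  {"type": 1, "data": "second-chunk"}
JSONL

puts nth_metric_chunk(sample)['data']     # first-chunk
puts nth_metric_chunk(sample, 1)['data']  # second-chunk
```

Passing a different index as the second argument is the equivalent of altering the command to navigate to another chunk.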
You eagle-eyed Rubyists will notice that we’re clipping the first 4 bytes from the binary data we’re reading from STDIN. This is to drop the header before we try to decompress the stream.
If you don’t do this, zlib will complain and fail.
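The effect of the 4-byte header can be reproduced with Ruby's standard zlib bindings alone. The sketch below builds a stand-in payload (a 4-byte header followed by zlib-compressed bytes), then inflates it with and without the header. The header contents used here, a little-endian length of the uncompressed data, are an assumption made for the demo, and `build_fake_chunk` and `inflate_chunk` are hypothetical helper names.

```ruby
require 'zlib'

# Build a stand-in for a metric chunk's binary payload: a 4-byte header
# followed by zlib-compressed bytes. The header contents (little-endian
# uncompressed length) are an assumption for this demo.
def build_fake_chunk(raw)
  [raw.bytesize].pack('V') + Zlib::Deflate.deflate(raw)
end

# Drop the 4-byte header, then inflate the remainder.
def inflate_chunk(payload)
  Zlib::Inflate.inflate(payload.byteslice(4..-1))
end

raw     = ('serverStatus sample bytes' * 4).b
payload = build_fake_chunk(raw)

# With the header clipped, the original bytes come back.
puts inflate_chunk(payload) == raw

# With the header left in place, zlib rejects the stream.
begin
  Zlib::Inflate.inflate(payload)
rescue Zlib::Error => e
  puts "zlib failed: #{e.class}"
end
```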
The binary data has now been decompressed, and since it is BSON data we run it through bsondump again and voila:
Hopefully this helps shed some light on what FTDC data is and what it contains. In a future post we’ll look into doing something useful with this treasure trove of telemetry our clusters are generating every 1 second or so.