Post

MongoDB Driver Specifications

To ensure official MongoDB Drivers are developed with consistent functionality and APIs MongoDB maintains a set of public specifications (see GitHub Repository) that driver engineers can reference while implementing functionality. These specifications can (and should) be used by community engineers building a community library or driver that will communicate with a MongoDB cluster.

I will be diving deeper into some of these specifications in upcoming posts but to begin I’d like to summarize the existing specifications below.

Current Specifications

Authentication
MongoDB supports various authentication strategies across various versions. When authentication is turned on in the database, a driver must authenticate before it is allowed to communicate with the server. This spec defines when and how a driver performs authentication with a MongoDB server.
Performance Benchmarking
This document describes a standard benchmarking suite for MongoDB drivers.
BSON Corpus
The official BSON specification does not include test data, so this pseudo-specification describes tests for BSON encoding and decoding. It also includes tests for MongoDB's "Extended JSON" specification (hereafter abbreviated as extjson).
BSON Decimal128 Type Handling
MongoDB 3.4 introduces a new BSON type representing high precision decimal ("\x13"), known as Decimal128. 3.4 compatible drivers must support this type by creating a Value Object for it, possibly with accessor functions for retrieving its value in data types supported by the respective languages.
Round-tripping Decimal128 types between driver and server MUST not change its value or representation in any way. Conversion to and from native language types is complicated and there are many pitfalls to represent Decimal128 precisely in all languages.
Change Streams
As of version 3.6 of the MongoDB server a new $changeStream pipeline stage is supported in the aggregation framework. Specifying this stage first in an aggregation pipeline allows users to request that notifications are sent for all changes to a particular collection. This specification defines the means for creating change streams in drivers, as well as behavior during failure scenarios.
Client Side Encryption
MongoDB 4.2 introduced support for client side encryption, guaranteeing that sensitive data can only be encrypted and decrypted with access to both MongoDB and a separate key management provider (supporting AWS, Azure, GCP, a local provider, and KMIP). Once enabled, data can be seamlessly encrypted and decrypted with minimal application code changes.
Collation
As of MongoDB server version 3.4 (maxWireVersion 5), a collation option is supported by the query system for matching and sorting on language strings in a locale-aware fashion.
Command Monitoring
The performance monitoring specification defines a set of behaviour in the drivers for providing runtime information about commands to any 3rd party APM library as well internal driver use, such as logging.
Wire Compression
This specification describes how to implement Wire Protocol Compression between MongoDB drivers and mongod/mongos.
Compression is achieved through a new bi-directional wire protocol opcode, referred to as OP_COMPRESSED.
Server side compressor support is checked during the initial MongoDB Handshake, and is compatible with all historical versions of MongoDB. If a client detects a compatible compressor it will use the compressor for all valid requests.
Connection Monitoring and Pooling
Drivers currently support a variety of options that allow users to configure connection pooling behavior. Users are confused by drivers supporting different subsets of these options. Additionally, drivers implement their connection pools differently, making it difficult to design cross-driver pool functionality. By unifying and codifying pooling options and behavior across all drivers, we will increase user comprehension and code base maintainability.
Connection String
The purpose of the Connection String is to provide a machine readable way of configuring a MongoClient, allowing users to configure and change the connection to their MongoDB system without requiring any application code changes.
This specification defines how the connection string is constructed and parsed. The aim is not to list all of connection string options and their semantics. Rather it defines the syntax of the connection string, including rules for parsing, naming conventions for options, and standard data types.
CRUD API
The CRUD API defines a set of related methods and structures defining a driver's API. As not all languages/frameworks have the same abilities, parts of the spec may or may not apply. These sections have been called out.
GridFS
GridFS is a convention drivers use to store and retrieve BSON binary data (type "\x05") that exceeds MongoDB’s BSON-document size limit of 16 MiB. When this data, called a user file, is written to the system, GridFS divides the file into chunks that are stored as distinct documents in a chunks collection. To retrieve a stored file, GridFS locates and returns all of its component chunks. Internally, GridFS creates a files collection document for each stored file. Files collection documents hold information about stored files, and they are stored in a files collection.
This spec defines a basic API for GridFS. This spec also outlines advanced GridFS features that drivers can choose to support in their implementations. Additionally, this document attempts to clarify the meaning and purpose of all fields in the GridFS data model, disambiguate GridFS terminology, and document configuration options that were previously unspecified.
Initial DNS Seedlist Discovery
Presently, seeding a driver with an initial list of ReplicaSet or MongoS addresses is somewhat cumbersome, requiring a comma-delimited list of host names to attempt connections to. A standardized answer to this problem exists in the form of SRV records, which allow administrators to configure a single domain to return a list of host names. Supporting this feature would assist our users by decreasing maintenance load, primarily by removing the need to maintain seed lists at an application level.
This specification builds on the Connection String specification. It adds a new protocol scheme and modifies how the Host Information is interpreted.
Load Balancer Support
This specification defines driver behaviour when connected to MongoDB services through a load balancer.
Max Staleness
Read preference gains a new option, "maxStalenessSeconds".
A client (driver or mongos) MUST estimate the staleness of each secondary, based on lastWriteDate values provided in server hello responses, and select only those secondaries whose staleness is less than or equal to maxStalenessSeconds.
Most of the implementation of the maxStalenessSeconds option is specified in the Server Discovery And Monitoring Spec and the Server Selection Spec. This document supplements those specs by collecting information specifically about maxStalenessSeconds.
OP_MSG
OP_MSG is a bi-directional wire protocol opcode introduced in MongoDB 3.6 with the goal of replacing most existing opcodes, merging their use into one extendable opcode.
Handshake
MongoDB 3.4 has the ability to annotate connections with metadata provided by the connecting client. The intent of this metadata is to be able to identify client level information about the connection, such as application name, driver name and version. The provided information will be logged through the mongod/mongos.log and the profile logs; this should enable sysadmins to easily backtrack log entries the offending application. The active connection data will also be queryable through aggregation pipeline, to enable collecting and analyzing driver trends.
After connecting to a MongoDB node a hello command (if Versioned API is requested) or a legacy hello command is issued, followed by authentication, if appropriate. This specification augments this handshake and defines certain arguments that clients provide as part of the handshake.
OCSP Support
This specification is about the ability for drivers to to support OCSP—Online Certificate Status Protocol (RFC 6960)—and two of its related extensions: OCSP stapling (RFC 6066) and Must-Staple (RFC 7633).
Polling SRV Records for mongos Discovery
Currently the Initial DNS Seedlist Discovery functionality provides a static seedlist when a MongoClient is constructed. Periodically polling the DNS SRV records would allow for the mongos proxy list to be updated without having to change client configuration.
This specification builds on top of the original Initial DNS Seedlist Discovery specification, and modifies the Server Discovery and Monitoring specification's definition of monitoring a set of mongos servers in a Sharded TopologyType.
Read and Write Concern
A driver must support configuring and sending read concern and write concerns to a server. This specification defines the API drivers must implement as well as how that API is translated into messages for communication with the server.
Retryable Reads
This specification is about the ability for drivers to automatically retry any read operation that has not yet received any results—due to a transient network error, a "not writable primary" error after a replica set failover, etc.—exactly once.
Retryable Writes
MongoDB 3.6 will implement support for server sessions, which are shared resources within a cluster identified by a session ID. Drivers compatible with MongoDB 3.6 will also implement support for client sessions, which are always associated with a server session and will allow for certain commands to be executed within the context of a server session.
Additionally, MongoDB 3.6 will utilize server sessions to allow some write commands to specify a transaction ID to enforce at-most-once semantics for the write operation(s) and allow for retrying the operation if the driver fails to obtain a write result (e.g. network error or "not writable primary" error after a replica set failover). This specification will outline how an API for retryable write operations will be implemented in drivers. The specification will define an option to enable retryable writes for an application and describe how a transaction ID will be provided to write commands executed therein.
Server Discovery and Monitoring
This spec defines how a MongoDB client discovers and monitors one or more servers. It covers monitoring a single server, a set of mongoses, or a replica set. How does the client determine what type of servers they are? How does it keep this information up to date? How does the client find an entire replica set from a seed list, and how does it respond to a stepdown, election, reconfiguration, or network error?
All drivers must answer these questions the same. Or, where platforms' limitations require differences among drivers, there must be as few answers as possible and each must be clearly explained in this spec. Even in cases where several answers seem equally good, drivers must agree on one way to do it.
Server Monitoring
This spec defines how a driver monitors a MongoDB server. In summary, the client monitors each server in the topology. The scope of server monitoring is to provide the topology with updated ServerDescriptions based on hello or legacy hello command responses.
Server Selection
MongoDB deployments may offer more than one server that can service an operation. This specification describes how MongoDB drivers and mongos shall select a server for either read or write operations. It includes the definition of a "read preference" document, configuration options, and algorithms for selecting a server for different deployment topologies.
Driver Sessions
Version 3.6 of the server introduces the concept of logical sessions for clients. A session is an abstract concept that represents a set of sequential operations executed by an application that are related in some way. This specification is limited to how applications start and end sessions. Other specifications define various ways in which sessions are used (e.g. causally consistent reads, retryable writes, or transactions).
This specification also discusses how drivers participate in distributing the cluster time throughout a deployment, a process known as "gossipping the cluster time". While gossipping the cluster time is somewhat orthogonal to sessions, any driver that implements sessions MUST also implement gossipping the cluster time, so it is included in this specification.
Snapshot Reads
Version 5.0 of the server introduces support for read concern level "snapshot" (non-speculative) for read commands outside of transactions, including on secondaries. This spec builds upon the Sessions Specification to define how an application requests "snapshot" level read concern and how a driver interacts with the server to implement snapshot reads.
SOCKS5 Support
SOCKS5 is a standardized protocol for connecting to network services through a separate proxy server. It can be used for connecting to hosts that would otherwise be unreachable from the local network by connecting to a proxy server, which receives the intended target host’s address from the client and then connects to that address.
Convenient API for Transactions
Reliably committing a transaction in the face of errors can be a complicated endeavor using the MongoDB 4.0 drivers API. This specification introduces a withTransaction method on the ClientSession object that allows application logic to be executed within a transaction. This method is capable of retrying either the commit operation or entire transaction as needed (and when the error permits) to better ensure that the transaction can complete successfully.
Transactions
Version 4.0 of the server introduces multi-statement transactions. This spec builds upon the Driver Sessions Specification to define how an application uses transactions and how a driver interacts with the server to implement transactions.
The API for transactions must be specified to ensure that all drivers and the mongo shell are consistent with each other, and to provide a natural interface for application developers and DBAs who use multi-statement transactions.
Unified Test Format
This project defines a unified schema for YAML and JSON specification tests, which run operations against a MongoDB deployment. By conforming various spec tests to a single schema, drivers can implement a single test runner to execute acceptance tests for multiple specifications, thereby reducing maintenance of existing specs and implementation time for new specifications.
URI Options
Historically, URI options have been defined in individual specs, and drivers have defined any additional options independently of one another. Because of the frustration due to there not being a single place where all of the URI options are defined, this spec aims to do just that—namely, provide a canonical list of URI options that each driver defines.
Versioned API
As MongoDB moves toward more frequent releases (a.k.a. continuous delivery), we want to enable users to take advantage of our rapidly released features, without exposing applications to incompatible server changes due to automatic server upgrades. A versioned API will help accomplish that goal.

Older Specifications

There are also some older specifications in the repository that are there for historic context. If you’re writing a new driver this information is still useful though and should be reviewed.

Handling of DBRefs
DBRefs are a convention for expressing a reference to another document as an embedded document (i.e. BSON type 0x03). Several drivers provide a model class for encoding and/or decoding DBRef documents. This specification will both define the structure of a DBRef and provide guidance for implementing model classes in drivers that choose to do so.
Bulk API
Driver support for Bulk write operations
Enumerating Collections
A driver can contain a feature to enumerate all collections belonging to a database. This specification defines how collections should be enumerated.
Enumerating Databases
A driver can provide functionality to enumerate all databases on a server. This specification defines several methods for enumerating databases.
Enumerating Indexes
A driver can contain a feature to enumerate all indexes belonging to a collection. This specification defines how indexes should be enumerated.
Extended JSON
MongoDB Extended JSON is a string format for representing BSON documents. This specification defines the canonical format for representing each BSON type in the Extended JSON format. Thus, a tool that implements Extended JSON will be able to parse the output of any tool that emits Canonical Extended JSON. It also defines a Relaxed Extended JSON format that improves readability at the expense of type information preservation.
Find, getMore and killCursors commands
The Find, GetMore and KillCursors commands in MongoDB 3.2 or later replace the use of the legacy OP_QUERY, OP_GET_MORE and OP_KILL_CURSORS wire protocol messages. This specification lays out how drivers should interact with the new commands when compared to the legacy wire protocol level messages.
Index Management
The index management spec defines a set of behaviour in the drivers for creating, removing and viewing indexes in a collection. It defines implementation details when required but also provides flexibility in the driver in that one or both of 2 unique APIs can be chosen to be implemented.
ObjectID Format
This specification documents the format and data contents of ObjectID BSON values that the drivers and the server generate when no field values have been specified (e.g. creating an ObjectID BSON value when no _id field is present in a document). It is primarily aimed to provide an alternative to the historical use of the MD5 hashing algorithm for the machine information field of the ObjectID, which is problematic when providing a FIPS compliant implementation. It also documents existing best practices for the timestamp and counter fields.
Write Commands
Method to do writes (insert/update/delete) that declares the write concern up front and support for batch operations
Handling of Native UUID Types
The Java, C#, and Python drivers natively support platform types for UUID, all of which by default encode them to and decode them from BSON binary subtype 3. However, each encode the bytes in a different order from the others. To improve interoperability, BSON binary subtype 4 was introduced and defined the byte order according to RFC 4122, and a mechanism to configure each driver to encode UUIDs this way was added to each driver. The legacy representation remained as the default for each driver.
Server Wire version and Feature List
The 'WireVersion' captures all "protocol events" the write protocol went through. A protocol event is a change in the syntax of messages on the wire or the semantics of existing messages. We may also add "logical" entries for releases, although that's not mandatory.
We use the wire version to determine if two agents (a driver, a mongos, or a mongod) can interact. Each agent carries two versions, a 'max' and a 'min' one. If the two agents are on the same 'max' number, they strictly speak the same wire protocol and it is safe to allow them to communicate. If two agents' ranges do not intersect, they should not be allowed to communicate.
If two agents have at least one version in common they can communicate, but one of the sides has to be ready to compensate for not being on its partner version.

What’s Next

The MongoDB JIRA project for specifications (SPEC) is not publicly available, however the DRIVERS project is. As new features are proposed that will impact all drivers, tickets are added to the DRIVERS project which anyone can review.

More often than not these DRIVERS tickets influence specification changes and are a good source of information as to what changes may be coming.

Interested in any particular specification and want a deeper dive? Let me know in the comments.

This post is licensed under CC BY 4.0 by the author.

Comments powered by Disqus.