
Visualizing a Replica Set's Sync Source Chain

A MongoDB replica set is a group of mongod processes that maintain the same data set. The PRIMARY node receives all write operations, and the SECONDARY nodes replicate the PRIMARY’s oplog and apply its operations so that their data sets reflect the PRIMARY’s data set.

Secondaries copy data from the primary member to maintain an up-to-date copy of the set’s data. With chained replication, which is enabled by default, sync source selection also allows a secondary member to replicate from another secondary member instead of from the primary.
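Chained replication is controlled by the settings.chainingAllowed field of the replica set configuration. The following is a minimal sketch of toggling it, not a definitive procedure: rs.conf(), rs.reconfig() and settings.chainingAllowed are MongoDB's own, while the function name setChainingAllowed and its rsHelper parameter are made up here so the logic can be exercised outside the shell.

```javascript
// Hypothetical wrapper: set settings.chainingAllowed on a replica set config.
// `rsHelper` is assumed to expose conf() and reconfig(), like the shell's `rs` object.
function setChainingAllowed(rsHelper, allowed) {
  var cfg = rsHelper.conf();
  // Older configurations may not carry a settings sub-document yet.
  cfg.settings = cfg.settings || {};
  cfg.settings.chainingAllowed = allowed;
  // Submit the modified configuration and hand back the server's reply.
  return rsHelper.reconfig(cfg);
}
```

From a mongo shell connected to the primary, this would be called as setChainingAllowed(rs, false).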

To determine which node each SECONDARY is syncing from, you have to manually review the output of the rs.status() shell helper (or the replSetGetStatus command) and read each member’s syncSourceHost field.
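That manual review amounts to pulling a few fields out of every member document. A minimal sketch, assuming a status document shaped like the rs.status() output; the function name listSyncSources and the sample document are made up for illustration and are not real server output:

```javascript
// Hypothetical helper: list each member alongside the host it syncs from.
// `status` is assumed to look like the document returned by rs.status().
function listSyncSources(status) {
  return status.members.map(function (m) {
    return {
      name: m.name,
      state: m.stateStr,
      // An empty syncSourceHost means the member has no sync source (e.g. the primary).
      syncSource: m.syncSourceHost || null
    };
  });
}

// Hand-written sample standing in for rs.status().
var sampleStatus = {
  set: "replset",
  members: [
    { _id: 0, name: "localhost:27017", stateStr: "PRIMARY", syncSourceHost: "" },
    { _id: 1, name: "localhost:27018", stateStr: "SECONDARY", syncSourceHost: "localhost:27017" }
  ]
};

var syncSources = listSyncSources(sampleStatus);
```

In the mongo shell the same function could be fed live data with listSyncSources(rs.status()).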

When evaluating larger clusters, this approach can be cumbersome.

For example, a 9-node replica set created with mtools and m produces the following rs.status() output:

# launch a 9 node replica set using MongoDB 4.4.4
m 4.4.4-ent
mlaunch init --replicaset --nodes 9 --binarypath $(m bin 4.4.4-ent)

> rs.status()
{
    "set" : "replset",
    ...
    "members" : [
        {
            "_id" : 0.0,
            "name" : "localhost:27017",
            ...
            "syncSourceHost" : "",
            ....
        },
        {
            "_id" : 1.0,
            "name" : "localhost:27018",
            ...
            "syncSourceHost" : "localhost:27017",
            ...
        },
        {
            "_id" : 2.0,
            "name" : "localhost:27019",
            ...
            "syncSourceHost" : "localhost:27017",
            ...
        },
        {
            "_id" : 3.0,
            "name" : "localhost:27020",
            ...
            "syncSourceHost" : "localhost:27017",
            ...
        },
        {
            "_id" : 4.0,
            "name" : "localhost:27021",
            ...
            "syncSourceHost" : "localhost:27017",
            ...
        },
        {
            "_id" : 5.0,
            "name" : "localhost:27022",
            ...
            "syncSourceHost" : "localhost:27017",
            ...
        },
        {
            "_id" : 6.0,
            "name" : "localhost:27023",
            ...
            "syncSourceHost" : "localhost:27017",
            ...
        },
        {
            "_id" : 7.0,
            "name" : "localhost:27024",
            ...
            "syncSourceHost" : "localhost:27017",
            ...
        },
        {
            "_id" : 8.0,
            "name" : "localhost:27025",
            ...
            "syncSourceHost" : "localhost:27017",
            ...
        }
    ],
    ...
}

A much more legible version of the above is:

> printSyncSourceTree(rs.status());
Replication Sync Source Tree
============================
-- [0] localhost:27017 (PRIMARY)
---- [1] localhost:27018 (SECONDARY)
---- [2] localhost:27019 (SECONDARY)
---- [3] localhost:27020 (SECONDARY)
---- [4] localhost:27021 (SECONDARY)
---- [5] localhost:27022 (SECONDARY)
---- [6] localhost:27023 (SECONDARY)
---- [7] localhost:27024 (SECONDARY)
---- [8] localhost:27025 (SECONDARY)

The tree above was generated using the printSyncSourceTree() helper function (source code at end of post) from the mongo shell.
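The helper's actual source is not reproduced in this section. As a rough idea of how such a tree could be built, here is a hypothetical sketch (the name buildSyncSourceTree is invented and this is not the post's implementation). It assumes each member carries the _id, name, stateStr and syncSourceHost fields seen in rs.status(), and it returns the lines instead of printing them.

```javascript
// Hypothetical sketch: render the sync source tree as an array of text lines.
function buildSyncSourceTree(status) {
  var title = "Replication Sync Source Tree";
  var lines = [title, "=".repeat(title.length)];
  var names = status.members.map(function (m) { return m.name; });

  // Append one member, then recurse into the members that sync from it.
  function walk(member, depth) {
    lines.push("-".repeat(depth * 2) + " [" + member._id + "] " +
               member.name + " (" + member.stateStr + ")");
    status.members
      .filter(function (m) { return m.syncSourceHost === member.name; })
      .forEach(function (m) { walk(m, depth + 1); });
  }

  // Roots are members with no sync source, or whose source is not a listed member.
  status.members
    .filter(function (m) {
      return !m.syncSourceHost || names.indexOf(m.syncSourceHost) === -1;
    })
    .forEach(function (m) { walk(m, 1); });

  return lines;
}
```

In the mongo shell, buildSyncSourceTree(rs.status()).forEach(function (l) { print(l); }) would write the lines out one by one.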

When all nodes are syncing from the PRIMARY, the sync source topology is not difficult to visualize. Let’s mix this up by manually changing which member some of the SECONDARY nodes sync from.

// Tell the member with _id `sourceId` to sync from the member with _id `syncTargetId`
function assignSyncSource(sourceId, syncTargetId) {
  var members = rs.status().members;
  // Look up both members by their replica set _id
  var source = members.filter(obj => obj._id === sourceId)[0];
  var target = members.filter(obj => obj._id === syncTargetId)[0];
  // replSetSyncFrom is run on the member whose sync source is being changed
  var conn = new Mongo(source.name);
  var result = conn.adminCommand({ replSetSyncFrom: target.name });
  printjson(result);
}

// Members 2 and 3 sync from member 1; members 4 and 5 sync from member 3
assignSyncSource(3, 1);
assignSyncSource(2, 1);
assignSyncSource(5, 3);
assignSyncSource(4, 3);

printSyncSourceTree(rs.status());

The following output is our new replication sync source tree after fiddling with the sync source assignments:

Replication Sync Source Tree
============================
-- [0] localhost:27017 (PRIMARY)
---- [1] localhost:27018 (SECONDARY)
------ [2] localhost:27019 (SECONDARY)
------ [3] localhost:27020 (SECONDARY)
-------- [4] localhost:27021 (SECONDARY)
-------- [5] localhost:27022 (SECONDARY)
---- [6] localhost:27023 (SECONDARY)
---- [7] localhost:27024 (SECONDARY)
---- [8] localhost:27025 (SECONDARY)
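Reading the tree from a leaf upward gives that member's sync source chain: member [4], for instance, replicates through [3] and [1] before reaching the PRIMARY. A minimal sketch of walking that chain programmatically, assuming the same rs.status()-shaped input; the function name syncSourceChain and the trimmed sample document are hypothetical:

```javascript
// Hypothetical helper: follow syncSourceHost links from one member up to the root.
function syncSourceChain(status, startName) {
  var byName = {};
  status.members.forEach(function (m) { byName[m.name] = m; });

  var chain = [];
  var current = byName[startName];
  // Stop at a member with no sync source, an unknown host, or a host already visited.
  while (current && chain.indexOf(current.name) === -1) {
    chain.push(current.name);
    current = byName[current.syncSourceHost];
  }
  return chain;
}

// Hand-written subset of the topology above; not real server output.
var chainedStatus = {
  members: [
    { _id: 0, name: "localhost:27017", stateStr: "PRIMARY", syncSourceHost: "" },
    { _id: 1, name: "localhost:27018", stateStr: "SECONDARY", syncSourceHost: "localhost:27017" },
    { _id: 3, name: "localhost:27020", stateStr: "SECONDARY", syncSourceHost: "localhost:27018" },
    { _id: 4, name: "localhost:27021", stateStr: "SECONDARY", syncSourceHost: "localhost:27020" }
  ]
};

// Yields 27021, then 27020, then 27018, then 27017.
var chain = syncSourceChain(chainedStatus, "localhost:27021");
```

The length of the returned array minus one is the number of hops between a member and the PRIMARY.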

Give this a shot and let me know what you think.

This post is licensed under CC BY 4.0 by the author.
