Published on 09.10.2015
You are probably here because our first articles baited you, or you came via Google and are thinking about using MongoDB in your application. You probably want to know how to use it to gain an advantage over a relational database like Postgres. Let us first revisit the CAP theorem for a second.
So Mongo is on the CP side. But what does that mean in practice?
Mongo uses replication to prevent a crashing server from ruining your night. Only the primary node accepts writes. It records all changes in the so-called oplog (operations log), which is then replicated to the secondaries. You are probably thinking of master/slave now, but replication is not exactly master/slave: instead of you designating the master, the nodes elect the primary themselves.
During an election, nodes won’t answer read or write requests. Since the nodes know how many members are in the replica set, they can also determine whether they have a majority. If you have 5 nodes, for example, and 2 nodes get split off, the 3-node partition elects a new primary. The other 2 nodes see that they have no majority and don’t elect one. Usually, all read requests go to the primary too, but if consistency is a lesser concern, you can also read from nearby secondaries.
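The majority rule behind this can be sketched in a few lines. This is an illustration of the principle, not MongoDB’s actual election code: a side of a partition may elect a primary only if it can see more than half of the replica set’s members.

```javascript
// Sketch of the majority rule used during elections (illustration only).
// A partition may elect a primary only if it sees more than half of the set.
function hasMajority(visibleNodes, totalNodes) {
  return visibleNodes > totalNodes / 2;
}

// A 5-node set split into 3 + 2: only the 3-node side elects a primary.
console.log(hasMajority(3, 5)); // true
console.log(hasMajority(2, 5)); // false
```

Note that `2 > 4 / 2` is false as well, which is exactly why an even number of nodes is a problem, as the next section shows.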
You are probably thinking: why would I read from a potentially inconsistent state? Take a profile image on a social network like Facebook. Is it an issue if you read an “old” profile photo until replication is done? Doesn’t network speed beat perfect consistency here? Since the nodes don’t answer requests in a minority situation or during an election, we lose the availability in the theorem.
There are 3 kinds of nodes in MongoDB: Primaries, Secondaries and Arbiters. By default the Primary answers read and write requests and replicates its changes to the Secondaries. When the Primary fails, the Secondary nodes begin an election for a new Primary.
You should always have an odd number of voting nodes. Think of a replica set with 4 nodes. The network partitions it into 2 sets of 2 nodes. Neither side can reach a majority, so no primary is elected. The Arbiter solves this problem. Arbiters hold no data; they exist only to vote. This way you can still reach a majority even with an even number of data-holding nodes.
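A replica set with two data-holding nodes and one arbiter could be set up like this. This is a minimal sketch using the standard shell helper `rs.initiate`; the hostnames and the set name `rs0` are placeholders, not values from this article:

```javascript
// Run once in the mongo shell on one of the nodes (hostnames are placeholders).
// The arbiter votes in elections but stores no data.
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "db1.example.com:27017" },
    { _id: 1, host: "db2.example.com:27017" },
    { _id: 2, host: "arbiter.example.com:27017", arbiterOnly: true }
  ]
});
```

With `arbiterOnly: true` the third member breaks ties: if the two data nodes get separated, whichever side still sees the arbiter has 2 of 3 votes and keeps a primary.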
You can set write concerns on your requests. The default is that the Primary acknowledges as soon as the change is in the server’s memory. If you also set the journaling flag, you get the acknowledgement once the data is written to disk. The write flag lets you set the number of replica set members that must replicate the change.
With the flag set to 2 you get an acknowledgement as soon as the Primary and one other node have applied the change. You can also set it to “majority”: then you get the acknowledgement once the change has reached a majority of the voting nodes. Keep in mind that “majority” creates a huge overhead when you do many writes.
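The write-concern levels above map to small option documents passed per request. The field names (`w`, `j`) are the standard MongoDB write-concern fields; the `videos` collection in the comment is just an example:

```javascript
// Write-concern documents, from weakest to strongest acknowledgement.
const inMemory  = { w: 1 };                   // default: in the Primary's memory
const journaled = { w: 1, j: true };          // flushed to the Primary's journal on disk
const twoNodes  = { w: 2 };                   // Primary plus one other member
const majority  = { w: "majority", j: true }; // a majority of the voting nodes

// In the mongo shell you would pass one of these per request, e.g.:
// db.videos.insertOne({ title: "intro" }, { writeConcern: majority });
console.log(majority.w); // "majority"
```

The stronger the concern, the longer the client blocks waiting for the acknowledgement, which is exactly the overhead the paragraph above warns about.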
When a primary steps down or disappears, the secondaries compare their optimes. The optime marks the last primary operation that has been applied to a node. The nodes vote for the secondary with the highest optime, and that node then accepts requests as the primary of the replica set.
When an old primary reconnects, it compares its operation log with the current primary’s and rolls back to the last common point before receiving all newer changes. The rolled-back data is stored on the file system, so an administrator can merge these changes back into the database.
Mongo saves documents in a JSON-like binary format called BSON (binary JSON). Documents can embed other documents and can be up to 16 MB in size. For greater needs there is GridFS, which splits files into chunks. A “table” in Mongo is called a collection. Documents inside a collection can vary: for example, you can put students and teachers into one collection.
Both could keep their special fields inside an embedded document, or simply carry them in the main document. A common practice is to “toss everything into one document”. One word of warning: you can’t fetch an embedded document without retrieving the embedding document first (which means iterating). That can certainly be a disadvantage, but not in all cases.
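Such a mixed collection might look like the following. This is a sketch with made-up names and fields; in the shell you would `insertMany` these into one collection and query by the shared fields:

```javascript
// Two differently shaped documents living in the same collection (made-up fields).
const people = [
  { name: "Ada",  role: "student", enrollment: { year: 2015, program: "CS" } },
  { name: "Alan", role: "teacher", courses: ["Databases", "Logic"] }
];

// Queries can still select on the fields both shapes share,
// the equivalent of db.people.find({ role: "teacher" }):
const teachers = people.filter(d => d.role === "teacher");
console.log(teachers.length); // 1
```

Documents that lack a field simply don’t match queries on it; no schema migration is needed to mix the two shapes.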
In the introductory blog post we mentioned the example of a mail inbox. Let us now look at another example. You have channels, and within each there are videos; think of Youtube. The first idea would be to simply toss the videos into a channel document. But there is a problem: to find the most popular videos, you would need to scan all channels for the embedded document with the highest popularity. RIP performance.
But wait, you can include object references, so your channel can hold references to all its videos, which live in their own collection. Some of you might say: “But that’s what RDBMS have been doing for decades, my friend.” Yes, I admit that, but any relational database designer would visit your house with pitchfork and torch if you suggested putting comments into the videos table. Why not? Comments belong to videos. Did you ever try to search for a specific comment on Youtube without knowing which video it belongs to?
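The referencing variant can be sketched as plain data. The ids, field names, and the sort are made up for illustration; the point is that videos in their own collection can be ranked with one query instead of a scan over every channel:

```javascript
// Referencing instead of embedding (ids and fields are made up).
// Videos live in their own collection; the channel only stores their ids.
const videos = [
  { _id: "v1", channelId: "c1", title: "Intro",     views: 120 },
  { _id: "v2", channelId: "c1", title: "Deep dive", views: 4200 }
];
const channel = { _id: "c1", name: "dbtalk", videoIds: ["v1", "v2"] };

// The equivalent of db.videos.find().sort({ views: -1 }).limit(1):
const mostPopular = [...videos].sort((a, b) => b.views - a.views)[0];
console.log(mostPopular.title); // "Deep dive"
```

With an index on `views`, that query never touches the channel documents at all.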
So, you want to use MongoDB, but don’t want to base a full application on it? You can also use Mongo complementary to your usual RDBMS. A typical use case is when you need fast access to recently added elements, or when some elements are hot spots that are loaded very often.
For the second case, think of Youtube again. When I post a video to my 0 subscribers, probably not many people will access it. Now think of a big Youtuber with millions of followers: whatever that person uploads will be accessed very often, and every access has to join videos with comments. What if comments can reference comments? Joins within joins within joins, with the usable data buried in a pile of garbage. Instead of denormalizing your relational database for speed, why not take Mongo? You save the video with embedded comments and can access everything without a single join. So instead of joining your performance away, you read a single document per access.
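Such a hot-spot document could look like this. A minimal sketch with invented users and texts; even nested replies stay inside the one document, so one read returns the whole thread:

```javascript
// Embedding comments (and replies) in the video document: one read, no joins.
const video = {
  _id: "v42",
  title: "My fancy video",
  comments: [
    {
      user: "alice", text: "first!",
      replies: [{ user: "bob", text: "congrats" }] // nested reply, same document
    }
  ]
};
console.log(video.comments[0].replies[0].user); // "bob"
```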
Maybe some of you now wonder what we do if we have so many comments that they no longer fit into the document. We can store references in the video to separate comment documents; those are filled with comments until they are full, then another one is created and referenced in the video. Alternatively you can use GridFS, which splits big documents into small chunks.
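The first option is commonly called the bucket pattern. Here is a minimal in-memory sketch of it; all names are made up, the bucket size is tiny for illustration (in practice it would hold hundreds of comments), and the `buckets` object stands in for a separate collection:

```javascript
// Bucket pattern sketch: comments go into fixed-size bucket documents
// that are referenced from the video document.
const BUCKET_SIZE = 2; // tiny for illustration

const video = { _id: "v1", title: "Intro", commentBuckets: [] };
const buckets = {}; // stands in for a comment_buckets collection

function addComment(comment) {
  const lastId = video.commentBuckets[video.commentBuckets.length - 1];
  let bucket = buckets[lastId];
  if (!bucket || bucket.comments.length >= BUCKET_SIZE) {
    // current bucket is full (or none exists): create and reference a new one
    const id = "b" + (video.commentBuckets.length + 1);
    bucket = { _id: id, videoId: video._id, comments: [] };
    buckets[id] = bucket;
    video.commentBuckets.push(id);
  }
  bucket.comments.push(comment);
}

["first!", "nice", "thanks"].forEach(addComment);
console.log(video.commentBuckets.length); // 2 (a full bucket and one with a single comment)
```

The video document stays small and bounded, while loading the first bucket is still a single extra read rather than a join.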
Let me start with the biggest ones: there are no joins, and operations are atomic only on single documents. If you want to change a document and a referenced document, that means 2 separate changes which are not transactionally connected. If you store your users’ videos in GridFS and then a user is deleted, you have to remove the objects from GridFS yourself, since Mongo has no way of automatically detecting the dangling reference.
If you forget that numerous times and later want to remove everything unreferenced, you have to iterate over all of it. Using the write flag “majority” introduces a lot of overhead on writes, since your client has to wait for the ACK. If you don’t use it, you risk a split-brain situation until the old primary notices that it has lost the majority.
Adding another “clean” node to the replica set can effectively trigger a partition: the full initial sync might make the primary unresponsive, which leads to an election and, in some cases, loss of changes. The documentation therefore suggests seeding a new node with a copy of another secondary’s dataset to avoid a full sync.