MongoDB – Part 6 – GridFS

If you had to take a wild guess at what GridFS is used for, you’d probably say some kind of file system, and you wouldn’t be completely wrong. However, GridFS isn’t actually a file system; it’s a convention for storing large binary files inside MongoDB.

I haven’t covered storing binary data in MongoDB yet, but it’s possible to store binary data in standard documents without using GridFS at all, using the BSON type BinData. It’s not very well supported by the mongo client, but most language drivers support it well; just google “MongoDB BinData”. This is all well and good. In fact, if you’re storing under 16MB of data per file, this is the recommended approach. If, however, you are storing binary files larger than 16MB, GridFS is the way to go, because 16MB is the maximum size of a document in MongoDB.
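To make the 16MB rule a little more concrete, here is a rough Python sketch of the decision, plus the general idea of splitting a large file into numbered chunks. It does not talk to MongoDB at all: the function names are made up for illustration, and the 255KB chunk size is an assumption based on the usual GridFS default, so treat it as a sketch rather than what any driver actually does.

```python
# Illustrative only: these helpers are hypothetical and are not part of any MongoDB driver.

MAX_DOCUMENT_BYTES = 16 * 1024 * 1024  # MongoDB's maximum document size (16MB)
CHUNK_SIZE_BYTES = 255 * 1024          # assumed chunk size (the usual GridFS default)


def choose_storage(size_bytes):
    """Return "BinData" for files under the document limit, otherwise "GridFS"."""
    return "BinData" if size_bytes < MAX_DOCUMENT_BYTES else "GridFS"


def split_into_chunks(data, chunk_size=CHUNK_SIZE_BYTES):
    """Split raw bytes into ordered chunk records, each with a sequence number n."""
    return [
        {"n": index, "data": data[offset:offset + chunk_size]}
        for index, offset in enumerate(range(0, len(data), chunk_size))
    ]


if __name__ == "__main__":
    payload = b"x" * (600 * 1024)  # a 600KB stand-in for a real file
    print(choose_storage(len(payload)))
    print(len(split_into_chunks(payload)), "chunks")
```

A real document also carries field names and other overhead, so a file sitting right at the limit will not fit in a single document in practice.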

MongoDB – Part 5 – Sharding

Sharding, the art of scalability. That’s a bold statement; after all, there are two very important arts to scalability. What shards allow you to do, however, is scale out. This means you no longer need to keep all of your data on one hard drive or machine, or even in the same warehouse or continent. You can have as many machines as you like, in as many geographic locations as you like, connected together and serving data as if it were all stored in one central location.
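As a toy illustration of the idea, the Python sketch below routes each key to one of several machines by hashing it, so callers never need to know where the data physically lives. The shard names and the `shard_for` helper are hypothetical, and this is not how MongoDB itself routes requests; it only shows the general principle of spreading data across machines.

```python
import hashlib

# Hypothetical shard names, purely for illustration.
SHARDS = ["shard-eu", "shard-us", "shard-asia"]


def shard_for(key, shards=SHARDS):
    """Pick a shard for a key by hashing it; the same key always maps to the same shard."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    for user in ("user:1", "user:2", "user:3"):
        print(user, "->", shard_for(user))
```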

If you’re not already familiar with the terms scaling up/vertically and scaling out/horizontally, carry on reading. Otherwise you can skip the next few paragraphs.

MongoDB – Part 4 – Replication

CAP, the theorem behind MongoDB and most NoSQL databases, states that no distributed system can provide all three of consistency, availability and partition tolerance at the same time. In this article I’m going to be writing about what MongoDB offers in the replication sector and how you can utilise replication to tune the trade-offs between those three properties.

I’ll be covering: how to create replica sets, how replication works in MongoDB, how to configure individual nodes in a set, how to retrieve the status of nodes, how to configure write concern when executing write operations, and common CAP configurations.

MongoDB – Part 3 – Indexes

It has been a while since my last post on MongoDB, but I’m back and looking to finish off this series over the next 6-8 weeks (Edit: 6-8 months). Anyway, in this article I’m going to be covering all of the different types of indexes you can use in MongoDB.

You’ll have heard of most of these before, provided you’ve used almost any other DBMS. I’ll be providing examples of how to create each index and when you would want to use it, and at the end I’ll throw together a few must-know tips.
