NoSQL Essentials – Part 6 – Migrations, Polyglot persistence and more

This is the final article in my NoSQL concepts series before I start to focus on a specific database, probably MongoDB. I’ve covered all of the popular NoSQL database types (key-value, document, column-family and graph) and given an in-depth overview of NoSQL as a whole. To round everything off, I’m writing this article to give some final tips and tricks.

Throughout this series I have been writing about what I have learnt from reading the NoSQL Distilled book by Martin Fowler and Pramod J. Sadalage. Here I am going to cover the final few chapters: handling schema migration, using polyglot persistence, additional storage engines and how to choose the right database.


NoSQL Essentials – Part 5 – Graph databases

Neo4J

So far I’ve covered key-value, document and column-family databases. All that’s left now are graph databases. Surprisingly, they’re nothing like what I’ve covered so far. They’re used in completely different scenarios from other NoSQL stores and are structured like no other database out there.

When thinking about graph databases you can almost forget everything you know about NoSQL. You can forget about tables, columns, rows and aggregates. Instead you need to know about nodes (entities), edges (relationships) and properties. Common graph databases include Neo4J, Infinite Graph, OrientDB and FlockDB. In this article I am going to focus on the most popular of the bunch, Neo4J.

Graph databases are commonly used in the social scene. Their main goal is to map huge amounts of connected data and to find patterns in it. For example, this could be finding people you may know based on who your friends know, or finding bands you might like based on who your friends like.
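To make the "people you may know" idea concrete, here is a minimal sketch in plain Python. It is not Neo4J code and does not use any graph database API; the friendship data, the names and the `people_you_may_know` function are all made up for illustration. It simply walks two hops along friendship edges and counts mutual friends.

```python
from collections import Counter

# Hypothetical friendship graph: each key is a person (a node) and each
# value is the set of people they are friends with (the edges).
FRIENDS = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}


def people_you_may_know(graph, person):
    """Rank non-friends of `person` by the number of mutual friends."""
    direct = graph.get(person, set())
    mutual_counts = Counter()
    for friend in direct:
        # Second hop: look at each friend's own friends.
        for candidate in graph.get(friend, set()):
            if candidate != person and candidate not in direct:
                mutual_counts[candidate] += 1
    # Most mutual friends first; ties are broken alphabetically.
    return sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))


suggestions = people_you_may_know(FRIENDS, "alice")
print(suggestions)
```

A graph database is built to answer this kind of multi-hop question directly, rather than leaving the traversal to application code as this sketch does.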


NoSQL Essentials – Part 4 – Column-family databases

Column-Family Databases

Column-family databases aren’t a million miles away from document databases. They share most of the same pros and cons. There is, however, one significant difference: column-family stores such as Cassandra use peer-to-peer replication, whereas document databases such as MongoDB use master-slave (primary-secondary) replication.

As with all NoSQL database types, you have a lot of databases to choose from. Common column-family data stores include HBase, Cassandra, Amazon SimpleDB and Hypertable. HBase is an Apache project and is arguably the most popular column-family store. Recently, however, Cassandra has been getting a lot of attention, so that is what I will be covering.

Differences from relational databases.

  1. In column-family databases, each row consists of a collection of column => value pairs. A collection of similar rows then makes up a column family. In a relational database, this would be equivalent to a collection of rows making up a table. The main difference is that in a column-family database, rows do not have to contain the same columns.
  2. An RDBMS imposes a high cost of schema change in exchange for a low cost of query change. Column families impose little to no cost for schema change, in exchange for a slightly higher cost of query change.
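The two differences above can be pictured with a small sketch in plain Python. This is only a mental model using nested dictionaries, not how Cassandra stores data or how you would talk to it; the `customers` data and both helper functions are hypothetical.

```python
# A column family modelled as: row key -> {column name: value}.
# Unlike a relational table, each row carries only the columns it needs.
customers = {
    "row-1": {"name": "Ann", "email": "ann@example.com"},
    "row-2": {"name": "Raj", "city": "Leeds", "phone": "555-0100"},
}


def columns_in_family(family):
    """Return every column name used by at least one row, sorted."""
    return sorted({column for row in family.values() for column in row})


def get_column(family, row_key, column, default=None):
    """Read one column; queries must cope with the column being absent."""
    return family.get(row_key, {}).get(column, default)


# A "schema change" is just writing a new column to one row.
# No other row is touched and nothing has to be migrated.
customers["row-1"]["twitter"] = "@ann"

print(columns_in_family(customers))
print(get_column(customers, "row-2", "twitter", default="(not set)"))
```

Adding the `twitter` column costs nothing up front, but every read now has to handle rows that lack it, which is the slightly higher query cost mentioned in point 2.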


NoSQL Essentials – Part 3 – Document databases

Document Databases

Document databases are arguably the most talked about NoSQL database type at the time of writing. They’re easy to understand and really easy to work with. They offer nearly all the benefits of a key-value database, plus a lot more. Some of the most popular document databases currently include MongoDB, CouchDB, Terrastore, OrientDB and RavenDB. In this article I am going to focus on MongoDB.

Some people love Mongo and some people hate it. I’ve only played with it, but given it’s the most widely used NoSQL database and the most valued by employers, it’s certainly worth knowing a little about.
