NoSQL Essentials – Part 4 – Column-family databases

Column-Family Databases

Column-family databases aren’t a million miles different from document databases. They hold most of the same pros and cons. There is however one significant difference, column-family stores use peer-to-peer replication, where document databases, use master-slave or primary-secondary replication.

Just like all NoSQL database types, you have a lot of databases choices. Common column-family data stores include HBase, Cassandra, Amazon SimpleDB and Hypertable. HBase was built by Apache and is debatably the most popular column-family store. However recently Cassandra has been getting a lot of attention, so that is what I will be covering.

Differences from relational databases.

  1. In column-family databases, each row consist of a collection of columns=>value pairs. A collection of similar rows then makes up a column family. In a relational databases, this would be equivalent to a collection of rows making up a table. The main difference is that in a column-family database, rows do not have to contain the same columns.
  2. RDBMS impose high cost of schema change for low cost of query change. Column families impose little to no cost in schema change, for slightly more cost in query change.

Continue reading

NoSQL Essentials – Part 1 – Overview

NoSQL Databases

NoSQL was first used in the 1990’s as the name of a file based database, which used SQL, however this is not what we class as NoSQL today. NoSQL got coined again in 2009 by a new wave of databases, aiming to improve on the short comings of relational databases. In the last 2 years, NoSQL has started gaining huge interest throughout the compsci community. Here I hope to teach the core elements of NoSQL, as I know them.

Before I make a fool of myself, I would like to point out, I am far from being experienced with NoSQL databases. The below is almost a scrapbook, for my thoughts on what NoSQL is, why it’s useful, how to use it and things to be aware of. The majority of my knowledge thus far, has come from reading NoSQL Distilled by Pramod J. Sadalage and Martin Fowler. It is a great book and if you find this article useful and interesting, I would seriously recommend the investment.

Continue reading