NoSQL Essentials – Part 1 – Overview

NoSQL Databases

NoSQL was first used in the 1990s as the name of a file-based database which used SQL; however, this is not what we class as NoSQL today. The term was coined again in 2009 by a new wave of databases aiming to improve on the shortcomings of relational databases. In the last two years, NoSQL has started gaining huge interest throughout the computer science community. Here I hope to teach the core elements of NoSQL, as I know them.

Before I make a fool of myself, I would like to point out that I am far from experienced with NoSQL databases. The below is almost a scrapbook for my thoughts on what NoSQL is, why it's useful, how to use it and things to be aware of. The majority of my knowledge thus far has come from reading NoSQL Distilled by Pramod J. Sadalage and Martin Fowler. It is a great book, and if you find this article useful and interesting, I would seriously recommend the investment.

A gentle introduction

NoSQL is the name given to a collection of databases that share a set of common characteristics. The most important of these is the ability to run on a cluster, allowing them to scale out where a typical relational database would crumble to its knees. There are a huge number of NoSQL databases out there, each with its own quirks, advantages and disadvantages. In this article, however, instead of focusing on any database in particular, I'm going to focus on the aspects NoSQL databases have in common.

I don't know if this applies to any of you, but I find documents with paragraphs of text quite overwhelming. They force a lot of conjunctive words, repetition and waffling. I have therefore chosen to write this article in bullet points. Hopefully this will keep everything clear and comprehensive.

NoSQL glossary

You would normally expect a glossary at the end of a document or book, but here it makes sense for you to know up front about the buzzwords I'll be using.

  1. Node – This is a server on your cluster that is running a NoSQL database instance.
  2. Cluster – This is a collection of nodes, containing all the data in your database.
  3. Aggregate – A collection of entities (users, orders, addresses) that are stored as a unit inside a node.
  4. Aggregate-oriented – A database which makes use of aggregates.
  5. Aggregate-ignorant – A database which does not use aggregates, e.g. graph databases, relational databases.
  6. Durability – This is how resilient your database is to losing changes. Your database could be classed as lacking durability if data is stored temporarily in memory or if there is a low write quorum; either way, there is a chance data can be lost.
  7. Write quorum – This is how many nodes must be updated when data is written. The higher the write quorum, the more consistent your database will be; the side effect is that latency will also be higher.
  8. Read quorum – This is how many nodes should be read when running a query. The higher the read quorum, the more up to date the data you read will be. Below, in the consistency section, I explain how to optimise your read and write quorums for your application requirements (the sketch after this glossary shows the basic rule).
  9. CAP
    1. Consistency – How consistent the data is between nodes, i.e. whether changes have been written to all replicas.
    2. Availability – A node counts as available if it is online and returns a valid response; failed nodes do not count against availability. Availability therefore refers to how available your cluster is to read and write queries.
    3. Partition Tolerance – How resilient the cluster is to communication breakages between nodes.
    4. It is commonly said that a NoSQL database can only prioritise two of the three CAP properties. This is not strictly true. In reality, any system that uses partitions (multiple nodes) has a trade-off between consistency and availability. E.g. with two nodes and read and write quorums of 1, if the link between them is broken, both nodes remain available but become inconsistent. With read and write quorums of 2 and a broken link, both nodes remain consistent (because neither can be updated) but also become unavailable.
  10. Materialised views – These are cached query results that can be updated incrementally or in batches. They are useful for data which doesn't need to be 100% up to date and are a great way to boost performance. Updating views as new data is received is called the eager approach and keeps the data up to date; updating them as part of a batch process is called the application database approach. Materialised views are created by running queries in your application and storing the results as new aggregates.
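
To make the quorum trade-off above concrete, here is a minimal sketch (plain Python, with illustrative numbers) of the rule that links read and write quorums to consistency.

```python
# Minimal sketch of the quorum rule: N = number of replicas,
# W = write quorum, R = read quorum. If R + W > N, every read
# overlaps at least one replica that saw the latest write.

def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """True if the quorums guarantee reads see the latest write."""
    return r + w > n

# Three replicas, writes acknowledged by 2 nodes, reads ask 2 nodes:
print(is_strongly_consistent(n=3, w=2, r=2))  # True  -> consistent, slower
# Same cluster with single-node reads and writes:
print(is_strongly_consistent(n=3, w=1, r=1))  # False -> fast, possibly stale
```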

Benefits of using NoSQL

  1. Quicker development times, due to aggregates and not having to worry about a database schema.
  2. Easy to scale, whereas relational databases are meant for a single server.
  3. Prevents having to create tables with generic columns like custom1, custom2, etc. This is due to NoSQL databases being schema-less.
  4. Prevents having tables with large numbers of NULL values (sparse tables).

Drawbacks of using NoSQL

  1. Schema-less is perfect in some situations; however, if a database is shared between applications, there is no way of stopping one application from storing inconsistent data. There are workarounds to this:
    1. Encapsulate all database interactions within a single application and integrate this with your other applications using web services.
    2. Use different column families for different applications in a column-family database, or different sections for each application in a document database (more on these below).
  2. It's also hard to determine what can be stored in an aggregate without diving into the application. It's always a good idea to keep the implicit schema in your application documentation.

NoSQL concepts

  1. De-normalisation of your database can significantly improve read performance. It's also a useful way of storing real-time data. Obviously this has the clear drawback of data redundancy.
  2. NoSQL is said to follow the BASE properties (Basically Available, Soft state, Eventual consistency) instead of the ACID properties (Atomic, Consistent, Isolated, Durable) of relational databases. In practice, this means transactions are generally not possible.
  3. NoSQL databases typically have a trade-off between consistency and latency, and another between durability and latency. This is because improving consistency or durability means increasing the read and write quorums, which inevitably increases latency.

Database types

Key-Value databases

  1. Aggregate-oriented.
  2. Does not conform to ACID transactions.
  3. The entire aggregate must be returned as part of a read query.
  4. Data can only be queried on its key (similar to a primary key).
  5. Schema-less.
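
As a concrete illustration, here is a minimal key-value sketch using the redis-py client. It assumes a Redis server on localhost; the key name and document shape are purely illustrative.

```python
# Key-value store: the whole aggregate lives under one key.
import json
import redis

r = redis.Redis(host="localhost", port=6379)

# Store the aggregate as an opaque blob under its key...
user = {"name": "Alice", "orders": [{"id": 1, "total": 9.99}]}
r.set("user:1001", json.dumps(user))

# ...and query it only by that key. The entire unit comes back;
# there is no way to ask the store for just user["orders"].
print(json.loads(r.get("user:1001")))
```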

Document databases

  1. Aggregate-oriented.
  2. Does not conform to ACID transactions.
  3. Queried data can return an entire aggregate or just part of an aggregate.
  4. Any data in the aggregate can be used when querying.
  5. Schema-less.
  6. Implicit foreign keys can be used to link multiple aggregates together. This is useful if data isn't normally accessed together and so shouldn't be stored as a unit. That said, if you need many such links, you might be better off using a relational database.
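
By contrast, here is a minimal document-database sketch using PyMongo. It assumes MongoDB on localhost; the database, collection and field names are illustrative.

```python
# Document store: aggregates are queryable by any field, in whole or part.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
users = client.shop.users

# Store the user and their orders as a single aggregate.
users.insert_one({"_id": 1001, "name": "Alice",
                  "orders": [{"id": 1, "total": 9.99}]})

# Unlike a key-value store, any field can be queried...
doc = users.find_one({"name": "Alice"})

# ...and a projection can return just part of the aggregate.
orders_only = users.find_one({"name": "Alice"}, {"orders": 1})
print(orders_only)
```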

Column-Family databases

  1. Aggregate-oriented.
  2. Does not conform to ACID transactions.
  3. Queried data can return an entire aggregate or just part of an aggregate.
  4. Any data in the aggregate can be used when querying.
  5. Each column needs to be placed into a column family. Columns which are accessed together should go into the same column family.
  6. Row-oriented – Each row is an aggregate that contains columns, with the row's data grouped into column families.
  7. Column-oriented – Similar to row-oriented, except each column family defines a record type, so a row's data can appear across multiple column families.
  8. Schema-less.
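
A plain-Python sketch of the structure described above (no real client; the layout and names are purely conceptual):

```python
# Each row key maps to column families; each family groups the
# columns that are typically read together.
rows = {
    "user:1001": {
        "profile": {"name": "Alice", "email": "alice@example.com"},
        "orders": {"order:1": "9.99", "order:2": "24.50"},
    }
}

# Reading one family avoids touching columns the query doesn't need:
print(rows["user:1001"]["profile"])
```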

Graph databases

  1. Aggregate-Ignorant.
  2. Each entity stores a very small amount of data, called properties.
  3. Supports very complicated joins between entities; each join can also contain properties.
  4. Entities are joined to each other at insert time instead of at read time. This slows down write performance slightly, but can significantly improve read performance.
  5. Not meant to be run on a cluster, although this may become a possibility in the future.
  6. Schema-less.
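
Again as a conceptual plain-Python sketch (no real graph database; the names are illustrative): entities carry small property maps, and the joins between them carry properties too.

```python
# Nodes hold small property maps.
nodes = {
    "alice": {"label": "Person", "age": 30},
    "acme": {"label": "Company"},
}

# Joins are materialised at insert time: (from, relationship, to, properties).
edges = [("alice", "WORKS_FOR", "acme", {"since": 2019})]

# A read-time "join" is just a walk over the pre-built edges:
for src, rel, dst, props in edges:
    if src == "alice" and rel == "WORKS_FOR":
        print(dst, props)  # acme {'since': 2019}
```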

Distribution models

Single server

  1. There is one server, no sharding and no replication.
  2. This is the most basic kind of distribution and should be used wherever possible.
  3. Advantages
    1. It's simple to set up
    2. It’s cheap
  4. Drawbacks
    1. There is a single point of failure for the entire system.
    2. Growing data sets will take longer and longer to read.

Sharding

  1. This involves spreading your aggregates over multiple nodes. Your data is then quicker to read, because there is less data per shard.
  2. Which aggregate goes into which shard can be decided by any aggregate data, e.g. the first letter of a customer's name, the year or month, the physical location of the node, or the user. It all depends on the application requirements, the goal being to use as few shards as possible per read (a routing sketch follows this list).
  3. Data should be equally distributed between nodes. Most NoSQL databases perform this load balancing natively.
  4. Advantages
    1. If one node goes down, the application can still run. However, data on the downed node will be unavailable unless it is replicated onto another node (detailed below).
    2. Most NoSQL databases provide auto-sharding, meaning data is automatically allocated to a shard and the correct shard is chosen when reading.
    3. Sharding can massively improve read performance and slightly improve write performance.
    4. Sharding with replication can improve read speeds even more, but it doesn't do anything for write speeds, except in column-family databases, where secondary nodes can accept writes.
  5. Drawbacks
    1. Requires the cost of maintaining multiple servers.
  6. Things to be aware of
    1. Sharding provides little resilience without replication. When using a cluster, individual nodes are typically less reliable, meaning data could be lost temporarily quite frequently.
    2. Sharding can be tricky when left too late. It is essential to start sharding early; trying to shard a system that is already under high load can cause a complete outage.
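
Here is a minimal sketch of hash-based shard routing, as hinted at in point 2 above. The node names and shard count are illustrative; the point is that the same deterministic function is used on both write and read, so the correct shard is always chosen.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]

def shard_for(key: str) -> str:
    """Map an aggregate key to a node deterministically."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

print(shard_for("user:1001"))  # always routes to the same node
```

Note that simple modulo routing reshuffles most keys when nodes are added, which is one more reason to plan sharding early; real systems often use consistent hashing instead.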

Master-Slave replication

  1. Requires a single master node, which can be set manually or automatically. When set automatically, a new master node will be appointed if the initial master goes down.
  2. Data is always written to the master node and then replicated to the slaves.
  3. Separate read and write paths should be used, with different connections for each. That way, if writes go down (the master node), users can still read (a sketch of this split follows this list).
  4. Advantages
    1. Enhances read speeds due to there being multiple nodes that can be read from.
    2. It doesn't do much for write speeds, although there may be a slight improvement due to reduced read traffic.
  5. Drawbacks
    1. If data is written to the master and the master goes down before replication completes, that data will be lost.
    2. Data may be written to the master and then read from a slave before the changes have propagated, making it appear as if the changes haven't worked. This can be countered using read-your-writes consistency (described below).
    3. The master node is a single point of failure for writes.
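
A minimal sketch of the separate read/write paths from point 3 above (the connections are stand-in dicts and replication itself is elided; only the routing idea matters):

```python
class ReplicatedStore:
    """Routes writes to the master and reads to the slaves."""

    def __init__(self, master, slaves):
        self.master = master  # connection to the master node
        self.slaves = slaves  # connections to the slave nodes
        self._next = 0

    def write(self, key, value):
        # All writes go through the master, then replicate outwards.
        self.master[key] = value

    def read(self, key):
        # Reads round-robin across the slaves; if the master is down,
        # users can still read (possibly slightly stale) data.
        slave = self.slaves[self._next % len(self.slaves)]
        self._next += 1
        return slave.get(key)
```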

Peer-To-Peer replication

  1. Similar to Master-Slave replication, except there is no master node.
  2. All nodes can accept read queries and sometimes write queries.
  3. Advantages
    1. Replica nodes do not have to completely mirror other nodes; instead they can mirror shards from multiple nodes if required. E.g. node 1 could contain shards 1 & 2, node 2 shards 1 & 3, and node 3 shards 2 & 3.
  4. Drawbacks
    1. Possible consistency problems if there are multiple requests to write at the same time.
    2. Write conflicts can be handled using version stamping. It's a trade-off between consistency and availability (more on this below).

Combining sharding and replication

  1. Master-Slave replication & sharding
    1. Multiple master nodes but each data item only has one master.
  2. Peer-to-Peer & sharding
    1. Common in column family databases.
    2. Common with thousands of nodes. A good starting point is a replication factor of 3.

Consistency

This is the process of keeping all of your nodes in sync, regardless of whether you're using master-slave or peer-to-peer replication.

Update Consistency

  1. Problems
    1. Multiple users try to update the same record at the same time. This is called a write-write conflict.
    2. Without any concurrency control, the second write would overwrite the first, even on a single-server database.
    3. There are two approaches to maintaining consistency when concurrency hits:
      1. Pessimistic Approach – Prevents conflicts
        1. e.g. using write locks, which prevent the second write from happening without first seeing the changes from the first write. This is great in theory, but it can make your application unresponsive and can create deadlocks.
      2. Optimistic Approach – Lets conflicts occur, but takes action to sort them out
        1. e.g. the client must check whether the value has changed before doing the write (a conditional-update sketch follows this section).
  2. Warnings
    1. In peer-to-peer replication, writes may occur on different nodes, with neither node realising there is another write.
  3. Solutions
    1. All conflicting changes can be saved and presented to a user for resolution. This, however, is the responsibility of the client.
    2. Instead of asking a user to resolve a conflict, you may be able to write domain specific code to automatically resolve some conflicts.
    3. The read and write quorums can be increased to match your application requirements.
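
To make the optimistic approach concrete, here is a minimal conditional-update sketch (an in-memory store; many databases provide this check natively, as mentioned under version stamping below):

```python
class VersionConflict(Exception):
    pass

# Each record carries a version stamp alongside its value.
store = {"user:1001": {"value": {"name": "Alice"}, "version": 1}}

def conditional_update(key, new_value, expected_version):
    """Write only if nobody has written since we last read."""
    record = store[key]
    if record["version"] != expected_version:
        raise VersionConflict(
            f"expected v{expected_version}, found v{record['version']}")
    record["value"] = new_value
    record["version"] += 1

# Read at version 1, write back against version 1 -> succeeds:
conditional_update("user:1001", {"name": "Alice B."}, expected_version=1)
# A second writer still holding version 1 now raises VersionConflict
# instead of silently overwriting the first change.
```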

Read consistency

  1. Problems
    1. A user makes a change which requires multiple writes. In the middle of those writes, another user does a read, giving them inconsistent data.
    2. Replication inconsistency – Two people can read from different nodes and get different results; this happens when changes have not fully propagated.
  2. Solutions
    1. Logical consistency – Ensuring that all writes are completed before a read is possible. This is done using transactions. Aggregate-oriented databases do not support transactions; however, graph databases typically support ACID transactions, just like relational databases.
    2. In an aggregate-oriented database, if a single aggregate can be used, only one write is required, so there is no read-write conflict.
    3. Read-your-writes consistency – This prevents writing data and then reading from another node which doesn't have the data yet. It is done using a sticky session (a session that is tied to one node). The downside is that it reduces the load balancer's ability to do its job (a sketch follows this list).
    4. Maintaining a sticky session with master-slave replication can be tricky, because writes go to the master and reads can come from slaves. Fortunately, it is possible to send writes to a slave, which can then forward them on to the master. It is also possible to switch reads from a slave to the master until the slave catches up, then switch back to the slave.
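
A minimal sketch of read-your-writes consistency via a sticky session (the nodes are stand-in dicts and replication is elided; real systems pin the session at the load balancer):

```python
import random

nodes = [{}, {}, {}]  # three replicas; replication is not modelled

class Session:
    def __init__(self):
        self.pinned = None  # the node this session is stuck to

    def write(self, key, value):
        # Pin the session to whichever node took the write...
        self.pinned = random.choice(nodes)
        self.pinned[key] = value

    def read(self, key):
        # ...and keep reading from that node, so our own writes are
        # always visible, even before they propagate elsewhere.
        node = self.pinned if self.pinned is not None else random.choice(nodes)
        return node.get(key)

s = Session()
s.write("user:1001", {"name": "Alice"})
print(s.read("user:1001"))  # guaranteed to see the write we just made
```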

Relaxing durability

This is the process of reducing the durability of the database to improve performance.

  1. Creating a data store in memory for heavily read but non-essential data, e.g. user sessions. Data can be read from memory and written to disk periodically. If the server goes down before the data is written to disk, it will be lost.
  2. Auto-failover of a master node is not always a good idea. Here’s why:
    1. Master node contains updates that are not replicated to the slaves yet.
    2. Master node goes down and one of the slaves is appointed as master.
    3. Changes happen in the new master too.
    4. New master comes back online.
    5. Conflict Hell.

Version stamping

This is the process of versioning your aggregates after changes to help prevent read and write inconsistencies. Column-family databases do this automatically.

  1. A version stamp could be a version number that is incremented with each update, a timestamp, a GUID, a hash of the contents, or a mix of these, plus any other possibilities you can think of.
  2. The benefit of using an incremental version number or a timestamp is that you can very easily see which revision is the latest.
  3. Optimistic offline lock – This is a conditional update, meaning that just before data is written, there is a final check against the version stamp to see if anything has changed (exactly the check sketched in the update-consistency section above). Some databases provide this functionality natively, so updating stale data isn't possible.
  4. Distributions
    1. Master-Slave distributions – As usual, maintaining a Master-Slave distribution is much easier, because writes can only happen on the master. The recommended version stamp is an incremental version number: just before a write is done, check whether the version number has increased since the last read. If you are reading from a slave, you also need to make sure there are no changes on the master node (as they may not have propagated yet).
    2. Peer-to-Peer distributions – Here a technique called vector stamping should be used. In this approach, each node contains its own version counter, plus the version counter of every other node in the shard/cluster. When nodes communicate they can sync their version stamps; either the latest revision is chosen, or there is a conflict which must be resolved manually or automatically (a comparison sketch follows this list).
      1. Scenario: say we have two nodes with the code names Earth and Mars, each containing a version of a user's data. If the version stamps are identical between nodes, there has not been a change.
      2. Example 1: Earth contains [earth: 3, mars: 4], Mars contains [earth: 2, mars: 4]. Because the version stamps aren't identical, there has been a change. Only Earth has been incremented, which means the changes on Earth should be synced to Mars.
      3. Example 2: Earth contains [earth: 3, mars: 4], Mars contains [earth: 2, mars: 5]. Earth and Mars have each been incremented, which means we have a write-write conflict that needs resolving.
    3. Peer-to-Peer distributions – Timestamps – Simply time-stamping the latest change on each node is another approach. However, there is one possible problem: system clocks can go out of sync, and in a busy application, milliseconds matter.
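
A minimal sketch of comparing vector stamps, replaying the Earth/Mars examples above (each stamp maps a node name to its counter):

```python
def compare(a: dict, b: dict) -> str:
    """Compare two vector stamps."""
    names = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in names)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in names)
    if a_ahead and b_ahead:
        return "conflict"      # both sides changed: write-write conflict
    if a_ahead:
        return "a is newer"    # sync a's changes to b
    if b_ahead:
        return "b is newer"    # sync b's changes to a
    return "identical"         # no change

# Example 1: only Earth moved forward -> safe to sync to Mars.
print(compare({"earth": 3, "mars": 4}, {"earth": 2, "mars": 4}))  # a is newer
# Example 2: both moved forward -> needs resolving.
print(compare({"earth": 3, "mars": 4}, {"earth": 2, "mars": 5}))  # conflict
```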

Map-Reduce

  1. Intro
    1. The name Map-Reduce originally referred to proprietary software built by Google. There are now tonnes of open source implementations, one of the most popular being Hadoop.
    2. Map-Reduce can be referred to as a pattern or a programming model. The name is inspired by the map and reduce operations in functional programming languages.
    3. Map-Reduce in NoSQL is a way to process large quantities of aggregates spread over a cluster, while still limiting network traffic.
    4. Map-Reduce operations can have a large number of steps; however, they can also have only two: quite simply, map and reduce.
    5. Map-Reduce can be implemented in any programming language. However, tailored tools like Apache Pig or Hive are often better choices.
  2. Map-Reduce operations
    1. The map operation reads aggregates and outputs key-value pairs derived from their data. The reduce operation is a function that takes multiple map outputs and combines their values (a word-count sketch follows this section).
    2. The map operation only operates on a single aggregate at a time. The reduce function can only operate on maps that were created with the same key.
    3. The map operation can use composite keys (multi-part keys), which also allows the reduce functions to operate on multiple-field keys.
  3. All outputs from the map tasks for a given key are sent to a reduce function on a single node. This involves a lot of network traffic, which can be optimised with combiner functions and parallelism.
    1. Combiner functions – These cut down the mapper data by aggregating values with the same key. They are essentially reducer functions, but they run on the same node as the mapper, minimising network traffic. A combiner function's output must have the same data structure as its input; a reducer's output need not.
    2. Combiner functions can also begin running before the map process has finished, which adds extra flexibility.
    3. Parallelism – This is the process of partitioning the mapper outputs by key (this partitioning is called shuffling), then using multiple reduce functions to process different map keys. The final results are then merged together.
  4. Piping Map-Reduce tasks
    1. Map-Reduce operations can be composed into pipelines: the output of one reduce can be the input to another map operation.
    2. Sometimes it is useful to break a Map-Reduce job down into multiple steps. It makes the process far easier to understand, and it gives you access to intermediate states of the data, which you can save for later. Later uses could be re-running the same operation, or using it as a starting point for different operations.
    3. The early stages are often the slowest, and so are the most valuable to save.
    4. Saving data at the end of a single Map-Reduce operation should be done using materialised views, which are covered above.
    5. By using materialised views, it is also possible to do incremental updates on your data.
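
To tie the pieces together, here is a minimal word-count sketch of the two basic operations in plain Python (no framework; the grouping step stands in for the network shuffle):

```python
from collections import defaultdict

def map_fn(document: str):
    """Map: read one 'aggregate' and emit key-value pairs."""
    for word in document.split():
        yield (word.lower(), 1)

def reduce_fn(key, values):
    """Reduce: combine all the values emitted for one key."""
    return key, sum(values)

documents = ["NoSQL scales out", "NoSQL runs on a cluster"]

# Shuffle: group mapper output by key (done across the network for real).
grouped = defaultdict(list)
for doc in documents:
    for key, value in map_fn(doc):
        grouped[key].append(value)

# Each key can be reduced independently; this is what parallelism partitions.
print(dict(reduce_fn(k, v) for k, v in grouped.items()))
# {'nosql': 2, 'scales': 1, 'out': 1, 'runs': 1, 'on': 1, 'a': 1, 'cluster': 1}
```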

Conclusion

That pretty much sums up my very rough and sketchy overview of what NoSQL has to offer. Apologies if I got anything wrong; I am still learning myself. As I learn more about NoSQL and its different database types, I hope to post more articles. Stay tuned.

Thanks for reading.
