上QQ阅读APP看书，第一时间看更新

Graph databases

Last but not least, and of course the subject of most of this book, are the graph-oriented databases. They are often also categorized in the NoSQL category, but as you will see later, they are inherently very different. This is not the case in the least, because the task-orientation that graph databases are aiming to resolve has everything to do with graphs and graph theory. A graph database such as Neo4j aims to provide its users with a better way to manage the complexity of the dense network of the data structure at hand.

Implementations of this model are not limited to Neo4j, of course. Other closed and open source solutions such as AllegroGraph, Dex, FlockDB, InfiniteGraph, OrientDB, and Sones are examples of implementations at various maturity levels.

So, now that we understand the different types of NoSQL databases, it would probably be useful to provide some general classification of this broad category of database management systems, in terms of their key characteristics. In order to do that, I am going to use a mental framework that I owe to Martin Fowler (from his book NoSQL Distilled) and Alistair Jones (in one of his many great talks on this subject). The reason for doing so is that both of these gentlemen and share my viewpoint that NoSQL essentially falls into two categories, on two sides of the relational crossroads:

On one side of the crossroads are the aggregate stores. These are the Key-value, Column-family, and Document-oriented databases, as they all share a number of characteristics:
- They all have a fundamental data model that is based around a single, rich structure of closely-related data. In the field of software engineering called domain-driven design, professionals often refer to this as an aggregate, hence the reference to the fact that these NoSQL databases are all aggregate-oriented database management systems.
- They are clearly optimized for use cases in which the read patterns align closely with the write patterns. What you read is what you have written. Any other use case, where you would potentially like to combine different types of data that you had previously written in separate key-value pairs / documents / rows, would require some kind of application-level processing, possibly in batch if at some serious scale.
- They all give up one or more characteristics of the relational database model in order to benefit it in other places. Different implementations of the aggregate stores will allow you to relax your consistency/transactional requirements and will give you the benefit of enhanced (and sometimes, massive) scalability. This, obviously, is no small thing if your problem is around scale, of course. Use the following signs:

Relational crossroads, courtesy of Alistair Jones

On the other side of the crossroads are the graph databases, such as Neo4j. One could argue that graph databases actually take relational databases as follows:
- One step further, by enhancing the data model with a more granular, more expressive method to store data, thereby allowing much more complex questions to be asked of the database, and effectively, as we will see later, demining the join bomb
- Back to its roots, by reusing some of the original ideas of navigational databases, but of course learning from the lessons of the relational database world by reducing complexity and facilitating easy querying capabilities

With this introduction and classification behind us, we are now ready to take a closer look at graph databases.