Mitch Pronschinske is the Lead Research Analyst at DZone. Researching and compiling content for DZone's research guides is his primary job. He likes to make his own ringtones, watches cartoons/anime, enjoys card and board games, and plays the accordion. Mitch is a DZone Zone Leader and has posted 2576 posts at DZone.

Open Source NoSQL Databases

02.23.2010
For almost a year now, the idea of "NoSQL" has been spreading due to the demand for relational database alternatives. Perhaps the biggest motivation behind NoSQL is scalability. Relational databases don't lend themselves well to the kind of horizontal scalability that's required for large-scale social networking or cloud applications, and ORMs can abstract away the impedance mismatch only so much. In other cases, companies just don't need many of the complex features and rigid schemas provided by relational databases. Most people are not suggesting that we all ditch the RDBMS; in fact, many companies don't really need to switch. Relational databases will probably remain necessary for many applications for years to come. In essence, NoSQL is a movement that aims to reexamine the way we structure data and to draw attention to innovation, in hopes of finding the solution to the next generation's data persistence problems.

Here are some of the better known open source data stores/models labeled as "NoSQL":

CouchDB - Document Store

  • Maps keys to data
  • It provides a RESTful JSON API and is written in Erlang
  • You can upload functions to index data and then you can call those functions
  • Has a very simple REST interface
  • Provides an innovative replication strategy - nodes can reconnect, sync, and reconcile differences after being disconnected for long periods of time  
  • Enables new distributed types of applications and data
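
As a rough illustration of the RESTful JSON API mentioned above, the sketch below builds (but does not send) the HTTP requests a client might issue to store and fetch a document. The host, the database name `articles`, and the document ID are assumptions made up for this example; 5984 is CouchDB's default port.

```python
import json
from urllib.request import Request

# Assumed local CouchDB address; 5984 is the default port.
BASE_URL = "http://localhost:5984"


def put_document(db, doc_id, doc):
    """Build a PUT request that would create or update a JSON document."""
    body = json.dumps(doc).encode("utf-8")
    return Request(
        f"{BASE_URL}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def get_document(db, doc_id):
    """Build a GET request that would fetch the document by its ID."""
    return Request(f"{BASE_URL}/{db}/{doc_id}", method="GET")


# Hypothetical database and document names.
put_req = put_document("articles", "nosql-overview", {"title": "Open Source NoSQL Databases"})
get_req = get_document("articles", "nosql-overview")
print(put_req.get_method(), put_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it to a running server; the sketch stops short of that so it runs without one.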

MongoDB - Document Store

  • Free-form key-value-like data store with good performance
  • Powerful, expansive query model
  • Usability rivals that of Redis
  • Good for complex data storage needs.
  • Production-quality sharding capabilities
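
To give a feel for the document-style query model, here is a toy matcher that evaluates a query document against plain Python dicts. It is not MongoDB's driver or engine: the `matches` function and the sample data are hypothetical, and only equality plus the `$gt`, `$lt`, and `$in` operators are handled. Unlike the real system, it treats every nested dict as a set of operators.

```python
# Tiny subset of comparison operators, keyed by their MongoDB-style names.
OPS = {
    "$gt": lambda value, operand: value is not None and value > operand,
    "$lt": lambda value, operand: value is not None and value < operand,
    "$in": lambda value, operand: value in operand,
}


def matches(doc, query):
    """Return True if doc satisfies every condition in the query document."""
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            # Operator form, e.g. {"views": {"$gt": 50}}.
            if not all(OPS[op](value, operand) for op, operand in condition.items()):
                return False
        elif value != condition:
            # Plain equality form, e.g. {"author": "mitch"}.
            return False
    return True


# Hypothetical sample documents.
posts = [
    {"author": "mitch", "views": 120},
    {"author": "nati", "views": 40},
    {"author": "mitch", "views": 15},
]

popular_by_mitch = [p for p in posts if matches(p, {"author": "mitch", "views": {"$gt": 50}})]
print(popular_by_mitch)
```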

Neo4j - GraphDB

  • Disk-based
  • Has a restricted, single-threaded model for graph traversal
  • Has optional layers to expose Neo4j as an RDF store
  • Can handle graphs of several billion nodes, relationships, or properties on a single machine
  • Released under a dual license - free for non-commercial use 

Apache HBase - Wide Column Store/Column Families

  • Built on top of Hadoop, which has functionality similar to Google's GFS and MapReduce systems
  • Hadoop's HDFS provides a mechanism that reliably stores and organizes large amounts of data
  • Random access performance is on par with MySQL
  • Has a high performance Thrift gateway
  • Cascading source and sink modules

Redis - Key Value/Tuple Store

  • Provides a rich API and does more operations in memory, using disk only periodically.
  • It's extremely fast
  • Lets you append a value to the end of a list of items that's already been stored on a key.
  • Has atomic operations, making it a best-of-breed tally server.
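
The list-append and atomic-counter behaviour described above can be sketched with a toy in-memory store. This is a stand-in, not Redis or its client library: the `TinyStore` class is hypothetical, and its `rpush` and `incr` methods only mirror the semantics of the Redis commands RPUSH and INCR, using a lock to make each operation atomic.

```python
import threading


class TinyStore:
    """Toy in-memory stand-in for a key-value server (not Redis itself)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()  # serializes operations so each one is atomic

    def rpush(self, key, value):
        """Append value to the list stored at key; return the new list length."""
        with self._lock:
            items = self._data.setdefault(key, [])
            items.append(value)
            return len(items)

    def incr(self, key):
        """Add 1 to the counter stored at key; return the new value."""
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1
            return self._data[key]

    def get(self, key):
        """Return the value stored at key, or None if the key is absent."""
        with self._lock:
            return self._data.get(key)


store = TinyStore()
store.rpush("recent:posts", "couchdb")
store.rpush("recent:posts", "redis")


def count_views():
    """Simulate one client tallying 1000 page views."""
    for _ in range(1000):
        store.incr("views:home")


# Eight concurrent clients; because incr is atomic, no increment is lost.
threads = [threading.Thread(target=count_views) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(store.get("recent:posts"), store.get("views:home"))
```

The point of the tally example is that the read-modify-write happens inside the store, so clients never need to fetch a counter, add one, and write it back themselves.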

Memcached - Key Value/Tuple Store

  • High-performance, distributed memory object caching
  • Free and open source
  • Generic and agnostic to the objects/strings it caches
  • It's all in-memory data
  • Simple yet elegant design enables easy development and deployment
  • Language neutral caching scheme.
  • Most of the large properties on the web are using it now, except for Microsoft
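
A minimal sketch of how an application might use a distributed cache of this kind: the client hashes each key to pick a server, checks that server first, and falls back to the slow data source on a miss. Everything here is an assumption for illustration: the server addresses are made up, plain dicts stand in for each server's memory, and the simple modulo scheme is one possible key-distribution strategy, not necessarily what any particular client library does.

```python
import hashlib

# Hypothetical server addresses; plain dicts stand in for each server's memory.
SERVERS = ["cache1:11211", "cache2:11211", "cache3:11211"]
memory = {server: {} for server in SERVERS}


def server_for(key):
    """Pick a server by hashing the key (simple modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]


def slow_lookup(key):
    """Placeholder for an expensive database query."""
    return f"value-for-{key}"


def cached_get(key):
    """Cache-aside read: try the cache first, then fall back to the slow source."""
    cache = memory[server_for(key)]
    if key in cache:
        return cache[key], "hit"
    value = slow_lookup(key)
    cache[key] = value  # populate the cache for the next reader
    return value, "miss"


first = cached_get("user:42")
second = cached_get("user:42")
print(first, second)
```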

Project Voldemort - Eventually Consistent Key Value Store

  • Used by LinkedIn
  • Handles server failure transparently
  • Pluggable serialization supports rich keys and values including lists and tuples with named fields
  • Supports common serialization frameworks including Protocol Buffers, Thrift, and Java Serialization
  • Data items are versioned
  • Supports pluggable data placement strategies
  • Memory caching and the storage system are combined
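
The list above says only that data items are versioned; it does not describe the mechanism. One common scheme in eventually consistent stores is the vector clock, sketched below as an illustration rather than as Voldemort's actual implementation. The `compare` and `increment` helpers and the node names are hypothetical.

```python
def increment(clock, node_id):
    """Return a copy of the clock with node_id's counter advanced by one."""
    new_clock = dict(clock)
    new_clock[node_id] = new_clock.get(node_id, 0) + 1
    return new_clock


def compare(clock_a, clock_b):
    """Compare two vector clocks given as {node_id: counter} dicts.

    Returns "before", "after", "equal", or "concurrent" (a conflict that
    the application or the store has to reconcile).
    """
    nodes = set(clock_a) | set(clock_b)
    a_ahead = any(clock_a.get(n, 0) > clock_b.get(n, 0) for n in nodes)
    b_ahead = any(clock_b.get(n, 0) > clock_a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


v1 = increment({}, "node-a")   # {"node-a": 1}
v2 = increment(v1, "node-a")   # {"node-a": 2}
v3 = increment(v1, "node-b")   # {"node-a": 1, "node-b": 1}

print(compare(v1, v2))
print(compare(v2, v3))
```

Here `v2` and `v3` both descend from `v1` but were written on different nodes, so neither supersedes the other and the comparison reports a conflict.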

Tokyo Cabinet and Tokyo Tyrant - Key Value/Tuple Store

  • Supports hashtable mode, b-tree mode, and table mode
  • It's fast and straightforward
  • Good for small to medium-sized amounts of data that require rapid updating and can be easily modeled in terms of keys and values

Cassandra - Wide Column Store/Column Families

  • First developed by Facebook
  • SuperColumns can turn a simple key-value architecture into an architecture that handles sorted lists, based on an index specified by the user.
  • Can scale from one node to several thousand nodes clustered in different data centers.
  • Can be tuned for more consistency or availability
  • Smooth node replacement if one goes down
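
The consistency/availability tuning mentioned above is often explained with simple replica arithmetic: with N replicas, a write acknowledged by W of them and a read that consults R of them are guaranteed to overlap on at least one replica when R + W > N. The sketch below just encodes that rule; the function names are hypothetical and the replication factor of 3 is an assumed example, not a recommendation.

```python
def quorum(n):
    """Smallest majority of n replicas."""
    return n // 2 + 1


def replicas_overlap(n, w, r):
    """True when every read set must share at least one replica with every write set."""
    return r + w > n


N = 3  # assumed replication factor for this example

# Favouring availability and latency: one replica for writes, one for reads.
fast = replicas_overlap(N, w=1, r=1)

# Favouring consistency: a majority for both writes and reads.
safe = replicas_overlap(N, w=quorum(N), r=quorum(N))

print(quorum(N), fast, safe)
```

With the smaller settings a read may land on a replica that has not yet seen the latest write, which is the trade made in exchange for still answering when some replicas are unreachable.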
____

Some other well-known NoSQL-style data stores that are closed source include Google BigTable and Amazon SimpleDB. GigaSpaces is a popular space-based grid solution that has NoSQL qualities.

Check out this informative post on NoSQL patterns.

Comments

Michele Mauro replied on Tue, 2010/02/23 - 10:31am

Why does nobody ever remember Berkeley DB? Even if Oracle acquired Sleepycat years ago, the code is still GPL. It's a fast and respected key/value store that's almost old enough to drink...

Michele Mauro

Yehuda Hofri replied on Tue, 2010/02/23 - 2:24pm

I'd mention GigaSpaces, memcached and skip the rest ;-)

Rob Tweed replied on Wed, 2010/02/24 - 7:14am

Also add GT.M, particularly now it's accessible via the M/Wire protocol

Nati Shalom replied on Thu, 2010/03/11 - 8:06pm

Mitchel, great coverage of the topic. Those interested in the common principles behind the various NoSQL alternatives may find the previous DZone interview useful as well: Common principles behind NOSQL

Mitch Pronschinske replied on Wed, 2010/03/31 - 11:35am in response to: Nati Shalom

Thank you, Nati.  I've read your NOSQL interview and it has a lot of great detail.

Johannes Ernst replied on Fri, 2010/04/02 - 4:42pm

InfoGrid is another open-source graph database. It has dynamic typing, and a choice of storage backends.

Mitch Pronschinske replied on Thu, 2010/04/08 - 9:10am in response to: Johannes Ernst

Glad to see more NoSQL on the Graph front!  I think they're set for an explosion as well.
