
Seth is the CTO at NuoDB. His main areas of focus are the administration, security, and resource management models, automation, and the tools that drive these pieces. Seth is a DZone MVB (not an employee of DZone) and has posted 42 posts at DZone.

On Being Cloud-Scale

07.24.2013

I did a webinar recently with Amazon. The focus was on what makes NuoDB different than other databases and why that means we’re so well-suited to run on AWS. I started off with a simple statement: NuoDB is a relational database designed to be cloud-scale.

Calling something “cloud-scale” is, umm, “fluffy.” Still, I like using this phrase for two reasons: it makes people think about which elements of scale matter, and it gives me the chance to frame what I believe modern systems need to provide. Even before I got to the next slide in my AWS presentation, I was seeing questions pop up in the chat asking me for a definition.

What does it mean to be cloud-scale?

To most people, I think the first thing cloud-scale means is supporting a scale-out (or “horizontal” scale) model. You need to run faster, or handle more throughput? Add more machines. This is the basic scaling approach of most cloud environments today, and is increasingly the way that we architect systems.

Being able to scale out, on its own, isn’t really enough. If you can’t do that on-demand then you can’t react to unexpected spikes or failures. This means that cloud-scale systems need to be extremely agile. Being agile means being self-aware, and being able to maintain availability. It’s also about being graceful in the face of failures (which, let’s face it, happen a lot in cloud environments).

Scale and agility are great, but as a system gets more complicated it must also become easier to work with or you won’t be able to exploit the promised benefits. That’s why cloud-scale systems need to be easy to work with and make it easy to solve problems. Ease of use is about the UI, obviously, but it’s also about enabling your developers to be productive. It means that common problems like data replication and automated management should “just work.”

I think there are also more tangible requirements to being cloud-scale: being secure out of the box, working across multiple data centers and having provisioning interfaces as first-class components, to me, are all needed to scale in a cloud environment. When we shipped our 1.0 product we phrased this as our 12 rules (a nod to Codd’s rules) that differentiate a Cloud Data Management System from a traditional RDBMS. Some of this is also reflected in NIST’s attempt to define a cloud.

Why don’t (most) traditional databases measure up?

Just as I think the primary cloud-scale requirement in most people’s minds is supporting a scale-out model, I think the primary limitation of traditional RDBMSs is an assumption of a scale-up (or “vertical” scale) model. Database architectures going back 20 or 30 years are designed around the relationship between the filesystem and in-memory structures, and around a local notion of ordering and lock management (or deadlock handling). These designs work reasonably well on large servers, but fall apart pretty quickly when you try to scale out.

Three common approaches that try to provide scale-out capability are sharding, caching and explicit master-slave replication. We’ve discussed some of the trade-offs and pitfalls of these approaches on the blog, so I won’t re-hash the technical details here.
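To make the sharding trade-off concrete, here is a minimal sketch of hash-based shard routing. The node names, shard count, and key format are all invented for illustration; real sharding layers are far more involved, but the core rigidity shows up even at this scale.

```python
import hashlib

# Hypothetical shard map: four database nodes, keys routed by hash.
SHARDS = ["db-node-0", "db-node-1", "db-node-2", "db-node-3"]

def shard_for(key: str) -> str:
    """Pick a shard deterministically by hashing the key."""
    digest = hashlib.md5(key.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(SHARDS)
    return SHARDS[index]

# Single-key operations are cheap: one lookup, one node.
print(shard_for("customer:42"))

# But a query spanning many keys may fan out to several shards,
# and changing len(SHARDS) silently remaps most keys.
keys = [f"customer:{i}" for i in range(100)]
touched = {shard_for(k) for k in keys}
print(f"query touches {len(touched)} shard(s)")
```

The application now has to know about the shard map, which is exactly the kind of complexity pushed onto developers and operators that the next paragraph describes.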

What I will add to those discussions is that these approaches all affect the ease of use and agility of the system. You get parallelism or higher availability, but at the cost of complexity and fragility. They also impose on your developers, your operators, or both: it gets harder to develop and deploy applications, and more costly to manage your systems.

As you try to scale these approaches across data centers the rules of ACID break down. Typically you violate Consistency in one of several ways involving the word “eventual.” This often has an effect on how strictly the Isolation levels are maintained. In the case of shards you end up either limiting what a query can do or having to ignore Atomicity guarantees.
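The “eventual” consistency mentioned above is easy to demonstrate with a toy model of asynchronous replication: writes are acknowledged by a primary immediately, while a background task copies them to a replica after a lag. Every name and delay here is invented for the sketch.

```python
import time
from threading import Thread

# Toy primary/replica pair with asynchronous (eventual) replication.
primary: dict = {}
replica: dict = {}

def replicate(delay: float = 0.05) -> None:
    """Copy the primary's state to the replica after a lag."""
    time.sleep(delay)
    replica.update(primary)

primary["balance"] = 100            # write acknowledged right away
Thread(target=replicate).start()    # replication happens later

stale = replica.get("balance")      # read the replica immediately: stale
time.sleep(0.2)                     # wait out the replication lag
fresh = replica.get("balance")      # now the replica has caught up

print(stale, fresh)
```

The window between the write and the replica catching up is where reads violate Consistency, and where the Isolation guarantees discussed above start to loosen.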

It’s not that these approaches are bad, per se. We need to solve problems on a daily basis, and we’ve gotten pretty clever about how to handle scale. These approaches were born out of necessity, however, and at a high cost to stability, agility and simplicity. What we think NuoDB does is give you cloud-scale in every sense of the term.

How does NuoDB address these requirements?

NuoDB is an elastic system. What that really means is that it’s designed to scale out on-demand, scale back when resources aren’t needed, and be flexible in where it runs and how databases are managed. This is at the heart of what our technical founder was thinking when he designed our architecture around a simple peer-to-peer process model.

A database is really just a collection of processes that are logically addressable as a single SQL service. You start by provisioning a host, and then you can ask that host to take on transactional or durability workloads. Our caching and transaction model means that you don’t shard, and our durability design means that you don’t build any explicit replication processes. The database is always active, and always consistent. There is no master or “special” peer, so failure can happen anywhere and the database keeps running.

When you want to scale out a database to a new host, there’s no special configuration needed: you simply start a process and expand your database. Likewise, when you want to contract (say you’ve finished some intense work and now there’s a lot less load on your database), it’s just a matter of shutting down any arbitrary processes from the running set. Taken together, migrating a live, running database just means starting new processes somewhere and then shutting down existing processes. This kind of agility allows for always-available live upgrades, data center migration and other enterprise-class high-availability guarantees.
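The expand-then-contract migration described above can be sketched as a set of interchangeable peers, any of which can serve a request. The class and method names below are invented for illustration; this is not NuoDB’s actual API, just the shape of a masterless peer model.

```python
import random

class PeerSet:
    """Toy model of a database as a set of interchangeable peer processes."""

    def __init__(self) -> None:
        self.peers: set = set()

    def scale_out(self, host: str) -> None:
        """Start a new process on a host; no special configuration."""
        self.peers.add(host)

    def scale_in(self, host: str) -> None:
        """Shut down an arbitrary process; the service keeps running."""
        self.peers.discard(host)

    def serve(self, query: str) -> str:
        """Any live peer can answer; availability needs only one peer."""
        if not self.peers:
            raise RuntimeError("database offline: no peers")
        return f"{random.choice(sorted(self.peers))} handled {query!r}"

# Live "migration": start processes on new hosts, then retire the old one.
db = PeerSet()
db.scale_out("old-host-1")
db.scale_out("new-host-1")
db.scale_out("new-host-2")
db.scale_in("old-host-1")   # the database stayed available at every step
print(db.serve("SELECT 1"))
```

Because no peer is special, the shutdown order doesn’t matter; the database is “migrated” the moment the old processes are gone.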

The database model also makes working with a live database easy. We showed this on Google Compute Engine where we scaled into the millions of transactions per second, on-demand. That app was written on a single laptop, and then run at scale with no changes to the application logic. Our management model makes automation easy too. That’s what our work on HP’s Project Moonshot was all about, and it brings me back to where I started.

I did a webinar with Amazon a few weeks ago because we’re working on the next generation of NuoDB, and pushing the definition of cloud-scale. This means providing the same simple, logical database view as you scale across regions. It means powerful automation that lets you express a database as a set of requirements and service-level agreements, and then using that to scale out on-demand. We’re testing at scale now, and we’re pretty excited about the results. If you want to get in on this, get in touch and let us know what problems you need solved.

Published at DZone with permission of Seth Proctor, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)