New storage products rarely generate as much enthusiasm as we've seen with VSAN. That’s good.
But I’ve been dismayed to see industry commentary where VSAN gets arbitrarily lumped in with either (a) a gaggle of software-based storage products, or (b) some of the newer software-clothed-in-hardware products.
From my perspective, that’s not ideal. Something important is getting lost in translation.
I see strong, relevant architectural differences between VSAN and everything else that’s out there today.
Maybe those differences are important to people, maybe not — but the distinctions need to be understood and appreciated to be intelligently debated.
So let’s dig in …
If this whole VSAN thing is new to you, I’ve written a few posts that’ll bring you up to speed. Want to go deeper? There’s a ton of deep technical content out there from bloggers around the world. And there will be a VSAN "special online event" on March 6th. The beta has been long and successful, and GA is promised for Q1, which means before the end of March.
The source of the enthusiasm is clear: VSAN is a new kind of storage solution, targeted at a new storage buyer. It establishes a very different model for storage. While this is all well and good, it does tend to cause some cognitive dissonance with people who are deeply immersed in storage technology.
It All Starts With Being Hypervisor Converged
Although VSAN is a separately licensed product, there is no separate distribution; it's delivered as part of vSphere itself. Unlike traditional storage, it won't run with anything other than vSphere anytime soon. The level of integration takes a while to fully appreciate, but it's amazingly deep and well thought out.
VSAN runs with VMware only? Compared against more familiar storage, that might appear to be limiting at first glance. Consider the context ...
Let's assume that the majority of workloads today run virtualized, and the proportion will continue to grow. I think it's also safe to assume that -- for most enterprise IT shops -- this becomes primarily a VMware discussion.
If any architectural advantages result from being deeply integrated with the hypervisor (and I will argue here that there are several), software storage products that run under a guest OS in a VM simply won’t have those advantages.
That’s true whether they are delivered as software alone, or wrapped up in an appliance of some sort.
Just to be clear, I’m explicitly comparing the VSAN architecture against software-only storage stacks (HP’s VSA, Maxta, ScaleIO, Nexenta, Atlantis, etc.) as well as the newer breed of software-encased-in-hardware appliances (Nutanix, SimpliVity, etc.). Let's not forget several stealth startups scurrying around for VC money.
It's a long list of players, and the arguments I present here apply to all of them.
All of them have the same thing in common: their storage stacks run in a guest VM, and aren't an integral part of the hypervisor. From that seemingly minor implementation detail, serious implications result.
Given the vigorously competitive nature of this market, I fully expect heated rebuttals to my points. I encourage that; I'd be disappointed by anything less!
#1 — Server Efficiency
The numbers we've seen so far don't lie: you get more bang for your buck in terms of the compute and memory resources used to provide storage services when the storage stack runs in the hypervisor itself.
Most software storage stacks that run in a guest OS have to be fully provisioned up front: all the memory, all the CPU, etc. Yuck.
Contrast that with VSAN: it uses less than 10% of available CPU, plus a modest amount of memory that scales with capacity. The requirements you’ll find with many of the software-ish alternatives above are considerably larger.
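To make the contrast concrete, here's a back-of-the-envelope sketch. Every number in it (host counts, reservations, scaling factors) is a made-up, illustrative assumption, not a measured or published figure; the point is the shape of the overhead, not the values.

```python
# Back-of-the-envelope model with made-up numbers: compare a fully
# provisioned per-host storage VM against an in-kernel stack whose
# footprint is a capped CPU fraction plus memory that scales with capacity.

HOSTS = 8
CORES_PER_HOST = 16
CAPACITY_TB_PER_HOST = 10

# Assumption: a typical virtual storage appliance reserves this per host,
# up front, regardless of how busy it actually is.
VSA_VCPUS_RESERVED = 4
VSA_RAM_GB_RESERVED = 16

# Assumption: the in-kernel stack stays under a CPU cap and uses memory
# proportional to the capacity it manages (factor chosen for illustration).
KERNEL_CPU_CAP = 0.10
KERNEL_RAM_GB_PER_TB = 0.5

appliance_cores = HOSTS * VSA_VCPUS_RESERVED
appliance_ram_gb = HOSTS * VSA_RAM_GB_RESERVED

kernel_cores_worst_case = HOSTS * CORES_PER_HOST * KERNEL_CPU_CAP
kernel_ram_gb = HOSTS * CAPACITY_TB_PER_HOST * KERNEL_RAM_GB_PER_TB

print(f"Guest-VM appliances: {appliance_cores} vCPUs and "
      f"{appliance_ram_gb} GB RAM reserved before a single IO is served")
print(f"In-kernel stack:     up to {kernel_cores_worst_case:.0f} cores and "
      f"~{kernel_ram_gb:.0f} GB RAM, consumed only as needed")
```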
The counter argument is that, hey, compute and memory are cheap. While that's true, perspectives differ; I've certainly met customers who are very concerned about the efficiency of their server environment.
#2 — More Performance, and More Predictable Performance
There’s no getting around it: unless your storage software is deeply integrated with the kernel, you’re always going to be traversing one or more VMs for each and every storage IO. That doesn’t exactly help when it comes to latency. Our initial testing shows that, yes, storage software stacks that run in guest OSes tend to uniformly suffer in this regard. Consider the path a single IO takes when the storage stack lives in a guest VM:
1. Application makes IO request
2. Traverse the guest OS IO stack
3. Traverse the hypervisor IO stack to reach the storage software
4. Traverse the storage software VM's stack as well as its guest OS
5. Traverse the hypervisor IO stack to do the physical IO
6. And all the way back again!
When storage software is implemented as an integral part of the hypervisor, you avoid steps #3 and #4 entirely, on the way down and again on the way back. For every IO.
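Here's a toy latency model of the two paths. The per-hop costs are invented purely for illustration (they are not benchmark results); only the structure, which hops appear in which path, comes from the list above.

```python
# Toy latency model of the two IO paths. The microsecond costs below are
# invented for illustration; the point is the hop count, not the numbers.

HOP_US = {
    "guest OS stack":        10,   # step 2
    "hypervisor stack":      15,   # steps 3 and 5
    "storage VM + guest OS": 25,   # step 4
    "physical IO":          200,
}

# Storage stack in a guest VM: steps 2 through 5, then all the way back.
guest_vm_path = ["guest OS stack", "hypervisor stack",
                 "storage VM + guest OS", "hypervisor stack"]

# Hypervisor-converged stack: steps 3 and 4 disappear; the kernel issues the IO.
in_kernel_path = ["guest OS stack", "hypervisor stack"]

def round_trip_us(path):
    # Each software hop is traversed on the way down and on the way back,
    # plus a single physical IO at the bottom.
    return 2 * sum(HOP_US[hop] for hop in path) + HOP_US["physical IO"]

print(f"Storage stack in a guest VM: ~{round_trip_us(guest_vm_path)} us per IO")
print(f"Hypervisor-converged stack : ~{round_trip_us(in_kernel_path)} us per IO")
```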
Digging a bit deeper, you quickly realize that the protocol between application client and storage services doesn't have to be iSCSI, NFS, etc. As both endpoints are native to vSphere, you can build a very efficient, optimized storage protocol that avoids the overheads associated with traditional IO presentation stacks.
Which is exactly what VSAN does.
Then there's resource contention: a self-contained storage stack running under a guest OS has to compete for resources alongside other application workloads.
While there are good QoS services in the hypervisor, they're designed for application workloads, not for storage software workloads. And what features do exist would have to be explicitly managed.
Hypervisor-converged storage stacks have a distinct advantage here. First, all IO services are provided by the kernel, and not a user VM. Big potential win for latency, and perhaps bandwidth.
Second, the hypervisor knows about VSAN, and vice-versa. Critical storage operations are prioritized ahead of user tasks if required. That’s particularly important when there’s a transient workload, such as a rebuild.
If you're using a software storage stack running neatly in a VM, you don't want that VM to be starved when you need it the most.
When I talk to infrastructure pros, they routinely tell me that predictability (“no surprises!”) is what they really want. Architecturally, VSAN has an undeniable advantage here -- simply because it is hypervisor converged.
#3 — Availability and Recoverability Semantics
This will end up being a hotly debated topic, once you get beyond simplistic failure scenarios. My view stems, once again, from being deeply integrated with the hypervisor.
A storage stack running in a guest VM sees only the abstracted view of the hardware that the hypervisor chooses to present. That means its recovery logic has to be fairly binary: basically, “I’ve fallen and I can’t get up”, so please fail me over to another node, disk, etc.
A storage stack that’s deeply integrated into the kernel has a much wider palette of potentially more sophisticated responses.
For example, it can directly probe the status of the physical hardware through native device drivers: disks, CPU, memory, network, etc. It understands the entire cluster, and all of its resources. It can intelligently balance recovery activities with application processing.
Anything running in a guest OS -- by design -- has all this abstracted away on its behalf.
If you're following me, you can certainly see the architectural potential for more nuanced and sophisticated recovery logic. And -- in the gritty real world -- failure recovery semantics do matter.
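To illustrate the difference in palette, here's a deliberately simplified sketch. It is not VSAN's actual recovery algorithm; the inputs and the responses are hypothetical, meant only to show how much more a kernel-resident stack has to reason with.

```python
# Deliberately simplified sketch, not actual product logic: contrast the
# recovery decision a guest-VM stack can make with what a kernel-resident
# stack (native drivers, cluster-wide visibility) could consider.

def guest_vm_recovery(io_succeeded: bool) -> str:
    # Hardware details are abstracted away, so the decision is nearly binary:
    # either the IO worked, or "I've fallen and I can't get up."
    return "carry on" if io_succeeded else "fail over to another node/disk"

def kernel_resident_recovery(device_state: str,
                             alternate_paths: int,
                             cluster_load: float) -> str:
    # Hypothetical inputs a kernel-resident stack could probe directly.
    if device_state == "healthy":
        return "carry on"
    if device_state == "degraded" and alternate_paths > 0:
        return "retry via an alternate path and flag the device"
    if cluster_load > 0.8:
        return "rebuild at reduced priority to protect application IO"
    return "evacuate the data and rebuild immediately"

print(guest_vm_recovery(False))
print(kernel_resident_recovery("failed", alternate_paths=0, cluster_load=0.3))
```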
#4 — Management That’s Built-In, Not Bolted On
All standalone storage products face the same challenge: their management interfaces have to be exposed through one or more plug-ins.
If you watch a VM admin using one or more storage plug-ins, you’ll see a lot of flipping back and forth between the plug-ins and the rest of the environment. There’s a lot of correlation: find an object on one side, find it on the other side, make sure it’s the same thing, make the connection, etc.
You don’t see that with VSAN. Everything is logically and consistently presented in the context of the workflow. That directly results from being an integral part of the hypervisor.
This might not sound like a big deal, unless you’re a time-compressed VM admin. And, not surprisingly, this is one of the things they like best about VSAN. More importantly, all the cool VMware bits just work as expected :)
Architecturally, you can’t reasonably achieve this result unless you’re deeply integrated into not only the kernel, but the various tools and management views as well.
#5 — Aligning Around Applications, Not LUNs
VM admins think in terms of applications. Storage admins think in terms of storage containers, like LUNs and filesystems. The two don’t easily align. This mismatch can be frustrating, and introduces needless complexity and inefficiency.
Both parties would like an easier way to transact business ...
We’ll have to wait a while before we see this challenge addressed for external arrays (VASA 2.0 and VVOLs), but today's VSAN gives a great preview. If we’re being strict here, this attribute is not unique to being hypervisor converged, but it does illustrate the magic that can result from deep integration.
It’s a deceptively simple mechanism.
The administrator defines policy using a template, which, of course, includes storage policy: capacity, performance and availability today. That policy is pushed down to the storage layer, and a container is dynamically provisioned that precisely aligns on application boundaries and delivers the exact policy requested.
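Here's a minimal sketch of that top-down flow. The policy attribute names below are illustrative stand-ins loosely modeled on per-VM storage policies (availability, performance, capacity); they are not the product's literal settings or APIs.

```python
# Minimal sketch of policy-driven provisioning; attribute names are
# illustrative stand-ins, not the product's literal settings or APIs.

storage_policy = {
    "failures_to_tolerate": 1,      # availability: survive one host/disk failure
    "stripe_width": 2,              # performance: spread objects across devices
    "space_reservation_pct": 0,     # capacity: thin provision by default
}

def provision_vm(vm_name: str, policy: dict) -> dict:
    """Push the policy down with the VM. The storage layer carves out a
    container aligned exactly to this VM, delivering exactly this policy;
    nobody pre-carves LUNs and hopes the application fits."""
    return {"vm": vm_name, "policy": dict(policy), "compliant": True}

exchange = provision_vm("exchange-01", storage_policy)
print(exchange)
```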
Compare that with the bottom-up, storage-centric approach, and it’s not nearly as elegant.
Storage administrators usually pre-carve separate pools with distinct characteristics, and publish them. Applications then consume from the pre-existing storage containers. Matching supply and demand is imperfect at best, resulting in the familiar “have a hunch, provision a bunch” approach.
Digging a bit deeper, dynamically changing policies isn’t straightforward with the traditional approach, nor is verifying compliance.
The beauty of this approach is the ability to easily view all resources (storage, compute, network, etc.) from the point of view of the application container (the set of relevant VMs) vs. trying to reconstruct an accurate picture from the bottom of the infrastructure upwards.
Simply point at the applications you're interested in, and everything you need is right there, organized as your users see them -- by application.
Again, architecturally very difficult for a non-hypervisor-converged storage stack to achieve.
VSAN Is Different
While I am a big fan of strong and detailed comparisons between competing technologies, I’m disappointed when surface comparisons miss the deeper zeitgeist behind them.
That’s understandable: products like VSAN are relatively new ideas, and the tendency is to lump everything into big, familiar categories where everything appears similar.
However -- in this particular case — things are quite different indeed.
I think that people will quickly come to appreciate the architectural benefits that result from storage services being directly integrated into the hypervisor. Hypervisor convergence is not a buzzword; it’s a meaningful distinction in my opinion.
Are these differences relevant or not? That will always be in the eye of the beholder …