There’s plenty of enthusiasm around VSAN. Many folks are comparing and contrasting what can now be done in the hypervisor vs. familiar storage arrays. And it’s fun to watch the discussion and debates unfold.
While the purists might debate theory, most IT shops are far more pragmatic. It can’t be an either-or proposition; it has to be both.
Besides the obvious fact that IT shops are heavily invested in storage array tech, there are certain things that storage arrays do very well.
But at the same time, people are certainly intrigued with VSAN: the performance, the management model, the compelling economics, etc.
I’ve started to see a few smart types figure out clever ways to combine VSAN’s strengths with those of external storage arrays — a design pattern I’ve dubbed “VSAN plus”, as in “VSAN plus (insert name of familiar storage array)”.
And, from what I can tell so far, it’s a big win-win-win all around.
Storage Arrays Do Have Their Strengths
It’s hard to argue against the advantages of a modern external storage array. I’ve now spent twenty years in and around them — starting with the very first DG Clariion! — so it’s not new territory for me.
My personal list of strong points starts with a very rich set of mature data services: snaps, remote replication, deduplication, encryption, tiering, etc. Storage vendors have been working hard for many years to differentiate their products by offering unique data services: functionality, scale, performance, auditing, etc.
This stuff really matters — especially when we’re considering important or critical applications.
There’s more: IT shops have invested an enormous amount of resources in these arrays; not just the devices themselves, but the skills and know-how required to use them effectively. And that’s before we consider the substantial investment in operational processes and workflows around their capabilities.
There’s a long list of more debatable strong points: density, efficiency, serviceability, etc. As I’m sure these will be thrashed out elsewhere, I’m going to stick with (a) rich data services, and (b) massive customer investment for purposes of this particular discussion.
That being said, ask most IT pros, and you’ll hear the same two things over and over again: (1) external storage arrays can be difficult to manage — especially from an application-centric perspective — and (2) they can be expensive.
VSAN Brings Something New To The Party
Hypervisor-converged storage approaches storage from a very different model: extending the hypervisor to use server-based disks and flash to do many of the things once restricted to external storage arrays.
If we’re looking at strengths, first and foremost is the management model; storage is now managed as a simple extension of day-to-day workflows for the VM admin team.
For example, I consider VSAN’s application-centric policy model breathtaking in its simplicity.
And while the numbers will be endlessly debated, VSAN clearly offers a lot of bang for the buck in terms of both capacity and performance. As VSAN is fully integrated into the hypervisor — and uses internal server storage capacity — it follows a very different price/performance curve than traditional storage arrays, or other software-based storage stacks for that matter.
That cost efficiency can be very appealing — once you sit down and work out your own numbers.
The “VSAN Plus” Pattern
So, how does one combine the strengths of each in a typical scenario?
Conceptually, it’s straightforward: the most critical data stores that demand rich data services (e.g. high-performance snaps, advanced replication, data-at-rest encryption, compliance auditing, etc.) go on the external storage array.
And most everything else is a candidate to go on VSAN. The array does what it is good at, VSAN is used for what it’s good at. The specifics and terminology will vary depending on the nature of the application(s) being considered, but the pattern is the same.
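To make the decision rule concrete, here’s a minimal, purely illustrative Python sketch of that placement logic. The datastore names and service lists are hypothetical, and in practice this decision is expressed through storage policies rather than code (more on that below):

    # Illustrative only: where does a given workload's storage land under "VSAN plus"?
    # Datastore names and the service catalog below are hypothetical examples.
    ARRAY_DATASTORE = "array-gold"      # external array capacity with rich data services
    VSAN_DATASTORE = "vsanDatastore"    # hypervisor-converged (VSAN) capacity

    # Data services assumed to be available only from the external array
    RICH_SERVICES = {"array_snapshots", "remote_replication", "encryption", "dedupe"}

    def place(workload, required_services):
        """Return the datastore a workload's storage should land on."""
        if required_services & RICH_SERVICES:
            return ARRAY_DATASTORE      # critical data that needs array services
        return VSAN_DATASTORE           # everything else is a VSAN candidate

    print(place("oracle-prod-db", {"remote_replication", "array_snapshots"}))  # array-gold
    print(place("oracle-test-01", set()))                                      # vsanDatastore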
A good example is VDI. Users care most about their data store — their personal files — so that may go on a shared NAS device with a rich set of data services. The VDI images themselves, scratch spaces, swap, etc. — all go on VSAN.
In this type of configuration, far less load is driven to the shared NAS device, which continues to deliver its rich services (snaps, replication, encryption, dedupe, etc.) where they matter: the user data. On the other hand, VSAN’s flash caching and local storage deliver great performance for moderate cost; and — importantly — VSAN resources are now managed by the VDI team, not the storage team.
Another good example might be an Oracle-based application. The primary datastore — the database — will want to run on a proven external array, and frequently use many of the data services there.
But there’s more to the environment than just the primary database. There’s test and development. Decision support and OLAP queries. Scratch and temp spaces. The fast recovery area used for flashback. And more.
Here’s the point: there’s a lot of storage capacity associated with an Oracle application landscape — above and beyond the primary data store — that may not need that suite of rich data services. Instead, what’s desired is great-performing storage that’s very cost-effective and dead-simple to manage for the virtualization team.
Which is exactly what VSAN does.
A more interesting example that’s popped up is Hadoop. No, I’m not recommending that VSAN be used for the primary HDFS datastore (yet!), but — consistent with the pattern — there’s more storage involved in the environment.
In particular, the MapReduce phase beats on storage in a very cache-friendly way, making VSAN very attractive from a price/performance perspective. The profile is very different than what’s required for the primary data store.
Couple VSAN with VMware’s popular BDE (Big Data Extensions) for Hadoop, and you’ve got a great self-service cluster-as-a-service capability that now includes dynamically provisioned temporary storage.
There are more examples, but the pattern is largely the same. Most larger application landscapes use multiple datastores: SAP, Microsoft, etc. Figure out which datastores are critical: those go on the external arrays and fully replicated, etc. Everything else is a candidate for VSAN.
What’s Different Here?
The idea of tiering storage behind larger application environments is nothing new, really. One could argue that we’re now just doing it with two distinct storage architectures: familiar storage arrays, and newer hypervisor-converged storage stacks such as VSAN.
What’s really different here is the management model — it’s now partitioned in a way that makes sense for everyone involved.
Previously, the VM team usually had to go to the storage team for everything: important stuff, less important stuff, etc.
No matter how responsive the storage team can be, there's inevitably an extra set of handoffs in the process -- and that creates inefficiency.
In this new model, storage management is now partitioned and better aligned between the two teams.
If you think about it, in each of these cases, the storage team manages the capacity associated with the important stuff: user files, the database, the HDFS store, etc. And they use a storage-centric management model to get their job done.
However, all the less-critical stuff (binaries, temporary spaces, logs, re-creatable data stores, etc.) is now managed by the virtual infrastructure team, fully integrated with the workflows they use.
Underneath the covers, a vSphere capability known as SPBM (storage policy based management), combined with VASA 1.0 (vSphere APIs for Storage Awareness), makes it very easy for VM administrators to land the right workload on the right device — be it an external array, or internally on VSAN.
Today’s VASA, when used with external arrays, creates a simple, straightforward model. Storage admins create pools of protected capacity (the quintessential metals: platinum, gold, silver, etc.) which represent classes of service available for consumption.
As part of the provisioning process, the VM admin creates storage profiles that match what’s available. When an application is provisioned, it’s matched with a pre-existing category.
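As a rough sketch of that matching step (the class names and capabilities below are invented for illustration; in a real environment SPBM performs this match automatically against VASA-advertised capabilities):

    # Hypothetical catalog of array-published classes of service,
    # and the matching a storage profile implies at provisioning time.
    catalog = {
        "platinum": {"replication": "sync",  "snapshots": True},
        "gold":     {"replication": "async", "snapshots": True},
        "silver":   {"replication": None,    "snapshots": False},
    }

    def matching_classes(profile):
        """Return the classes of service whose capabilities satisfy a profile."""
        return [name for name, caps in catalog.items()
                if all(caps.get(k) == v for k, v in profile.items())]

    # A VM admin's storage profile for an important application
    profile = {"replication": "async", "snapshots": True}
    print(matching_classes(profile))   # ['gold']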
But not all storage services have to be delivered by external storage arrays. New storage profiles can be created (e.g. “internal on VSAN”) and then further specified by the amount of cache to use, how to protect against failure, striping, etc.
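Conceptually, a VSAN-backed profile is just a rule set over the capabilities VSAN advertises. Here’s a small sketch, with the profile name and values chosen as examples (the capability names follow VSAN’s published rule set):

    # A hypothetical "internal on VSAN" profile expressed as a rule set.
    # Capability names come from VSAN's SPBM namespace; the values are examples only.
    vsan_profile = {
        "name": "internal-on-vsan",
        "rules": {
            "VSAN.hostFailuresToTolerate":    1,   # tolerate one host or disk failure
            "VSAN.stripeWidth":               2,   # stripe each replica across 2 disks
            "VSAN.flashReadCacheReservation": 0,   # % of flash read cache reserved
            "VSAN.proportionalCapacity":      0,   # % of capacity thick-reserved
        },
    }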
To complete the picture, Storage vMotion can be used to easily and non-disruptively move VMs as needed between storage pools: whether they be resident on internal or external storage.
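For completeness, here’s a minimal pyVmomi sketch of that Storage vMotion step; the vCenter address, credentials, VM name, and target datastore are placeholders, and error handling is omitted:

    # Minimal Storage vMotion sketch using pyVmomi (placeholders throughout).
    import atexit
    import ssl
    from pyVim.connect import SmartConnect, Disconnect
    from pyVmomi import vim

    ctx = ssl._create_unverified_context()   # lab use only; don't skip cert checks in production
    si = SmartConnect(host="vcenter.example.com",
                      user="administrator@vsphere.local",
                      pwd="secret", sslContext=ctx)
    atexit.register(Disconnect, si)
    content = si.RetrieveContent()

    def find_by_name(vimtype, name):
        """Return the first inventory object of the given type with the given name."""
        view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
        try:
            matches = [obj for obj in view.view if obj.name == name]
        finally:
            view.DestroyView()
        return matches[0] if matches else None

    vm = find_by_name(vim.VirtualMachine, "oracle-test-01")    # example VM
    target = find_by_name(vim.Datastore, "vsanDatastore")      # example destination

    # Relocate just the storage; the running VM stays on its current host.
    task = vm.RelocateVM_Task(vim.vm.RelocateSpec(datastore=target))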
A New Style Of Storage Tiering?
Years ago when I was at EMC, we strongly pushed the idea of storage tiering. It started with advising customers to segment their storage workloads, and select the right architecture for the task at hand. We made up unofficial categories as a shorthand -- tier 1, tier 2, etc. -- to describe different mixes of performance and protection.
Yes, this introduced some complexity in customer environments, but it also saved them a bunch of money. Later, as array technology evolved, this was extended to include “tiering-in-the-box”, which became very popular.
I could now argue we have a new style of storage tiering to consider -- one that segments along architectural lines: hypervisor-converged storage and external storage.
To be clear, both are capable of providing multiple "tiers" of storage in the traditional sense. What's different is the management model, and -- depending on your circumstances -- price-performance.
The new management model now segments along natural boundaries: storage that’s associated with day-to-day needs is provisioned and managed by the same team that does so for compute, memory, server images, etc. And storage that contains the family jewels continues to be managed by the same storage professionals as before.
Everyone does what they’re good at.
The Case For Using VSAN With External Storage
Just by creating a server-resident layer of shared storage (using VSAN as an example here), there’s quite a list of benefits that result from this approach:
External storage arrays can get a performance boost, as they’re now off-loaded.
Existing storage array assets may last longer between performance and capacity upgrades.
The storage team can now focus more on the important data stores vs. less-critical activities.
The VM admin team gets a convenient, easy-to-use storage resource that works the way they do.
Money is saved, agility is improved, alignment is improved, etc. — a win-win-win, from my perspective.
Just to be clear, I didn’t come up with this by myself. I saw some very smart and pragmatic customers going in this direction, and asked them about their thinking — which I’m sharing here.
Some Things Never Change
The good IT people I know can quickly size up a new technology and decide how it can benefit their environment. That behavior seems consistent from decade to decade.
In the storage world, VSAN is new technology. And it seems that more than a few people are quickly figuring out how to leverage it in their existing environment.
------------------
Hey Chuck,
On this note: "There’s a long list of more debatable strong points: density, efficiency, serviceability, etc." I'm not sure where that logic comes from; if you could clarify, I would appreciate it.
From what my team has seen VSAN is more dense in most if not all environments that can house at least two drives in a server (which is all rack mount and most blades). VSAN is more efficient in the respects that really matter: space & cost. Yes, it consumes more CPU cycles on servers but why should we care how many CPU cycles it consumes if the combined cost is lower than the traditional storage array?
And from what we have seen, I would argue that VSAN is far more serviceable. As it has the ability to tolerate a configurable number of failures, it protects us from the occasional outage caused by the addition or removal of disk shelves, from instances where a controller goes down while maintenance is being performed on the second controller, and potentially from rack/row power failures. This also makes it very easy to move the storage platform around within the datacenter without taking service outages.
I'm curious what other opinions are out there, and whether perhaps we have overlooked something.
--Adam Sekora
@vdoubleshot
Posted by: Adam Sekora | March 19, 2014 at 11:22 PM
Hi Adam
Keep in mind, I've spent almost 20 years in the storage array world, so I very much understand and appreciate the perspectives.
Density: it's hard for server packaging to match the density of efficient array packaging -- if one considers storage by itself.
However, I've met more than a few people who are convinced that when all three disciplines are considered (storage, compute, network), better density results from server nodes with embedded storage, as you're doing.
I would agree that combined cost should be the desired metric. However, everyone looks at the numbers a bit differently, so I don't go there. The only costs that matter are the ones you see.
I think the protection and serviceability aspects will be debated for a while. There are two major models in play, and each looks at the world differently.
You don't sound like you've overlooked anything, unless I'm missing something. Understanding more about your environment, goals, philosophies, etc. would help me be more specific, but I'm flying blind here.
If you'd like to discuss more, please drop me a note at chollis@vmware.com
Thanks!
-- Chuck
Posted by: Chuck Hollis | March 19, 2014 at 11:44 PM
Good article. We believe VSAN is a game changer and, as in all things, it depends on your business requirements and the business problem you are trying to solve. We think this article is spot on, and we like the idea of "VSAN Plus" -- using the right tool for the right job. VSAN brings new innovation for solving certain modern storage business issues at a better TCO for some use cases. Also, the traditional SAN will still do what it does best for the enterprise. So it is all good, and this is a win-win for enterprise storage solutions.
Posted by: Wences Michel | March 19, 2014 at 11:50 PM
Disclosure - Chad here, and I'm an EMCer.
Chuck - I think you're spot on, and for one, I'm talking about the use of hyper-converged models like VSAN as part of the "persistence" universe and fit-for-workload with almost every customer I talk to. It adds another compelling choice. The more we can make SPBM (and ViPR for cases where that catalog of choices must extend beyond vSphere-based workloads) act as an abstractor of capability, the better.
Posted by: Chad Sakac | March 20, 2014 at 12:17 PM
Chuck,
Your tiering examples are in line with what I am thinking, but perhaps we can be a bit more specific within the context of VMDKs. Today, I would imagine that most VMware customers are provisioning a monolithic VMDK for a single guest; i.e. if the requestor of a VM says they need 500 GB for the app, a 500 GB VMDK gets provisioned. What is VMware's stance on this model changing, so that a VM is composed of multiple VMDKs? That is, a given VM would be made up of a "master VMDK" that holds the root filesystem and app binaries, plus secondary VMDKs from an external tier where the bulk of the storage is actually required. This master VMDK, along with things like transient data (/tmp, pagefile, etc.), would reside on VSAN and be of relatively the same size within the VSAN layer. This sort of mimics what we do in the physical bare-metal world, where root volumes are on local disk and we allocate SAN LUNs for the data. In effect, should we not manage VMDKs like we manage LUNs?
Posted by: Jae Kim | March 20, 2014 at 12:38 PM
Hi Chad -- agreed!
Posted by: Chuck Hollis | March 20, 2014 at 02:28 PM
Hi Jae
You bring up good points. We probably owe people guidance and documentation on how best to split things up in order to achieve what's being described here. Consider it a work in progress :)
-- Chuck
Posted by: Chuck Hollis | March 20, 2014 at 02:57 PM