As part of the vSphere 6.0 announcement festivities, there's a substantially updated Virtual SAN 6.0 to consider and evaluate.
Big news in the storage world, I think.
I have been completely immersed in VSAN for the last six months. It's been great. And now I get to publicly share what’s new and — more importantly — what it means for IT organizations and the broader industry.
If there were a prize for the most controversial storage product of 2014, VSAN would win it. In addition to garnering multiple industry awards, it has significantly changed the industry's storage discussion in so many ways.
Before VSAN, shared storage usually meant an external storage array. Now there’s an attractive alternative — using commodity components in your servers, with software built into the world’s most popular hypervisor.
While the inevitable “which is better?” debate will continue for many years, one thing is clear: VSAN is now mainstream.
This post is a summary of the bigger topics: key concepts, what's new in 6.0, and a recap of how customer and industry perspectives have changed. Over time, I'll unpack each one in more depth, as there is a *lot* to cover here.
Virtual SAN extends the vSphere hypervisor to create a cluster-wide shared storage service using flash and disk components resident in the server.
It does not run "on top of" vSphere; it's an integral component. All sorts of design advantages result.
In addition to a new hardware and software model, VSAN introduced a new management model: policy-based storage management. Administrators specify the per-VM storage policy they'd like (availability, performance, thin provisioning, etc.), and VSAN figures out the rest. Change the policy, and VSAN adapts if the resources are there.
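To make that concrete, here's a tiny Python sketch of the back-of-envelope math behind a policy. It's purely illustrative -- the function, numbers and quorum rule-of-thumb are mine, not VSAN's actual placement logic:

```python
# Illustrative only: estimate what a per-VM policy implies in raw capacity and hosts.
def estimate_object_footprint(vmdk_size_gb, failures_to_tolerate=1,
                              space_reservation_pct=0):
    """Return (reserved_raw_gb, min_hosts) implied by the policy."""
    replicas = failures_to_tolerate + 1             # FTT=1 -> two full data copies
    witnesses = failures_to_tolerate                # quorum tie-breakers (rule of thumb)
    reserved_per_copy = vmdk_size_gb * space_reservation_pct / 100.0
    reserved_raw_gb = replicas * reserved_per_copy  # thin by default: 0% reserves nothing
    min_hosts = replicas + witnesses                # each component lives on a separate host
    return reserved_raw_gb, min_hosts

print(estimate_object_footprint(100, failures_to_tolerate=1, space_reservation_pct=100))
# -> (200.0, 3): two 100 GB copies plus a witness, spread across at least three hosts
```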
The marketing literature describes it as “radically simple”, and I’d have to agree.
During 2014, VSAN 5.5 exceeded each and every internal goal we set out: performance, availability, reliability, customer adoption, roadmap, etc. A big congratulations to a stellar team!
Of course, now we need to set much more aggressive goals :)
What’s New
Quite a bit, really. The engineering team is accelerating their roadmap, and I couldn’t be more pleased. All of this should be available when vSphere 6.0 goes GA.
All-flash
VSAN 6.0 now supports all-flash configurations using a two-tiered approach. The cache tier must be a write-endurant (i.e. not cheap) flash device; capacity can be more cost-effective (and less write-endurant) flash.
With all-flash configurations, cache is not there to accelerate performance; it’s there to minimize write activity to the capacity layer, extending its life.
Note: IOPS quoted here are 4K, 70% read / 30% write mixes. As always, your mileage may vary.
Performance is extreme and, as you'd expect, utterly predictable. Sorry, no dedupe or compression quite yet. However -- and this is a big however -- the write-caching scheme permits the use of very cost-effective capacity flash without burning it out prematurely. Everyone's numbers are different, but it's a close call as to which approach is more cost-effective: (a) more expensive capacity flash with dedupe/compression, or (b) less expensive capacity flash without dedupe/compression.
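To see why it's a close call, here's a toy comparison. The prices and the 2.5:1 reduction ratio below are placeholder assumptions for illustration, not benchmarks or quotes:

```python
def usable_cost_per_tb(raw_price_per_tb, data_reduction_ratio=1.0):
    """Effective $/TB of usable capacity after any data reduction."""
    return raw_price_per_tb / data_reduction_ratio

# (a) pricier, write-endurant capacity flash plus a hypothetical 2.5:1 reduction ratio
with_dedupe = usable_cost_per_tb(raw_price_per_tb=2000, data_reduction_ratio=2.5)
# (b) cheaper capacity flash, no data reduction
without_dedupe = usable_cost_per_tb(raw_price_per_tb=900)

print(f"(a) expensive flash + dedupe/compression: ${with_dedupe:.0f} per usable TB")
print(f"(b) cheap flash, no data reduction:       ${without_dedupe:.0f} per usable TB")
# With these placeholder numbers the two land roughly 10-15% apart -- a close call.
```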
Please note that all-flash support is a separate license for VSAN.
New file system, new snaps
Using the native snapshots in vSphere 5.5 was, well, challenging. VSAN 6.0 introduces a new on-disk filesystem format, derived from the Virsto acquisition, that's faster and more efficient. Snaps are now much quicker to create, more space-efficient, and perform better when in use -- and you can take many more of them: up to 32 snaps per VM are supported, if you have the resources.
VSAN 5.5 users can upgrade to 6.0 bits without migrating the file format, but won’t get access to some of the new features until they do. The disk format upgrade is rolling and non-disruptive, one disk group at a time. Rollbacks and partial migrations are supported. Inconvenient, but unavoidable.
Bigger and better — faster, too!
Support is included for VMDKs up to 62TB, same as the vSphere 6.0 max. VSAN clusters can have as many as 64 nodes, same as vSphere 6.0.
The maximum number of VMs on a VSAN node is now 200 for both hybrid and all-flash configs (twice that of VSAN 5.5), with a new maximum of 6400 VMs per VSAN cluster.
More nodes mean more potential capacity: up to 64 nodes, with 5 disk groups per node and 7 capacity devices per disk group, or 2,240 capacity devices per cluster. Using 4TB drives, that's a humble ~9 petabytes raw.
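For the curious, the arithmetic behind that ceiling is simply:

```python
nodes = 64
disk_groups_per_node = 5
capacity_devices_per_disk_group = 7
drive_size_tb = 4

devices = nodes * disk_groups_per_node * capacity_devices_per_disk_group
raw_tb = devices * drive_size_tb

print(devices, "capacity devices per cluster")   # 2240
print(raw_tb / 1000, "PB raw, give or take")     # 8.96 -> roughly 9 petabytes
```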
The marketing statement is that VSAN 6.0 in a hybrid configuration (using flash as cache and magnetic disk for capacity) offers twice the performance of VSAN 5.5. I want to come back later and unpack that assertion in more detail, but VSAN 6.0 is noticeably faster and more efficient for most workloads. Keep in mind that VSAN 5.5 was surprisingly fast as well.
And, of course, the all-flash configurations are just stupidly fast.
Go nuts, folks.
Fault domains now supported
VSAN 5.5 was not rack-aware; VSAN 6.0 is. When configuring, you define a minimum of 3 fault domains to represent which server is in which rack. After that, VSAN is smart enough to distribute redundancy components across racks (rather than within racks) if you tell it to.
Note: this is not stretched clusters — yet.
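Purely as an illustration (real VSAN placement logic is considerably smarter than this), the idea is to spread an object's components so that no single rack holds more than one of them:

```python
def place_components(components, fault_domains):
    """Toy placement: one component per fault domain, so losing a rack costs at most one."""
    if len(fault_domains) < len(components):
        raise ValueError("need at least as many fault domains as components")
    return dict(zip(components, fault_domains))

# An FTT=1 object: two data replicas plus a witness, spread across three racks.
layout = place_components(["replica-A", "replica-B", "witness"],
                          ["rack-1", "rack-2", "rack-3"])
print(layout)
# Losing any single rack still leaves a full replica and a witness available elsewhere.
```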
New VSAN Health Services
Not surprisingly, VSAN is dependent on having the correct components (hardware, driver, firmware) as well as a properly functioning network. The majority of our support requests to date have been related to these two external issues.
VSAN 6.0 now includes a brand-new Health Services tool that can diagnose most (but not all) of the external environmental issues we've encountered to date. Log collection is also simplified in the event you need VMware support.
A must for any serious VSAN user.
Operational improvements
Lots and lots of little things now make day-to-day life with VSAN easier.
The UI now does a better job of showing you the complete capacity picture at one go. There’s a nifty screen that helps you map logical devices to physical servers. You can blink drive LEDs to find the one you’re interested in. Individual capacity drives or disk groups can be evacuated for maintenance or reconfiguration. There’s a new proactive rebalancing feature.
A default storage policy is now standard, which of course can be edited to your preferences. There’s a new screen that shows resynchronization progress of rebuilding objects. There’s a new “what if” feature that helps you decide the impact of a new storage policy ahead of time.
For larger environments, vRealize Automation and vRealize Operations integration has been substantially improved — VSAN is now pre-configured to raise selected status alarms (VOBs) to these tools if present.
And much more.
Behind the scenes
There’s been a whole raft of behind-the-scenes improvements that aren’t directly visible, but are still important.
VSAN is now even more resource-efficient (memory, CPU, disk overhead) than before. That efficiency is a big factor for anyone looking at software-delivered storage as part of a virtualized environment: the less the storage layer consumes, the higher the consolidation ratios you can achieve.
The rebuild prioritization could be a bit aggressive in VSAN 5.5; it now plays much more nicely with other performance-intensive applications. Per-node component counts have been bumped up from 3000 to 9000, and there’s a new quorum voting algorithm that uses far fewer witnesses than before. As a result, there’s much less need to keep an eye on component usage.
VSAN 6.0 requires a minimum of vCenter 6.0 and ESXi 6.0 on all hosts. As mentioned before, you can defer the file system format conversion until later, but there's no mixing and matching other than that.
New ReadyNodes
If you’d like to shortcut the build-your-own experience, that’s what VSAN ReadyNodes are for. Many new ones will be announced shortly, some with support for hardware checksums and/or encryption.
ReadyNodes can either be ordered directly from the vendor using a single SKU, or used as a handy reference in creating your own configurations.
Or skip all the fun, and go directly to EVO:RAIL.
See something missing?
Inevitably, people will find a favorite feature or two that's not in this release. I have my own list. But don't be discouraged ...
I can't disclose futures, but what I can point to is the pace of the roadmap. VSAN 5.5 has been out for less than a year, and now we have a significant functionality upgrade in 6.0. Just goes to show how quickly VMware can get new VSAN features into customers' hands.
The Customer Experience
By the end of 2014, there were well over 1000 paying VSAN 5.5 customers. Wow. Better yet, they were a broad cross-section of the IT universe: different sizes, different industries, different use cases and different geographies.
For the most part, we succeeded in exceeding their expectations: radically simple, cost-effective, blazing performance, reliable and resilient, etc.
One area we took some heat on with VSAN 5.5 was being a bit too conservative on the proposed initial use cases: test and dev, VDI, etc.
Customers wondered why we were holding back, since they were having such great experiences in their environment.
OK, call us cautious :)
With VSAN 6.0, there are almost no caveats: bring on your favorite business-critical workloads with no reservations.
Another area where we’re working to improve? In the VSAN model, the customer is responsible for sizing their environment appropriately, and sourcing components/drivers/firmware that are listed on the VMware Compatibility Guide (VCG) for VSAN.
Yes, we had a few people who didn’t follow the guidelines and had a bad experience as a result. But a small number of folks did their best, and still had some unsupported component inadvertently slip into their configuration. Not good. Ideally, we’d be automatically checking for that sort of thing, but that’s not there — yet.
So the admonishments to religiously follow the VSAN VCG continue.
If I'm being critical, we probably didn't do a great job explaining some of the characteristics of three-node configurations, which are surprisingly popular. In a nutshell, a three-node config can protect against one node failure, but not two. If one node fails, there are insufficient resources to re-protect the data (which requires two copies plus a witness, each on its own node) until the problem is resolved. This also means you are unprotected against a further failure during maintenance mode, when only two nodes are available.
Some folks are OK with this, some not.
Four-node configs (e.g. EVO:RAIL) have none of these constraints. Highly recommended :)
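If it helps, here's the three-node arithmetic in sketch form. The helper function is mine, but the underlying rule -- an FTT=1 object needs two data copies plus a witness on three separate hosts -- is the same one described above:

```python
def can_reprotect(total_nodes, failed_nodes, failures_to_tolerate=1):
    """Can the cluster rebuild full protection after losing some hosts?"""
    hosts_needed = 2 * failures_to_tolerate + 1   # FTT=1 -> 2 copies + 1 witness = 3 hosts
    return total_nodes - failed_nodes >= hosts_needed

print(can_reprotect(total_nodes=3, failed_nodes=1))  # False: data stays available,
                                                     # but cannot be re-protected
print(can_reprotect(total_nodes=4, failed_nodes=1))  # True: the spare host absorbs the rebuild
```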
The Industry Experience
Storage — and storage technology — has been around for a long time, so there’s an established orthodoxy that some people adhere to. VSAN doesn’t necessarily follow that orthodoxy, which is why it’s disruptive — and controversial.
There was a lot of head-scratching and skepticism when VSAN was introduced, but I think by now most industry types have wrapped their heads around the concept.
Yes, it’s highly available. Yes, it offers great performance. No, the world won’t end because people are now using server components and hypervisor software to replace the familiar external storage array. And there is plenty of real-world evidence that it works as advertised.
However, a few red herrings did pop up during 2014 that are worth mentioning.
One thread was around why people couldn't use whatever hardware happened to be handy to build a VSAN cluster. The rationale is the same as why array vendors won't let you put anything unsupported inside their storage arrays: the stuff might not work as expected, or, in some cases, is already known not to work properly.
If you don't follow the VCG (VMware Compatibility Guide), we're awfully limited in the help we can provide. And there are some truly shoddy components out there that people have tried, unsuccessfully, to use.
Another thread was from a competitor around the attractiveness of data locality.
The assertion was that it made performance sense to keep data and application together on the same node, with absolutely no evidence to support the claim.
Keep in mind that, even with this scheme, writes still have to go to a separate node, and any DRS move or vMotion needs its data to follow. And that's before you consider the data management headaches that would result from trying to hand-place the right data on the right node all the time.
Hogwash, in my book.
VSAN creates a cluster resource that is uniformly accessible by all VMs. DRS and/or vMotions don’t affect application performance one bit. Thankfully, the competitor dropped that particular red herring and went on to other things.
A related thread was the potential attractiveness of client-side caching vs. VSAN's server-side caching. A 10Gb network is plenty fast, and by caching only one copy of read data cluster-wide, VSAN is far more space-efficient, so there's a much greater likelihood that a read request will be served from cache rather than disk. Our internal benchmarks continually validate this design decision.
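Here's a rough way to picture the space-efficiency argument. The numbers are invented and the model (every hot block being read by several hosts) is deliberately simplistic:

```python
def effective_unique_cache_gb(cache_per_host_gb, hosts, readers_per_hot_block):
    """Client-side caching keeps a copy per reading host; cluster-wide caching keeps one."""
    total_cache = cache_per_host_gb * hosts
    client_side = total_cache / readers_per_hot_block   # duplicate copies eat capacity
    server_side = total_cache                           # one copy, reachable over 10GbE
    return client_side, server_side

client, server = effective_unique_cache_gb(cache_per_host_gb=400, hosts=8,
                                           readers_per_hot_block=4)
print(f"client-side caching: ~{client:.0f} GB of unique hot data")    # ~800 GB
print(f"server-side caching: ~{server:.0f} GB of unique hot data")    # ~3200 GB
```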
A more recent thread came from the networking crowd, which wasn't happy that VSAN uses multicast for cluster coordination, i.e. for all nodes to stay informed of current state. It's not used to move data.
Do we have cases where customers haven’t set up multicast correctly? Yes, for sure. Once it gets set up correctly, does it usually run without a problem? Yes, for sure.
There was also the predictable warning that VSAN 5.5 was a "1.0" product, which was essentially a true statement. That being said, I've been responsible for bringing dozens of storage products to market over the decades, and, from my perspective, there was very little that was "1.0" about it.
And I’ve got the evidence to back it up.
Perhaps the most perplexing thread was the specter of the dreaded "lock-in" resulting from VSAN usage. To be fair, most external storage arrays support all sorts of hypervisors and physical hosts, whereas Virtual SAN software is for vSphere, pure and simple.
Enterprise IT buyers are quite familiar with products that work with some things, but not others. This is not a new discussion, folks. And the vSphere-only restriction seems to be A-OK with many, many people.
The Big Picture?
Using software and commodity server components to deliver shared storage services is nothing new in the industry. We’ve been talking about this in various forms for many, many years.
But if I look back, this was never an idea that caught on — it was something you saw people experimenting with here and there, with most of the storage business going to the established storage array vendors.
With VSAN’s success, I’d offer this has started to change.
During 2014, VSAN proved itself to be an attractive, mainstream concept that more and more IT shops are embracing with great success.
And with VSAN 6.0, it just gets even more compelling.
Chuck,
1st, huge fan of VSAN, FYI. However, some questions regarding all-flash configs and life of the flash devices.
1/How do higher endurance cache devices minimize and extend the life of capacity tier devices in all flash config?
2/Anything to help w/WL, OP, GC or endurance/PE cycles?
Thanks!
Posted by: Tom K | February 03, 2015 at 11:20 AM
Hi Tom
All-flash VSAN uses more expensive, write-endurant cache in front of less expensive (and less-write-endurant) capacity flash. Its only purpose is to extend the life of the capacity tier. It's not about performance.
The algorithms try to keep written data in cache as long as possible, and -- when it must be evacuated to accept new writes -- it's done in a way that minimizes wear (e.g. page at a time). This is a different approach than most read/write caches that are much more aggressive at destaging data from cache.
Our experience in engineering tests is that using 10% write cache in this manner means that (a) writes are kept in cache a surprisingly long time, and (b) there end up being very minimal writes to the capacity tier.
This means that -- in most situations -- you can get away with much more cost-effective capacity flash that doesn't burn out quickly as a result.
Now, understandably, there are workloads that do nothing but write data (e.g. data stream capture and similar), and these would require a more expensive (and write endurant) capacity tier, or use magnetic disks.
Hope this helps?
-- Chuck
Posted by: Chuck Hollis | February 03, 2015 at 01:00 PM
Loved this piece the best
" Another thread was from a competitor around the attractiveness of data locality.
The assertion was that it made performance sense to keep data and application together on the same node, with absolutely no evidence to support the claim. "
They should just be counting their days to extinction !!!
Posted by: A VSAN patriot | May 04, 2015 at 12:56 PM