One aspect of our industry that I find especially annoying is the "pay-to-say" analyst model. The usual scenario is that one vendor wants to discredit one or more other vendors to make themselves look better. They contract with a freelance analyst, who hopefully brings more expertise and the appearance of independence to the table.
The few analysts who use this model fiercely brand themselves "independent", perhaps in the sense that they are not affiliated with one of the big name industry analyst firms.
I guess by the same standard my lawyer is "independent", but I certainly pay for results!
Shortly after VMware announced VSAN's general availability, George Crump and Colm Keegan of Storage Swiss published a short piece entitled "The Problems With Server Side Storage, Like VSAN", which appears to be sponsored by GridStore.
Despite taking substantial criticism from many IT practitioners in a number of forums, George has stubbornly defended his statements, encouraging those who object to offer up a "professional response".
I find myself doing so reluctantly: weighing the need to correct many of George's and Colm's erroneous statements vs. giving unwarranted attention where none is deserved. All of my responses are based on widely published information; a simple Google search can disprove many of the assertions.
And, to be fair, I know many independent analysts who do very good work on behalf of their vendor clients.
Unfortunately, this isn't an example of that.
What You Need To Know About VSAN
If you're a competitive storage vendor, VMware's Virtual SAN can be a scary product. It checks all the boxes in a unique way that's very hard to directly compete with.
Thanks to the widespread success of vSphere, it also has a built-in audience of virtualization admins who can uniquely appreciate what it brings to the table.
By comparison, most storage professionals look at the world through an established lens, so -- in general -- it takes them a few ticks longer to appreciate what VSAN is all about.
If you as a vendor sell external storage, you're concerned about server-based storage displacing external storage arrays. If you're one of the newer "hyperconverged" appliance vendors, you're concerned about vSphere + VSAN seriously eroding your nascent market opportunity. And if your storage product is entirely software-based, you're concerned about VSAN's deep integration with the hypervisor, and all the advantages that brings.
Having been in this storage business for twenty years now (!) I tend to measure the impact of a new product by the visceral nature of the competitive reaction it evokes.
And, when it comes to VSAN, I have not been disappointed ...
The Basis Of The Storage Swiss "Argument"
While the piece starts out by acknowledging much of the appeal of VSAN (the obligatory "head nodding" portion), it quickly gets to its core (incorrect) assertion: that the ratio between compute and storage is limited with VSAN. Thus, all manner of problems will potentially arise when attempting to scale storage.
Frustratingly, the truth is quite the opposite.
Let's begin with the easily verifiable facts, starting with capacity. A given VSAN server may have as many as five (5) disk groups, each with seven (7) disk drives, for a total of 35 disk drives per server. The size of a VSAN cluster ranges from 3-32 servers, which may be single socket servers if you choose -- keeping in mind that with VSAN, all servers are used to provide both compute and storage services.
Using 4TB drives, that yields 140TB of raw capacity per single-socket server. Needless to say, as VSAN is licensed per socket, this would be an impressively cost-effective capacity-oriented configuration. While it is true that some specialized environments might want more capacity per server socket (e.g. enormous video archives), I think this is good enough to cover >95% of real-world use cases.
Not to pile on, but in a maximal 32 node config, this maximum would rise to roughly 4.4 petabytes of raw capacity, and an even more impressive 6.7 PB when the new 6TB drives become more widely available.
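For anyone who wants to check the capacity arithmetic, here is a minimal sketch in Python. The limits (five disk groups, seven drives per group, 32 servers) come straight from the paragraphs above; the function and constant names are purely illustrative, and the results are raw capacity before any protection copies.

```python
# Limits quoted in the post; the names are illustrative, not a VMware API.
DISK_GROUPS_PER_SERVER = 5
DRIVES_PER_DISK_GROUP = 7
MAX_SERVERS = 32


def raw_capacity_tb(drive_tb, servers=1):
    """Raw (unprotected) capacity in TB for a number of fully populated servers."""
    drives_per_server = DISK_GROUPS_PER_SERVER * DRIVES_PER_DISK_GROUP
    return drives_per_server * drive_tb * servers


per_server_4tb = raw_capacity_tb(4)            # one server, 4TB drives
cluster_4tb = raw_capacity_tb(4, MAX_SERVERS)  # 32 servers, 4TB drives
cluster_6tb = raw_capacity_tb(6, MAX_SERVERS)  # 32 servers, 6TB drives

print(f"Per server, 4TB drives: {per_server_4tb} TB")
print(f"32 servers, 4TB drives: {cluster_4tb / 1000:.2f} PB")
print(f"32 servers, 6TB drives: {cluster_6tb / 1000:.2f} PB")
```

The petabyte figures use decimal units (1 PB = 1000 TB), which is how the numbers in the text were rounded.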
I think most reasonable people would agree that VSAN's storage capacity scales rather independently of server capacity.
Just for fun, let's go the other way, and scale compute and IOPS with a minimum of capacity.
Using a four-socket server design, a maximal VSAN cluster would have 128 sockets and support as many as 160 high-end PCI-e flash cards, each with a small disk drive behind it. A pricey rig, but -- hey! -- you could build one if you wanted to. And I would guess it to likely deliver far greater performance than the 2M IOPS announced in March -- that test used a much more modest configuration :)
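The same kind of back-of-the-envelope check works for the compute-heavy extreme. This sketch assumes one flash device per disk group, which is where the 160 figure comes from; the function name and the returned keys are hypothetical, chosen only for illustration.

```python
def max_compute_config(servers=32, sockets_per_server=4, disk_groups_per_server=5):
    """Socket and flash-device counts for a compute-heavy cluster.

    Assumes one flash device per disk group, each fronting a single small drive.
    """
    return {
        "sockets": servers * sockets_per_server,
        "flash_devices": servers * disk_groups_per_server,
    }


config = max_compute_config()
print(config)
```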
Here's the point: with VSAN, compute and IOPS can scale quite independently of capacity as well. Better yet -- VSAN clusters do not need to be homogeneous, although limiting rampant diversity will certainly make your life easier: any mix of HCL-listed components is technically acceptable.
After all, it's software.
From Here, It Gets Worse
The article claims that adding capacity with a server-side storage solution (such as VSAN) may force disruptive upgrades when additional capacity is needed.
Once again, not at all true.
VM administrators are quite comfortable with putting individual servers in maintenance mode, and vMotioning as needed -- and have been doing so for many years. With VSAN, a simple policy setting will delay a reprotection rebuild if all you want to do is add a few disks and your server doesn't support a hot-add.
A few paragraphs later, the article states that -- as additional storage capacity is needed -- new servers will have to be "racked and stacked" simply to gain more capacity. Not exactly true. If your existing server enclosures have room for more storage media, that's where it goes. VSAN supports hot-add if the server does. If you're out of places to put things, a new server comes in, or you replace small capacity drives with larger ones.
Let's not forget the obvious comparison with external storage arrays -- it sort of works the same way. When you need more capacity, you bring in more hardware. Except with VSAN, you're working with one flavor of tin; not two or more.
From there, a vague assertion is made that VSAN requires three copies of data for protection, therefore you'll inevitably be growing servers at a 9x rate using server-side storage. No, I don't know how they got to 3 copies required, nor is it clear how one gets from there to a rather preposterous 9x server growth claim.
The published facts are quite different.
Unlike external storage arrays, VSAN implements per-VMDK data protection -- it is quite unique in that regard. Storage objects that aren't all that important can have a single, unprotected instance if you choose. Important data should have at least two copies. And, if you're extremely conservative, three copies are supported as well.
All on a per-VMDK basis, entirely policy-driven and easily adjusted simply by changing a few settings. To make things even easier, just take the defaults and go.
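To make the per-VMDK point concrete, here is a rough sketch of how raw capacity consumption follows from the number of copies chosen for each object. The VMDK names and sizes are invented for illustration, and the model deliberately ignores witness components and metadata overhead, so treat it as an approximation rather than a sizing tool.

```python
def raw_consumed_gb(vmdk_gb, copies):
    """Raw capacity consumed by one VMDK kept as 1, 2 or 3 full copies."""
    if copies not in (1, 2, 3):
        raise ValueError("this sketch only models 1, 2 or 3 copies")
    return vmdk_gb * copies


# Hypothetical workload: (size in GB, copies chosen by policy)
vmdks = {
    "scratch": (100, 1),   # unimportant data, single unprotected instance
    "app": (200, 2),       # typical data, two copies
    "database": (500, 3),  # conservative choice, three copies
}

total_raw_gb = sum(raw_consumed_gb(size, copies) for size, copies in vmdks.values())
total_logical_gb = sum(size for size, _ in vmdks.values())

print(f"Logical: {total_logical_gb} GB, raw consumed: {total_raw_gb} GB")
print(f"Blended multiplier: {total_raw_gb / total_logical_gb:.2f}x")
```

Even if every object were set to three copies, the multiplier in this simple model tops out at 3x.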
Try that on an external array.
This is followed by another strange assertion: that because VSAN does not need a separate, external storage network, "network complexity is increased". Huh. VSAN uses the exact same network (and the same network configuration and management tools) as other cluster-wide services. Once again, the truth appears to be precisely the opposite of what is claimed.
And It's Not Either-Or
One of the things I find myself reminding folks about is that it's not an either-or proposition when it comes to VSAN -- feel free to use external storage alongside server-side storage. You'll find that vSphere's policy-based mechanisms make it relatively straightforward to place new workloads where you want them to go, and Storage vMotion them back and forth as needed.
Now, A Word From Our Sponsor!
From there, the article pivots into attempting to make a case for GridStore's product: SLA-driven provisioning, pay-as-you-grow, and more. While it's easy to deconstruct many of the points made here, that's not my goal. [Update: I have since discovered that GridStore's product isn't even on the vSphere HCL, so their motivations in funding an attack are not exactly clear]
I have no qualms with any vendor emphasizing their strong points -- and paying an analyst to do so -- but in this particular case, a detailed comparison would show that VSAN does a far better job in achieving these stated goals.
Where Does That Leave Us?
At some level, everyone has to make a living, including "independent" analysts. I'm certainly OK with that. And, as before, there are some great ones I know who do a great service to the community.
What I'm not OK with is polluting the commons with flatly incorrect assertions that fly in the face of widely-published documentation and real-world experience.
Someone either didn't do their homework, or didn't care about being accurate.
Many IT pros will simply dismiss the article as another piece of random flotsam on the internet. That's encouraging. The theoretical worst case would be an unfortunate IT pro who takes this article at face value, and later finds out just how greatly they were misled.
Caveat emptor, baby.
Chuck,
While I wouldn't say that VSAN REQUIRES 3 copies of data, I would say that in a distributed mirror architecture using standard servers, like VSAN or most object stores for that matter, I wouldn't sleep well at night with less than 3 copies.
When that VMware admin takes a node offline to add capacity, as you suggest they can, if they took the defaults there would be just 1 copy of each VM on 1 disk drive. Should that drive fail at the wrong moment the poor admin would be faced with data loss.
To get resiliency comparable to a dual controller array with RAID, 3 total copies would be required.
I would also note that per-VM data services are not unique to VSAN or even ServerSANs. Tintri for one has per-VM data services at lower cost in performance or capacity than VSAN. I hear some others do as well including Isilon.
I, as you well know, also work for storage vendors as an analyst. Tintri has loaned me a storage system but has not (yet, hopefully) engaged my services.
- Howard
Posted by: DeepStorageNet | April 08, 2014 at 01:30 PM
Hi Howard -- yes and no.
Just to restate, the default is to have two copies on separate server nodes, but this can be increased or decreased on a per-VMDK basis. But let's assume 2 copies for now.
You are right in that if I take a server down *and* my second copy failed *and* I didn't reprotect the original storage objects before powering down the server, *and* if I couldn't bring back the first server I took down, *then* I might experience data loss.
But that's a choice, not an inherent limitation. I would also have the option of relocating the storage objects prior to powerdown, preserving two copies despite the missing server. Or, as you point out, establishing three copies on a per-VMDK basis as needed.
I'd have to balance the resources required for the automatic move against the need: the tantalizingly small probability that -- for some reason -- my server didn't come back up *and* my second copy failed.
And, then again, there are hot-add servers.
The good news here is that VSAN isn't forcing you to do things one way or another. Certainly, there will be people who feel as you do that 3-way protection is the way to go for everything. Others will be more selective in their approach, balancing cost and risk.
Thanks!
-- Chuck
Posted by: Chuck Hollis | April 08, 2014 at 01:58 PM
Howard, 3 nodes (2 data + 1 witness) is the minimum requirement, however I suspect you're following the line of VMware's bloggers with the logic posted here - http://www.yellow-bricks.com/2013/10/24/4-minimum-number-hosts-vsan-ask/ - which makes sense. After all, server maintenance now means reduced storage resilience, which needs to be catered for. Whether done externally or in the server, there's still a price to pay for data protection.
Posted by: Chris Evans | April 08, 2014 at 04:23 PM
Chuck,
I think we're in agreement: for applications where a 1-2% chance of data loss is OK (and there are many), 2 copies and the current VMware admin processes would be fine. Therefore I appreciate the choice.
For most of my apps I'd say 3 copies, for some 2 copies and modified process to migrate data off before power down.
As a Steely-eyed Storage Guy, paranoia is my job.
Posted by: DeepStorageNet | April 08, 2014 at 05:25 PM
Thanks for the professional response. Frankly, I have a few issues with your criticism of the article so I have to ask for your patience as I construct my equally professional response to your blog. I must say, however, that I find it a bit perplexing that you took umbrage at the lack of a disclaimer in our article, despite the fact that one was clearly listed at the bottom of the piece.
Furthermore, to say that I have been stubborn in my defense in the light of "substantial criticism in a number of forums" is, at best, artistic license. Interestingly, the ONE forum you link to has had as much support for what we had to say as it did for defending the virtues of the VSAN offering. In fact, the public and private comments to our site have been roughly equally divided between proponents and detractors of VSAN. Even the recent comments on your own forum show that the verdict is still out on the VSAN solution. Again, a more detailed response will be forthcoming on our part shortly.
That said, I do appreciate the work that must have gone into your response.
Posted by: George Crump | April 08, 2014 at 08:48 PM
The writing is on the wall for pay to play. On the component side, not only would pay for play not fly, end users would revolt against the content. It doesn't matter how catchy the name is or what it represents. You simply cannot be independent while pointing the finger with the check to do so in the other hand.
Independent reviews and editorials are coming to this market. Unlike the pay to play guys who write mainly for industry insiders, the 'review sites' write for the decision makers. The content is the key. Too many times I've seen single or two page previews and laughed while wondering who would write a check based on the same information you get on the manufacturer's product page.
Here's our initial take on vSAN in news form: http://www.tweaktown.com/articles/6140/a-major-shift-in-the-data-storage-market-is-on-the-horizon/index.html
Posted by: Chris Ramseyer | April 09, 2014 at 06:22 PM
Chuck -
It is unfortunate that this is how the industry works today. However, I am surprised to hear this statement coming from EMC/VMware considering there is not an analyst group you have not paid and this is the industry largely created by EMC. It's both amusing and hypocritical.
On the article itself - the content is 99% about the architecture of server-side storage - and there are various flavors of this. VSAN is the most recent of many and obviously topical at the moment. It’s also fair to say that not all arguments made apply to VSAN, and it is clearly not worded that way.
For a more technical response, please see my blog at Gridstore.
Posted by: Kelly Murphy | April 10, 2014 at 05:02 PM
Hi Kelly
I understand you're very passionate about your product, something I can certainly empathize with. I've certainly been there before :)
Just to make sure we're being accurate here, I work for VMware, but I write my blog independently. My content is neither reviewed nor approved by VMware.
To the best of my knowledge, neither EMC nor VMware has ever paid an analyst to intentionally fabricate facts. While you might want to hide behind the fig leaf of "well, we weren't really talking about VSAN", I would direct you to the title of the published article.
Look Kelly, I'm sure you're a good guy with a lot to say -- but you went down a dark path on this last escapade.
I'm more than willing to discuss the pros and cons of different architectures -- that's fun for me, and useful for everyone -- but paying hired guns to do your dirty work isn't exactly kosher.
Server-side storage is a big deal in the industry, and deserves a lengthy exposition. Coming from the array side of the business (EMC), I see both sides, and am quite comfortable discussing the pros and cons of each approach.
Let's look forward to a productive discussion, shall we?
-- Chuck
Posted by: Chuck Hollis | April 10, 2014 at 11:14 PM
Chuck
I appreciate that you posted my response on your blog and your willingness to discuss the technical merits of each approach - you are right - that would be a good thing for everyone.
There are similarities between Gridstore and VSAN - both are built specifically for virtualization and we both operate in the kernel of the host - many of the advantages we both offer come from working in the kernel - close to the workload and being able to provide storage that is optimized per VM. There are also significant differences.
I would love nothing more than to do a head to head benchmark of the two architectures, same workloads, same platforms - and let the results speak for themselves. Unfortunately we do not operate on ESXi because VMware is not open to allowing us to work in the kernel like we have done successfully on Microsoft's Hyper-V for the past year.
I would however like to take you up on your offer to openly discuss the two architectures. That would be useful and productive for everyone.
Let's discuss where and how we can do this.
Kelly
Posted by: Kelly Murphy | April 14, 2014 at 12:04 PM