Ours is an industry that freely defines new concepts from time to time. And, occasionally, an existing concept morphs and takes on an entirely new meaning.
I think we might be seeing that happening right about now.
So, What's This All About?
If you've been in the storage industry for a while, you know what storage virtualization is all about. It's the idea of logically pooling multiple storage arrays so they appear as one.
Different vendors have implemented the concept in very different ways -- EMC, IBM and HDS, for example, have each taken a distinct approach. And the underlying value proposition varies considerably from vendor to vendor as well.
But this post is not about what storage virtualization *was*.
This post is about what storage virtualization *will be* going forward.
Once Again, Virtualization Changes Everything
Virtualization in the sense of VMware, that is. The starting point for this discussion is VMware's announcement of their vStorage initiative, which can be loosely thought of as a variety of ways for storage vendors to play better in VMware environments.
And, of course, EMC's corresponding press release that basically says "we're all over this one, folks".
At a high level, I see three concepts:
- vStorage APIs for things like storage management, backup, and integration with Site Recovery Manager (see earlier post on SRM). All goodness here -- VMware becomes an integrated point-of-control for many (but not all) storage functions.
- vStorage Virtual Appliances -- simply, storage functionality delivered as a virtual machine, rather than a hunk of hardware. EMC's Avamar, for example, does this today.
- vStorage functionality, e.g. thin provisioning and linked clones
So, here's the question I'd like to pose: will the new meaning of "storage virtualization" be "how well do your storage products integrate into a VMware virtualized environment?"
Because I think that's where it's going to end up before too long.
A Primer On Storage Functionality
Not to oversimplify, but storage functionality can live in one of three places: on the array, in the network, or on the server. Take any arbitrary storage function (replication, archiving, provisioning, encryption, etc.) and one could play devil's advocate, arguing credible use cases for it living in any of the three.
One could characterize the first wave of storage functionality as "array based". One could reasonably argue that -- at the time -- there was such rampant diversity in operating systems, databases, etc. that it was perhaps one of the only places you could standardize certain functionality.
And, going back a few years, there was a second wave of functionality that was "network based". Storage networks had become standard enough -- and interoperable enough -- that, yes, they could serve as a locus for storage functionality. In terms of the EMC portfolio, think RecoverPoint, Invista and a few others.
Are we seeing the beginning of the third wave of storage functionality as "server based"? Presumably in a VMware server environment?
I would argue "yes" to that last statement.
The Secret Is Management Paradigms
When most storage functionality was delivered as array-based features, you needed to create a whole class of administrators and tools that -- well -- managed storage and what it did. Most of it could be described as "out-of-band" -- other than driving clones and snaps, very little could be driven from the server. Over time, this model was extended a bit to include functionality that ran as part of the storage network.
But, when considering a VMware environment, you've got the potential for an entirely different management paradigm. You think in terms of application containers -- where they're coming from, what's inside them, what's the envelope they might need in terms of service delivery and availability.
And it makes a certain logical sense to extend this paradigm to control what storage can do as well.
Want to make a clone of a virtual machine? Makes more sense to invoke that as part of the VMware environment rather than to fire up some special vendor-supplied utility.
Want to change service levels of a running application? Sure, vCenter can do that for you, but wouldn't it be great if it told the storage environment that the desired service level was changing, and to react appropriately?
Indeed, we've seen this effect already with the rapid success of SRM -- Site Recovery Manager. Since it speaks in terms of (virtualized) applications and their relationships, it offers a far more powerful management paradigm than trying to do the same at the storage or storage network layer.
Indeed, no matter how simple storage vendors make their tools, it'll never be as simple as if it's just a natural extension of the operating environment, presumably a VMware environment. Yes, there will be exceptions and corner cases, but it's hard to argue against the trend.
Towards A New Meaning Of Storage Virtualization?
If you are willing to accept the premise that perhaps the new meaning of this term will be "storage in a fully virtualized environment", what might we see in the future?
First, we'll see storage vendors offering certain functionality in both physical and containerized form. Both will have their pros and cons, but -- ultimately -- customers will have more choices in this regard. EMC's Avamar comes to mind, as an example.
Second, I'm expecting to see a flurry of activity around API integration. SRM was just the opening salvo, I believe. As an example, EMC is now demoing PowerPath running natively on ESX.
Going farther, I can see all sorts of "assists" being integrated for things like snaps and clones. Or resetting service levels up or down. Or perhaps moving virtualized workloads between arrays without disruption.
And third, I'm expecting that interest in remote replication technologies will see a marked resurgence -- simply because there will be strong interest in vMotion-ing between different data centers, or participants in the vCloud.
All exciting stuff, to be sure.
From Applications To Servers To Storage
What we might be seeing here is an industry consolidation on a standardized computing environment: VMware. And, because of this, there's new interest in having applications and servers orchestrate storage functionality, and not the other way around.
And when most of this storage functionality can either be implemented, or orchestrated, from the OS level, what happens next?
Personally, I was a bit tired of the traditional storage virtualization discussion. In many cases, I felt it had become an answer looking for a problem. Dig back to my posts from a year or more ago, and you'll get a flavor for why I had become somewhat cynical on all of this.
Much more interesting to me?
How storage -- and storage functionality -- will play and integrate in this new world being drawn by VMware.
I think we've got a new candidate meaning for "storage virtualization" going forward.
What do you think?
Courteous comments always welcome!
Chuck,
dare I suggest that the reason that you got tired of the traditional Storage Virtualisation debate is that EMC's product set has not really been there? The message from EMC with regards to storage virtualisation has been muddled and confused, and that is being generous to be honest. I hear a lot of hot air about various other people's products being the wrong approach, doing it the easy way, etc., and how EMC wanted to make sure they do it right -- everything about how bad everyone else's product is, whilst not coming up with an alternative. It kind of reminds me of the old EMC attitude to RAID-5: if you go RAID-5, the world will end, your data will magically corrupt or not be available. Of course, now EMC have a solid RAID-5 implementation on the Sym, the message has changed.
The recent presentations I've had on Invista still leave me feeling that EMC miss the point, and that EMC are focussing so much on what they believe is the right approach that they will never ship anything useful to me.
So instead of shipping me a useful tool, let's move the debate onto something else where EMC feel comfortable? Sorry if this seems a bit harsh, but Storage Virtualisation is the one area where I feel that EMC have lost their way big-time, and trying to move the debate is a tacit admission of it.
But back to your article; it is entirely predicated on vmware completely dominating the data-centre. Will this happen? Maybe. Will it happen quickly? Probably not. And hey, none of what we are talking about is especially new; how much of this could be done today using that archaic monstrosity called a mainframe? These are not new paradigms; we may paint them a new colour and call them a new name, but it really is a case of back to the future.
Posted by: Martin G | September 16, 2008 at 05:01 AM
Hi Martin G
You're welcome to your opinions on EMC's take on traditional storage virtualization, but I would disagree.
The RAID 5 on Symmetrix example from many years ago that you offered was driven by a particular individual who is now working for IBM :-) You do have a long memory, don't you?
As far as VMware becoming dominant enough to drive standardization of functionality in certain environments, I would offer that it looks like that's already happened in many customer environments. The question remains whether or not this trend will continue.
And, yes, your comments regarding similarities to mainframes are right on.
Thanks for writing!
Posted by: Chuck Hollis | September 16, 2008 at 08:32 AM
Chuck,
I have to agree with MartinG on this one; a failed EMC implementation of what is currently known as storage virtualization doesn't mean that we all failed. We have many, many very happy customers, more than 130 references (pretty close to the same number of customers you have with Invista, according to your last figures) and an ever-increasing demand / uptake.
You say "I disagree" -- how? With what evidence? Or, as our NetApp friends have suggested, are you waiting for your ghost writers to give you the facts? ;)
DataCore have supported their software in VMs for some years -- is this really anything new, and has that taken over the world?
How does this work with a global namespace? Can you cluster multiple VMs across multiple different hosts? What's the locking / control point for clustered access to storage? All just questions, and I'm sure there are answers if this is really intended to provide a true SAN replacement. Off to go and read up more.
Barry
Posted by: Barry Whyte | September 16, 2008 at 02:34 PM
Sorry, the point of my post was pretty simple: are we evolving to a new definition of storage virtualization?
Feel free to debate that issue in the comments.
To your points, global namespaces (e.g. Rainfinity and similar) work pretty much the same as they do in the physical world, at both the guest and ESX levels.
Cluster coordination is provided by ESX. I am no expert, but I think they use an in-memory model, without having to resort to on-disk locks, hence agnostic to the multiple array question. EMC's VirtualGeek would probably be better at answering this than I would.
Yes, certain vendors have been putting storage functionality in VMs for a while (LeftHand comes to mind as well). We understand that their businesses are doing well.
The point is driven more by wide VMware adoption as a datacenter standard -- hence more recent. The widespread standardization creates the opportunity for more focused investment by all storage vendors.
The point is subtle, but important. The last time in the industry we had this degree of standardization was MVS. And, as a result, the host ended up providing a fair degree of storage functionality.
As an example, the concept of "storage virtualization" as we understand it in the FC world doesn't really have a parallel in the z/OS world, does it?
Will we see this again if VMware continues to become the datacenter standard?
BTW, taking random potshots is not encouraged :-)
Posted by: Chuck Hollis | September 16, 2008 at 02:37 PM
Interesting stuff, Chuck.
"He who delivers the functionality, gets the check."
Sun called storage a "feature of the server". Oracle made the claim that intelligent storage was no longer needed when they rolled out ASM. Microsoft champions DAS for Exchange. Now VMware is in the storage functionality business. They all smell revenue, and it makes sense that they should go after their share.
Is this a paradigm shift? No.
Is it disruptive technology from the point of view of a knuckle-dragger like myself who sells.....GASP...storage arrays? Absolutely. The co-opetition we face continues. So we embrace and extend, and compete where necessary.
VMware still has to prove that they are not Netscape Navigator. Then they can get in line with the rest of the folks predicting the death of intelligent storage.
Posted by: CONTRARiiON | September 17, 2008 at 01:36 PM
Hi Chuck,
Nice post. I think "virtualized infrastructure based" rather than "server based" is a bit more accurate, as others have pointed out that many companies have provided "server based" storage services for many years. I'll also add two points. First, providing functionality at all levels makes sense (the right balance depending on the state of the technology and the business need), and the better those levels are integrated into a holistic infrastructure, the greater the value they deliver. Second, functionality based on real-time measurements, feedback loops, and policies is key to achieving effective automation. I'd recommend a short but good read on just how effective nature's feedback loops can be: "The Body Has a Mind of Its Own: How Body Maps in Your Brain Help You Do (Almost) Everything Better" by Sandra Blakeslee and Matthew Blakeslee.
Mike
Posted by: Mike Dutch | September 17, 2008 at 03:43 PM
Great thoughts as always, Mike!
You should be blogging yourself :-)
-- Chuck
Posted by: Chuck Hollis | September 17, 2008 at 05:06 PM
Chuck - answering your question, cluster coordination in VMware Infrastructure is handled at a couple of levels.
There is some use of SCSI locks during metadata updates (like creation of a VM, extension of the VMFS), and some use of file-locks as a basic split-brain stopper. Cluster-wide coordination is done in a neat, but weird way.
Each ESX server is nominated as a primary or a secondary node, and they all literally talk to each other via the VM HA agent (AAM service). They have a normal response behavior (try to start VMs if you lose contact with the ESX server hosting them -- and if there is a file lock, chill), and an isolation response (when they can't reach their isolation IP addresses -- the default is the vmkernel gateway, but it can be multiple IPs for hardening -- this response can be changed; in 3.5u1 and earlier it's "shut down your VMs", in u2 and later it's "keep the VMs running").
Virtual Center actually only plays a role during initial VM HA/DRS rules-based placement; after that, VM HA does fine without VC (though VC is required for the feature -- it's only initial setup/licensing, not ongoing coordination).
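The normal-response vs. isolation-response behavior described above can be sketched as a toy decision function. To be clear, everything below -- the function name, parameters, and version strings -- is a hypothetical illustration of the logic as described, not VMware code or any real API:

```python
# Toy sketch (NOT VMware code) of the VM HA agent decision logic described
# above. All names and values here are hypothetical, for illustration only.

def ha_response(peer_reachable: bool, isolation_ips_reachable: bool,
                vm_file_locked: bool, esx_version: str) -> str:
    """Decide what a hypothetical HA agent does about a VM on a peer host."""
    if peer_reachable:
        return "do nothing"                      # normal operation
    if not isolation_ips_reachable:
        # This host itself is isolated: the response depends on ESX version,
        # per the comment above (3.5u1 and earlier vs. u2 and later).
        if esx_version in ("3.5", "3.5u1"):
            return "shut down local VMs"
        return "keep local VMs running"
    if vm_file_locked:
        return "chill"                           # owner still holds the file lock
    return "restart VM here"                     # host lost, lock free: fail over

print(ha_response(False, True, False, "3.5u2"))  # -> restart VM here
```

The point of the sketch: the file lock acts as the split-brain stopper, and the isolation response only kicks in when a host can't even reach its own isolation addresses.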
A couple technical corrections to earlier comments:
1) it's a given with VMware that you can cluster VMs across multiple hosts. That is, after all, the definition of an ESX cluster with any shared storage (using VMFS or NFS datastores). That's nothing new, and isn't changed by any storage virtualization -- which leads me to my quick conclusion that, once again, it's storage vendors ignorant of VMware fundamentals creating affinity where none exists.
2) Geographically dispersed clustering, which IS something that's provided by some storage virtualization offerings (including Invista), but actually doesn't REQUIRE storage virtualization -- it only requires a read/write remote replica of the VMFS/NFS datastore, and not definitionally even a synchronous replica (though personally I think an async replica is not a good idea in this use case). BUT -- while technically possible, it's usually a very bad idea, and invariably the case of a vendor pushing something because they think it differentiates them. The reasons it is a bad idea have nothing to do with storage virtualization, and everything to do with VMware itself. I constantly am fighting this in the field, and invariably it's a case of VMware-ignorant sales teams pushing a feature on a customer. This is something I did a long post on here: http://virtualgeek.typepad.com/virtual_geek/2008/06/the-case-for-an.html
Goodness, I hate the term Storage Virtualization, not because it isn't valuable as a feature (and EMC haters out there, it's not because I feel insecure about Invista - Invista is just fine, and we're selling it fine, thanks very much), but because it can be applied to many things that are TOTALLY not related.
- Every array since the mid-1990s meets the general definitions of virtualization:
a) abstraction of resources from the physical configuration to their applied use case (i.e. a LUN is made of MetaLUN components on a RAID group which is sliced and diced; or a file is on a FlexVol which is on an Aggregate which is on a RAID construct; or a LUN is a set of pages in a thin-provisioned pool across a broad set of RAID objects). EVERY vendor does this differently, and direct comparison requires a PhD, and a few years playing with each platform.
b) reconfiguration of the physical resources without disruption to the use case (vendors who don't have a way to add IOPs non-disruptively, or add capacity non-disruptively, or change RAID types non-disruptively are sadly behind the times)
- The new heterogeneous storage virtualization technologies (IBM SVC, HDS, DataCore, FalconStor, NetApp vFilers, EMC Invista, etc.) add the ability to do that ACROSS heterogeneous physical arrays... BUT they don't provide the hardware independence that VMware does, since you are always sticking another vendor in the middle. They do add -- in my experience -- one important use case:
a) transparent data migration across arrays. For large, heterogeneous customers, this can be a very, very useful thing. For the 90% of customers that have one array at each datacenter, man -- it makes things worse with little upside. For most of these customers with 20TB or less of storage, Storage vMotion is likely the best, simplest answer :-)
But the benefit of rapid, transparent data mobility does in general come at the sacrifice of the characteristics of the arrays front-ended by the storage virtualization device. Every virtualization appliance has upsides (i.e. put a vFiler in front of your DMX and it gets production dedupe!) and downsides (i.e. put a vFiler in front of your DMX, and it inherits EVERY characteristic of the vFiler!). It's that strength/weakness thing I continually struggle with -- everyone drinks their own kool-aid so hard, or is so insecure, that they can't acknowledge any downsides.
IMHO, every customer needs to look at those gains/losses and determine if the benefit of transparent data migration is worth it.
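Definition (a) above -- a LUN as a set of pages in a thin-provisioned pool spread across RAID objects -- can be sketched as a toy mapping. Everything in the sketch (class name, page size, round-robin placement) is made up for illustration; no real array works exactly this way:

```python
# Toy sketch of definition (a): a "virtual LUN" whose pages are allocated
# on demand from a pool spread across several RAID groups.
# Purely illustrative -- not modeled on any vendor's actual implementation.

class ThinLUN:
    PAGE = 1024 * 1024  # 1 MiB pages -- a made-up allocation granularity

    def __init__(self, pool_raid_groups):
        self.pool = pool_raid_groups       # e.g. ["RG0", "RG1", "RG2"]
        self.map = {}                      # virtual page -> (raid_group, slot)
        self.next_slot = 0

    def write(self, offset):
        """Map the page containing `offset`, allocating lazily (thin)."""
        page = offset // self.PAGE
        if page not in self.map:           # allocate only on first write
            rg = self.pool[self.next_slot % len(self.pool)]  # round-robin
            self.map[page] = (rg, self.next_slot)
            self.next_slot += 1
        return self.map[page]

lun = ThinLUN(["RG0", "RG1", "RG2"])
lun.write(0)             # first page lands on RG0
lun.write(5 * lun.PAGE)  # sparse write: only this one page is allocated
print(len(lun.map))      # -> 2 pages consumed, not 6
```

The two properties in the definitions fall out directly: the host-visible address space is decoupled from physical placement (a), and because the mapping is just a table, pages could be moved to different RAID groups behind the host's back (b).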
BUT - I really, really hate when vendors (us included) conflate Storage Virtualization with VMware/Hyper-V/Xen/etc. The connection - well - just ain't there.
It's a case of not putting the customer first, IMHO.
Posted by: Chad Sakac | September 20, 2008 at 05:17 PM
Hey Chad, I can't help but totally agree with your comments about the mis-use of the term 'Virtualisation' with regards to storage; I've been telling people that we've been doing virtualisation in storage for years.
I've been trying to convince people that what they mean by Storage Virtualisation is actually "Storage Federation with bells and whistles", i.e. turning a bunch of different arrays into one big federated pool of storage which indeed inherits a lot of its characteristics from the controlling appliance.
Posted by: Martin G | September 24, 2008 at 03:49 PM