Like anything relatively new, it will take a while for people to fully understand the rationale and the strategy behind the product. It took me a good while before I got a full grasp on the implications of this new technology.
But I can be slow to understand new concepts ... maybe you'll do better!
Building The Really Big Resource Pool
Let's oversimplify -- to get better utilization and responsiveness from IT infrastructure, we all want to create big pools of resources.
Call it a cloud, call it whatever you want -- the goal is to create a dynamic, liquid pool of resources that all can use. VMware is doing this for virtual servers. And EMC is doing this for virtual storage.
We want to use these resource pools to do all sorts of cool things -- increase utilization, have extra headroom if needed, load balance across all resources, make migrations easier, even come up with entirely new ways to provide business continuity.
The bigger the pool, the better. The more hardware-agnostic, the better. And if we can pool resources in multiple locations, even better -- especially when we consider using external service providers.
VPLEX is a new platform to build these new storage pools -- big, hardware-agnostic storage pools that can stretch over distances.
These newer storage pools will help drive up utilization, be more responsive to new resource demands, load balance across available resources, make migrations far easier, and support entirely new models for business continuity.
We use the term "storage federation" to describe this pooling of storage resources. Simply put, this is big stuff.
Thinking Out Of The Box
OK, lots of new capabilities here. One might wonder -- well, why didn't EMC just make this a feature of the storage array?
Well, technologically speaking, there's no good reason why we couldn't do this at some point -- but when we talked to customers, they were pretty adamant that they wanted to build their resource pools from the storage they already had in place.
Sometimes it'd be EMC storage of different flavors, sometimes not. Hence the need for an external device that sits between storage and server.
In one sense, there's a pretty strong argument that VPLEX is an entirely new storage platform -- it delivers a new class of storage services (storage federation) that weren't widely available before.
VPLEX does this by creating a new abstraction layer over existing storage devices -- you still see LUNs, but their physical location is now dynamic -- much in the same way that a VMware server farm will put the right application on the right server at the right time.
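If a sketch helps make that abstraction concrete, here's a minimal illustration (Python as pseudocode -- the names are hypothetical, not actual VPLEX internals): one stable LUN identity on top, physical legs underneath, with reads answered by whichever copy is closest.

    # Purely illustrative -- hypothetical names, not VPLEX internals.
    class Leg:
        def __init__(self, array, site, latency_ms):
            self.array = array            # backend array holding this copy
            self.site = site              # where the copy physically lives
            self.latency_ms = latency_ms  # cost of reaching it from afar

    class VirtualVolume:
        def __init__(self, lun_id, legs):
            self.lun_id = lun_id          # what hosts see -- never changes
            self.legs = legs              # where the data is -- can change

        def read(self, block, from_site):
            # Service reads from the closest leg; the host neither knows
            # nor cares which array actually answered.
            leg = min(self.legs,
                      key=lambda l: 0 if l.site == from_site else l.latency_ms)
            return f"block {block} of {self.lun_id} served by {leg.array} at {leg.site}"

        def write(self, block, data):
            # Writes land on every leg, keeping the copies in step.
            return [f"block {block} written to {l.array} at {l.site}" for l in self.legs]

    vol = VirtualVolume("LUN_42", [Leg("array-A", "NY", 0.1),
                                   Leg("array-B", "NJ", 1.2)])
    print(vol.read(7, from_site="NJ"))    # answered locally in NJ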
Use Cases?
There's a broad range of use cases for VPLEX, spanning the range of "better ways to do things you're doing today" all the way to "entirely new things to consider".
If you're interested in classic storage virtualization capabilities (SVC, USP-V, et al.), you'll find a fresh approach with a broader set of architectural capabilities than the alternatives. And, yes, I'm sure we'll be going head-to-head on things like scalability, availability, manageability, etc. -- and come out favorably.
But if you think this is just about a better classic storage virtualization platform, I'd argue you're missing the point.
First, using VPLEX, you've got a much better stretched cluster approach between two data centers. Rather than a brute-force approach of stretching each and every SAN connection over a distance, we now have what looks to be a "single LUN" with some very smart dynamic synchronization behind it.
And we're all enamored with the idea of, say, a VMware cluster that load balances and fails over intelligently between two data centers. Not to mention file systems, databases and the applications that run on them. Or perhaps pooling together multiple VBlocks into a single, geographically-dispersed cluster.
Distances will grow over time -- 100km today, asynchronous distances before too long. And, farther out, the ability to support N-way configurations rather than pairs of locations.
Do People Want This?
Well, we've talked to enough customers in enough gory detail to convince ourselves that -- yes -- there is a significant market that wants this sort of functionality -- especially when distance comes into play.
Part of the challenge is human nature -- we tend to evaluate new things in the context of things we already know. For example, is the iPad simply a better netbook? Or something entirely new?
Interest generally falls into three buckets: (1) customers with multiple data centers who'd like a new model for exploiting resources at distance, (2) customers who'd like to make increased use of external service providers, and -- of course -- (3) customers who are looking for a better form of "traditional" storage virtualization.
A New Form Of Replication To Consider -- Access Anywhere
If you're in the storage business, you know we have a complicated taxonomy of replication types. Synchronous and asynchronous. Continuous and point-in-time. Add in multi-hop, consistency groups, bunkers, etc. -- there are lots of different animals in the zoo.
Well, now there's a new one -- global federation -- one that doesn't presume the usual target/source type of arrangement. Information can appear to be in two or more places at the same time -- independently of where it actually might live.
And it's going to take a while for us to collectively accept that "where the storage is accessed" isn't necessarily the same as "where the storage is physically located."
What To Expect Next
Now for some safe predictions.
First, the HDS and IBM crew will undoubtedly try to be as negative about the product (and EMC) as is humanly possible. I would expect nothing less. As long as they stick to the facts, though, it should be interesting to watch the discussion unfold.
Second, there will be a predictable technologist reaction against the perceptions of more complexity, lock-in, etc. Hard to avoid that reaction anytime you invent something new, or introduce a new layer in the stack. But smart IT decision makers always weigh the pros and cons, and do what's best for the business.
Third, we're going to be really, really busy explaining this new technology to enterprise IT organizations and service providers. That's been going on for a while ...
Many of these people immediately recognize that the ability to federate storage can now potentially change a lot of large-scale assumptions -- how many data centers you build, what roles they play, how you think about delivering IT globally.
It's weighty stuff, indeed.
Who said storage was boring?
:-)
--------------------------------
Resources and additional links (will be updated over time, so please check back)
- A nice case study from AOL, along with a video.
- A nice video from Melbourne IT, one of our service provider partners.
- An evaluation report from our good friends at ESG.
- A short slide deck, although I think it's a bit outdated in a few areas ...
- A nice set of alliance partner video clips from Brocade, Cisco, Intel, Microsoft, Oracle and of course VMware.
- A wonderful solution brief showing the environment behind VPLEX moving Oracle, SQL Server and Exchange dynamically at a distance.
- And, of course, the master launch page on EMC's web site here.
Chuck,
Great write-up on VPLEX. It looks really, really interesting!! From a hardware-agnostic storage point of view, what sort of feature overlap would you see between this and maybe a CX or DMX solution? Does VPLEX require the underlying storage systems to replicate or snapshot, or does it handle all that itself? What about thin provisioning -- can it do that natively, or does it utilize the underlying storage array's features?
@StorageTexan
Posted by: Storagetexan | May 10, 2010 at 10:01 PM
Lots of information to digest...let the feast begin.
Posted by: David A. Chapa | May 10, 2010 at 10:41 PM
@StorageTexan,
To give you a quick answer to your question ...
The intent of VPLEX is to do what traditional storage virtualization platforms from other vendors don't, which is to allow you to leverage the technologies inherent in your existing storage platforms -- Symmetrix SRDF/TimeFinder or CLARiiON SnapView/MirrorView, for example. Since VPLEX is array-aware and preserves the array's cache functionality, while enhancing performance with its own cache, you can still use those tools. That is something you can't do with SVC and USP-V; with those products you are forced to move all of the intelligence into the virtualization layer, which may not work for all of your applications. This is a differentiator for VPLEX, not to mention VPLEX Metro federation, which is entirely unique.
Posted by: Storagesavvy | May 11, 2010 at 07:35 PM
Ahhh so this is like a RAMSAN. Cache on the front end, and then let the storage controllers do the rest. I get it.
Posted by: Storagetexan | May 11, 2010 at 10:07 PM
Storagetexan -- nope, not really.
-- Chuck
Posted by: Chuck Hollis | May 11, 2010 at 11:15 PM
Hi, many times I've been told by EMC that one shouldn't use virtualization from IBM or HDS, as they sit in front of the storage with their own cache and damage performance. How is this different from those, performance-wise?
Remember, one argument was that cache-in-front destroys the nice caching algorithms used by Symmetrix.
Posted by: soikki | May 12, 2010 at 02:41 PM
Hi Soikki -- good question.
One of the things we've attempted to do with VPLEX is to leverage and exploit array capabilities, and not replace them. Many earlier attempts had an implied model of "dumb storage, smart appliance", whereas we worked towards a "smart storage, smart appliance" model.
The "smart" thing that VPLEX does is pooling and caching at a distance, which I believe is an incremental capability to what arrays do, and don't attempt to replace what an array does today.
So, you won't find things like thin provisioning, FAST, dedupe, snaps, etc. in the VPLEX product -- arrays do that. Similarly, you won't find storage federation (at a distance, presumably heterogeneous) in an array product.
Good question, though ...
-- Chuck
Posted by: Chuck Hollis | May 12, 2010 at 03:14 PM
Hey there old boy,
Can I put this in front of an IBM XIV storage array and net all the benefits of VPLEX?
Posted by: James Hindman | May 12, 2010 at 09:36 PM
Don't know if XIV has been qualified yet, but -- technically -- yes. Why you'd want to do that isn't clear to me, though ...
-- Chuck
Posted by: Chuck Hollis | May 12, 2010 at 11:25 PM
The question about performance impact is still open; it would be very nice to get some clarification. Thanks ... :)
Posted by: soikki | May 14, 2010 at 01:09 AM
Soikki
There's no way to answer that in a general sense -- it's incredibly dependent on your I/O profile. Offering up a standard answer would be bad practice.
For example, a local read cache hit would probably be a "win" performance-wise. A remote cache hit might be a win, or not, depending on the distance and associated latency.
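To put rough numbers on the distance part (my own back-of-the-envelope, not a measured VPLEX figure): light in fiber travels at roughly 5 microseconds per kilometer each way, so distance alone sets a latency floor.

    # Rough illustration only -- pure propagation delay, ignoring
    # switch, appliance and array processing time.
    FIBER_US_PER_KM = 5.0   # ~5 microseconds per km, one way, in glass

    def round_trip_ms(km):
        return 2 * km * FIBER_US_PER_KM / 1000.0

    for km in (10, 100):
        print(f"{km:>4} km: ~{round_trip_ms(km):.1f} ms round trip, before any processing")
    # 10 km adds ~0.1 ms; 100 km adds ~1.0 ms -- so a remote cache hit
    # can beat a slow disk read, yet lose to a fast local hit.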
How many cacheable read hits will you get? We'd have to look at your app mix.
As I understand it, writes are passed through, and thus would be highly dependent on the underlying array's capabilities, which might be an EMC, or not, as the case may be.
You're right, anything in the data path adds a very small bit of turn-around time to I/Os passing through (vs. cached). I don't know what the exact number is, but -- given that we use the same hardware design as the VMAX -- I'm not too concerned about turn-around time for I/Os that flow through the VPLEX.
I don't know if you saw Brian Gallagher's keynote, but one of the many important points he made was the incredible amount of testing we do on our products using both live workloads and replayed traces from our customers' environments.
So, if this is really interesting to you, let me know, and I can have someone more knowledgeable than I contact you.
-- Chuck
Posted by: Chuck Hollis | May 14, 2010 at 08:09 AM
How can VPLEX be leveraged for services such as VMotion when it is difficult or impossible to stretch the VLAN 100km to geo distances?
Posted by: DMyers_ | May 19, 2010 at 10:24 AM
Hi DMyers
The version shipping today has a 100km limit, as you point out. We were also very clear that -- before too long -- the same capabilities would be offered at unlimited asynchronous distances.
The technology already does that; we just have a lot of validation work to do before unleashing this on our customers.
We also said that, after that, we'd be offering N-way clustering at a distance as well.
So, if you've got a short term need to go longer than 100km, we'd probably go with a traditional replicate-and-failover approach (using SRM perhaps).
Architecturally, there's a strong point of view that this model will quickly be supplanted by a geo-caching approach like the one found in VPLEX.
If you're really interested, contact one of the EMC tech specialists, and they'll take you through the timelines and relative pros and cons of each approach.
-- Chuck
Posted by: Chuck Hollis | May 19, 2010 at 11:36 AM
Hi Chuck,
I was at EMC World last week with some customers. One of them is interested in VPLEX scalability. In fact, most of the presentations show stretched LUNs between two VPLEX clusters, but what about the scalability of a single VPLEX cluster? I know that it starts with one engine and goes to a maximum of four engines -- and after that? Are we able to connect two VPLEX clusters together in the same data center? (Is that what we call "local federation"?)
Thanks for your answer, and see you in France in June.
Christian
Posted by: Christian Guérin | May 20, 2010 at 11:14 AM
Hi Christian
You'd be better off going to PowerLink and getting the real documentation, but -- from memory -- I think the answer is 4 engines in a cluster, which can then federate locally with a second cluster, for (I guess) a grand total of 8 engines with this release.
But I am outside my depth here -- go get the docs, or, if you can't get them, I'll post a link here.
-- Chuck
Posted by: Chuck Hollis | May 20, 2010 at 03:34 PM
Chuck ... so, what you are saying is that after two failed attempts to provide this level of virtualization via Invista, this one is definitely going to work? (Right from scratch, from what was originally YottaYotta.)
Posted by: Adrian F. | June 04, 2010 at 01:22 AM
Hi Adrian F
If you're going to show up with an unreasonably snarky comment like that, please do us all the favor of identifying yourself and your affiliations.
Although Invista wasn't a roaring success, it wasn't a failure either. It did what we wanted it to do. The use cases for VPLEX are very different as well, if you look carefully and resist the temptation to simply throw mud.
Perhaps the biggest problem was that we bet on an intelligent switch design vs. commodity hardware, in essence putting us on the wrong side of the technology curve.
Going forward, we're not going to make that mistake again.
-- Chuck
Posted by: Chuck Hollis | June 04, 2010 at 06:09 AM
Hi there :0)
When replicating between sites, you mentioned it was being replicated asynchronously -- or that may have come from one of the postings above. Anyhow, what happens when the link between sites is down for more than 4 hours? Obviously the changes are still at the primary site -- what happens to that data? Common knowledge is that the host will slow down based on what is write-pending in array cache, the array cache will fill up, and then what? What about my changed data that needs to be replicated?
Posted by: Alvin Cly | June 28, 2010 at 04:15 PM
Hi Alvin
Sorry for the delay. I thought I knew the answer, but I wanted to check to make sure. This response comes from Barry Burke (aka The Storage Anarchist):
----------------
Today’s product (VPLEX Metro) provides synchronous replication only…so if the link fails, the default failure mode is identical to what most active/passive synchronous replication configurations support: one side (usually “the primary” site) continues to accept writes while the other stops all I/O.
The site that stays active will keep track of what changes, and when the link comes back, it will re-synchronize those changes to the remote site. The other site can be manually re-enabled to continue I/O, such as for a DR scenario when the primary site is no longer in operation.
In the future, when asynchronous support is delivered (VPLEX Geo), the actual changes will be managed in a manner similar to Symmetrix SRDF “delta set” change logs.
Like SRDF, these will be able to be buffered to (local) disk if the change sets outgrow memory. The operational models for failures will include configurable responses that mimic those of more traditional async replication products, as well as several new models specific to the active/active Access Anywhere paradigm that VPLEX Geo will enable.
We’ll discuss these more when VPLEX Geo is announced next year.
---------------------
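To picture the failure mode Barry describes, here's a minimal sketch (Python as pseudocode; the structure is hypothetical, not product code): while the link is down, the surviving site simply remembers which extents changed, and only those get shipped when the link returns.

    # Illustrative sketch of link-failure change tracking -- not product code.
    class Site:
        def __init__(self, name):
            self.name = name
            self.extents = {}        # extent number -> data

        def apply(self, extent, data):
            self.extents[extent] = data

    class SurvivingSite(Site):
        def __init__(self, name):
            super().__init__(name)
            self.dirty = set()       # extents written while the link is down
            self.link_up = True

        def write(self, extent, data, remote):
            self.apply(extent, data)
            if self.link_up:
                remote.apply(extent, data)   # normal synchronous path
            else:
                self.dirty.add(extent)       # remember what to resend later

        def resync(self, remote):
            # Link restored: ship only what changed, then the remote
            # side can be manually re-enabled for I/O.
            for extent in sorted(self.dirty):
                remote.apply(extent, self.extents[extent])
            self.dirty.clear()

    primary, secondary = SurvivingSite("NY"), Site("NJ")
    primary.write(1, "a", secondary)         # in sync while the link is up
    primary.link_up = False
    primary.write(2, "b", secondary)         # tracked, not replicated
    primary.link_up = True
    primary.resync(secondary)                # only extent 2 is resent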
Hope that helps ...
-- Chuck
Posted by: Chuck Hollis | June 30, 2010 at 08:33 AM