
December 19, 2007

Looks Like A Jackalope To Me

Don't know if you've ever taken a car trip through the Rocky Mountains (Colorado, Wyoming, et al.), but if you stop in a friendly diner, you're likely to see a picture of a "jackalope" -- a monstrous cross between a jackrabbit and an antelope.

The locals will tell you with a straight face how the nearby ranchers prefer them to horses since they run much faster and eat less.  Of course, no such thing exists, but it's good fun.

For some reason, there are those in the industry trying to make a connection between an extremely hot topic (server/desktop virtualization) and less-popular topics (storage virtualization, dedupe, thin provisioning, etc.).

Just like the infamous jackalope, the possibility is plausible to tourists, but somewhat of an inside joke to the locals.

The Connection?

No doubt, server/desktop virtualization is probably the single hottest topic in IT these days.  More popular than grid, cloud, SOA, et al. put together.  And, without a doubt, the parade is being led by VMware -- they own the discussion, no matter how much others try.

And if you're an IT vendor, you're doing anything possible to link yourself to this white-hot trend. 

You're spending time brainstorming with your peers trying to figure out how you're relevant to this trend, and how you can make it appealing to customers and investors.

Sometimes the connections are broad and relevant -- I'd argue strongly that EMC is in this category.  You'd expect that, right?

Other times the connections are more tenuous.

Let's take storage virtualization as an example. 

If you're a storage vendor, you talk a lot about storage virtualization, especially if you don't have much else to talk about.  On one level, you have to, simply because your competitors are doing the same.

Well, there's a shared buzzword between the two concepts, isn't there?  And, if you're not from around these parts, maybe you can be convinced that there's a relevant connection between the two, right?

Well, no, not really.

Servers and Storage Are Different

Servers do computing work, running multiple tasks. 

If one task isn't fully using server resources, there's ample opportunity for another task to come in and use the same resource.  Multitasking (the distant ancestor of server virtualization) has been around since the 1960s.  And the idea lies at the root of the primary economic benefit of VMware, i.e. using fewer server resources to get the same work done.

At one level, an application wrapped in a VMware container really doesn't care much where it runs -- any server will do.  That leads to additional benefits, like flexibility, easier management, load balancing, etc.
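To put rough numbers on the consolidation argument, here's a minimal Python sketch.  The utilization figures and the 80% ceiling are made up for illustration, and the packing rule (first-fit decreasing) is just a simple textbook heuristic, not a description of how VMware places workloads.

```python
def first_fit_decreasing(loads_pct, capacity_pct=80):
    """Pack workloads (percent of one server) onto hosts using a simple heuristic.

    Each workload goes onto the first host that still has room for it;
    a new host is opened only when none of the existing ones fit.
    """
    hosts = []  # each entry is the list of loads placed on that host
    for load in sorted(loads_pct, reverse=True):
        for host in hosts:
            if sum(host) + load <= capacity_pct:
                host.append(load)
                break
        else:
            hosts.append([load])
    return hosts


# Hypothetical average CPU utilization of eight lightly loaded physical servers.
loads_pct = [10, 15, 5, 30, 20, 10, 25, 5]
hosts = first_fit_decreasing(loads_pct)
print(f"{len(loads_pct)} lightly loaded servers fit on {len(hosts)} hosts")
```

With these assumed numbers, eight servers' worth of work lands on two hosts.  The trick only works because idle CPU cycles can be handed to someone else the moment they're free.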

Storage and information have a fundamentally different sharing property. 

Information has to stick around between uses.  Hence the need for storage capacity.

Not to be obvious, but if one application decided to free up needed storage capacity by deleting another application's storage, this would not be OK with the other application owner. 

As an example, my kids recently took it upon themselves to delete my wife's favorite programs off the DVR to make room for their cartoons.  Their logic?  "Well, Mom wasn't using it".

That argument works when you're talking about the TV.  But it doesn't work when you're talking about the DVR.

Now, I Understand How It Can Get Confusing

Do both servers and storage gain consolidation benefits from virtualization? 

Well, technically yes, but the effects are much more pronounced with server virtualization.  By comparison, simple good housekeeping can yield the same consolidation benefits from storage, usually without the need for "virtualization technology".

Does virtualization make it easier to move things around for both servers and storage?

Well, technically yes again, but the benefits are dramatically different.  In a large virtualized server pool, the ability to dynamically balance workloads (think VMware VMotion, DRS, etc.) is a huge win, simply because there's such a dynamic range in application workloads -- pooling and load-balancing lead to order-of-magnitude benefits, which is why it's such a hot topic.

There's far less dynamic range in storage environments, there's a much more infrequent need to move things around (as compared to application workloads), the economic benefits are less pronounced, and so on.

Simply put, the two technologies can make roughly similar claims, but I think the magnitude of the benefits is vastly different.  And few people dig down deep enough to understand the differences.

And It Can Get Even More Confusing ...

With ESX 3.5, VMware announced a cool feature -- Storage VMotion -- which, technically, is a flavor of storage virtualization that runs in the server.  Gee, if VMware is doing storage virtualization, it must be cool, right?  And all the companies doing storage virtualization must be pretty cool, right?

I know, it sounds a little silly, but I've heard this on more than one occasion.

In larger organizations, server virtualization is driven by one particular group with a very specific set of objectives, and -- of course -- everyone is doing it.  Storage virtualization (if it's done at all) is driven by another group with a very different set of objectives.

From my experience, there's almost no overlap.  Perhaps they share a buzzword, but nothing more.

But There's More Craziness Afoot ...

One example that's bothering me is that, recently, NetApp has made much of the fact that their file-oriented data deduplication capability (A-SIS) can reduce storage capacity significantly in VMware environments.

Yes, I'd agree that's technically true.  VMware images (.vmdk files) tend to have a lot of redundant data in them, particularly the binaries.  And they're technically amenable to data deduplication approaches, as compared to information that's already compressed, like zip files and JPEGs.  No argument there.
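Here's a minimal Python sketch of why that's true.  It uses generic fixed-size block hashing, which is an assumption made for illustration and not a description of how A-SIS works internally.  The "images" are synthetic: five blobs that share 90% of their blocks (standing in for common OS binaries), compared with five blobs of random bytes (standing in for already-compressed files).

```python
import hashlib
import os

BLOCK_SIZE = 4096  # assumed block size, chosen for illustration only


def dedupe_ratio(blobs, block_size=BLOCK_SIZE):
    """Return (logical blocks) / (unique blocks) across all blobs."""
    total = 0
    unique = set()
    for blob in blobs:
        for offset in range(0, len(blob), block_size):
            block = blob[offset:offset + block_size]
            unique.add(hashlib.sha256(block).digest())
            total += 1
    return total / len(unique)


# Synthetic stand-in for OS binaries present in every image (hypothetical data).
common_os = os.urandom(BLOCK_SIZE * 90)
images = [common_os + os.urandom(BLOCK_SIZE * 10) for _ in range(5)]

# Synthetic stand-in for already-compressed files: high entropy, nothing shared.
compressed = [os.urandom(BLOCK_SIZE * 100) for _ in range(5)]

print("VM-like images:      ", round(dedupe_ratio(images), 2))
print("compressed-like data:", round(dedupe_ratio(compressed), 2))
```

With these made-up proportions the image set dedupes at roughly 3.6 to 1, while the compressed-like set stays at 1 to 1.  Note what the sketch leaves out: the cost of hashing and looking up every block, which is exactly where the performance question comes in.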

But they're not telling you a few things -- and I think they should.

First, there's absolutely no discussion about the performance implications of data deduplication on production data, and no amount of hand-waving can make this one go away.  All forms of data dedupe can be thought of as a different service level than physical storage -- plain and simple.  With today's technology, there is no free lunch -- unless your vendor takes you out.

Now, take it from the point of view of the server guy who's trying to get his organization comfortable with using more VMware.  The one thing this person doesn't want is things running noticeably slower on VMware, period.  Most server pros won't want to take this chance, at least not anytime soon.

Save a bit on production storage?  Always a nice thing, yes.  Save a boatload on running my server farm?  Absolutely compelling.  And most people won't knowingly risk the second to get the first.

And, if you think about it a moment, this is a bit of a red herring.  If you had 100 server images worth of data before virtualization, you'd also have about 100 server images after virtualization.  Mostly the same kind of data to store, right? 

Did the presence of VMware really change anything significant?

Nope.

I Think There's A Better Target For Storage Savings With VMware

Are there parts of a broad VMware landscape where data dedupe makes more sense?

Yes, absolutely.  Think backup.

For a given production environment, we usually find there's between 3.6x and 20x more capacity in the "backup shadow".  Want to save real money on VMware storage, and not worry about performance?  I'd strongly suggest you look there.

I don't have hard numbers, but anecdotally, proficient VMware users tell me that they have a whole lot more virtual machine images sloshing around than they used to.  They tend to keep old ones around, just in case.  Lots of them.

Another reason to think "backup", since this characteristic is turning out to be materially different in the virtualized environment than the physical one.

When we acquired Avamar a while back, they'd optimized their architecture for the VMware environment. 

Their clients ran nicely in a virtual machine. 

They implemented a global view of ALL data, and could squeeze far more out of the environment, rather than just looking at isolated chunks as with other approaches. 

They did their dedupe on the server, not the storage, which meant that backups happened far faster and with less expensive plumbing. 

They stored their backup images in native format, meaning it was easy to simply mount an old version of a VMware image from "backup" and go. 

And recently, they've made their back-end run neatly in a virtual machine, meaning that it can run as a virtualized task in your server farm, and not require dedicated hardware, if you so choose.  Sweet!

And the overall storage savings are absolutely astounding as compared to snaps, incrementals, etc. -- all the stuff we're used to.  Most people take an existing VMware server image, make a few changes, put the new one into production, and keep the old one(s) around. 

That's a use case that's extremely amenable to client-side data deduplication, isn't it?
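A minimal Python sketch of the idea, under stated assumptions: fixed-size chunks, SHA-256 as the fingerprint, and a hypothetical in-memory `BackupStore`.  None of this is Avamar's actual design; it only illustrates why hashing on the client means a slightly modified image sends very little over the wire.

```python
import hashlib
import os

CHUNK = 4096  # assumed fixed chunk size; real products may chunk differently


class BackupStore:
    """Hypothetical backup target that keeps one copy of each unique chunk."""

    def __init__(self):
        self.chunks = {}

    def has(self, digest):
        return digest in self.chunks

    def put(self, digest, data):
        self.chunks[digest] = data


def backup(image, store):
    """Hash chunks on the client and send only the ones the store lacks.

    Returns (recipe, bytes_sent), where recipe is the ordered list of
    chunk digests needed to rebuild this version of the image.
    """
    recipe, sent = [], 0
    for offset in range(0, len(image), CHUNK):
        data = image[offset:offset + CHUNK]
        digest = hashlib.sha256(data).digest()
        if not store.has(digest):
            store.put(digest, data)
            sent += len(data)
        recipe.append(digest)
    return recipe, sent


def restore(recipe, store):
    """Rebuild an image version from its recipe."""
    return b"".join(store.chunks[d] for d in recipe)


store = BackupStore()
old_image = os.urandom(CHUNK * 200)  # synthetic stand-in for a server image

# "Make a few changes": overwrite three chunks of the old image.
patched = bytearray(old_image)
for i in (5, 50, 150):
    patched[i * CHUNK:(i + 1) * CHUNK] = os.urandom(CHUNK)
new_image = bytes(patched)

recipe_old, sent_old = backup(old_image, store)
recipe_new, sent_new = backup(new_image, store)
print("first backup sent :", sent_old, "bytes")
print("second backup sent:", sent_new, "bytes")
```

In this toy run the second backup sends only the three changed chunks, and both versions can still be rebuilt from their recipes with `restore`.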

But, let's be honest, there are a few use cases showing up in VMware where people are doing heavy-duty transaction or bandwidth stuff.  The data isn't particularly amenable to dedupe.  There's no time to do data deduplication -- you need blazing performance to move data off, and back on again in a hurry if you need it.

So EMC went a step further, and added Avamar dedupe technology to our NetWorker client.  Now, we can offer a single client for a VMware environment that gives you some interesting service level choices: traditional, snaps, replication, dedupe, backup to disk, et al.

As your needs change, the backup client (and its management) doesn't.  Dedupe work for you?  Great.  Need something faster?  Great.  Need something even faster?  Fine.  One way to provide multiple service levels, including our popular friend, dedupe.

OK, it's not a single-headline story, I'll grant you that -- it takes a while to explain what's going on, and why it's important.

But, from my albeit slanted view, it's a far more compelling storyline than just a single feature applied to VMware with somewhat dubious customer benefits.

But Really, I Do Understand

VMware is white hot (although if you look carefully, you'll see NetApp, Dell, HP, Sun, IBM et al. hedging their bets with Xen, Virtual Iron, Hyper-V, et al.).

If you're a smaller company (or even a bigger one) you want to hitch your wagon to this rocket ship.  And, in a noisy environment, it's tempting to claim a single feature or buzzword to make what you've got stand out from the crowd.

But does this sort of behavior really create value for customers?

Looks like a jackalope to me ...


Comments

He Chuck,
Fell over laughing at the double entendre!

"there is no free lunch -- unless your vendor takes you out"

... vendor takes you out to lunch => free lunch for you.

... vendor takes you out (period) => vendor eats your lunch for free !


Enjoy your columns, as always.
Cheers,
Fred


Chuck Hollis


  • Chuck Hollis
    VP -- Global Marketing CTO
    EMC Corporation

    Chuck has been with EMC for 13 years, most of them pretty good.

    He enjoys speaking to customer and industry audiences about a variety of technology topics, and -- of course -- enjoys blogging.

    He lives in Holliston, MA with his wife, three kids and three dogs when he's not travelling. Chuck enjoys piano, mountain biking, boating and skiing -- in that order.

    Warning: do not buy him a drink when there is a piano nearby.
