December 13, 2006

Thin Provisioning: Don't Write A Check You Can't Cash

Well, by now just about every major storage player offers some form of thin provisioning (including EMC). 

And to read the marketing hype from all the smaller vendors, you'd think this is really big stuff.

In this post, I'll try to give some context on why this is so interesting to some, and -- more importantly -- shine a bit of light on the dark side.

Context

In the never-ending quest to improve storage utilization, people hit on the idea of "thin provisioning".  Very simply, you allocate what looks like a large amount of storage to the server or application, but it doesn't physically get used until something gets written.

User sees 100 GB.  Disk array allocates 10 GB.  At least, initially (!)
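
To make that concrete, here's a minimal sketch of allocate-on-write in Python.  The ThinVolume class, the 1 GB extent size and the dictionary used as an extent map are all my own assumptions for illustration; real arrays do this very differently under the covers.

```python
class ThinVolume:
    """Toy thin volume: advertises a virtual size, but only backs an
    extent with physical storage the first time it is written."""

    def __init__(self, virtual_gb, extent_gb=1):
        self.virtual_gb = virtual_gb   # what the user sees
        self.extent_gb = extent_gb     # allocation unit (assumed 1 GB)
        self.extents = {}              # extent index -> data, filled lazily

    def write(self, offset_gb, data):
        if not 0 <= offset_gb < self.virtual_gb:
            raise ValueError("write past end of virtual volume")
        # Physical space is consumed only here, on first write to an extent.
        self.extents[offset_gb // self.extent_gb] = data

    def read(self, offset_gb):
        # Extents that were never written read back empty and cost nothing.
        return self.extents.get(offset_gb // self.extent_gb, b"")

    @property
    def physical_gb(self):
        return len(self.extents) * self.extent_gb


vol = ThinVolume(virtual_gb=100)
for gb in range(10):
    vol.write(gb, b"x")
print(vol.virtual_gb, vol.physical_gb)  # 100 10
```

Rewriting an extent that already exists doesn't consume anything new; only first-time writes grow the physical footprint.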

So why is this such a big deal?

Well, it turns out that storage management discipline isn't so great in many shops.  Users tend to -- ahem -- inflate their requirements when talking to the storage guys, so a lot of storage goes wasted. 

I would offer the opinion that if reasonable storage management policies were in place, there wouldn't be quite so much interest, but that's another story. 

Another argument is that when a file system fills up, it's a hassle to extend it, so just allocate a ginormous virtual volume so you don't have to do it again for a while.  Then again, most modern file systems can be extended dynamically, so that isn't much of an argument, IMHO.

The dark side

Now, one of the things you give up with this approach is defining exactly where the physical storage lives.  Most approaches have a pool; when more physical storage is needed, it goes to the pool and gets some more. 

That makes sense for moderate service level environments like file systems, where performance may not be a concern, but I definitely would be leery of using this approach with something like a banging Oracle environment, or something like Exchange that makes the disk drives light up.

In high performance environments, you want to know exactly what spindle your data will be ending up on.  Scattering it all around will make it nigh-impossible to optimize performance, should you need to.

Another thing you're signing up for is additional watchfulness from the storage administrator.  Like the guy who writes checks for more money than he has in the bank, you don't want to bounce one.  And, in this environment, we're usually talking about a hard crash when someone does a write and the storage environment says "sorry, no can do".  Not a lot of graceful recovery logic around that one.

Now, EMC has taken the step of trying to minimize some of this with a system of pools and thresholds and alerts to warn you of impending doom if you're running out of physical storage, but it assumes that someone, somewhere, is paying attention.
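
Here's a rough sketch of the pool-plus-thresholds idea, again in Python.  To be clear, this is not how EMC or anyone else implements it: the ThinPool class, the 70% and 90% thresholds and the PoolExhausted error are hypothetical names and numbers I picked to show the shape of the problem.

```python
class PoolExhausted(Exception):
    """Raised when a write needs physical space the pool doesn't have."""


class ThinPool:
    def __init__(self, physical_gb, warn_at=0.70, critical_at=0.90):
        self.physical_gb = physical_gb
        self.used_gb = 0
        self.subscribed_gb = 0          # total virtual space promised
        self.warn_at = warn_at          # assumed alert thresholds
        self.critical_at = critical_at

    def provision(self, virtual_gb):
        # Promising space costs nothing up front.
        self.subscribed_gb += virtual_gb

    def allocate(self, gb):
        # Called when a write lands on space that isn't backed yet.
        if self.used_gb + gb > self.physical_gb:
            raise PoolExhausted("no physical storage left")
        self.used_gb += gb
        return self.status()

    def status(self):
        used = self.used_gb / self.physical_gb
        if used >= self.critical_at:
            return "CRITICAL"
        if used >= self.warn_at:
            return "WARNING"
        return "OK"

    @property
    def oversubscription(self):
        return self.subscribed_gb / self.physical_gb


pool = ThinPool(physical_gb=100)
for _ in range(5):
    pool.provision(100)        # five users each see 100 GB
print(pool.oversubscription)   # 5.0
print(pool.allocate(60))       # OK
print(pool.allocate(15))       # WARNING
print(pool.allocate(20))       # CRITICAL
try:
    pool.allocate(10)
except PoolExhausted as err:
    print("write failed:", err)
```

Notice that the alerts are just return values.  If nobody is reading them, the only signal anyone gets is the failed write at the end.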

Now, picture this.  You've overprovisioned your filesystems to all your users.  They're happy.  Things start filling up.  You realize you'd better get some more storage.  Oops, no budget, or it's stuck in some other tangled-up corporate process. 

Meanwhile, your users are merrily storing data not realizing that impending doom is upon them.  You can't go back to the users and say "hey, we're running out of storage, cool it!".  They look at their filesystems, and say "I don't see a problem" and keep going.  It's all coming out of the same shared pool.

And then -- one day -- everything crashes to the ground -- all at once -- because all the physical storage is gone.

Don't write a check you can't cash.

And then there's the interesting moral question of charge-back: do you charge back for the physical utilization, or the virtual monster you've given them?  And do your tools understand the difference?
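
A quick back-of-the-envelope sketch shows how far apart the two answers can be.  The $2 per GB rate and the department numbers below are made up purely for illustration.

```python
RATE_PER_GB = 2.00  # hypothetical monthly rate, in dollars per GB


def chargeback(virtual_gb, physical_gb, rate_per_gb=RATE_PER_GB):
    """Return (bill by virtual size, bill by physical usage)."""
    return virtual_gb * rate_per_gb, physical_gb * rate_per_gb


# (virtual GB promised, physical GB actually written) -- invented figures
usage = {
    "finance": (100, 10),
    "engineering": (100, 60),
}

for dept, (virtual_gb, physical_gb) in usage.items():
    by_virtual, by_physical = chargeback(virtual_gb, physical_gb)
    print(f"{dept}: ${by_virtual:.2f} virtual vs ${by_physical:.2f} physical")
# finance: $200.00 virtual vs $20.00 physical
# engineering: $200.00 virtual vs $120.00 physical
```

Bill on virtual size and both departments pay the same; bill on physical usage and one pays six times what the other does.  Your tools need to know which number they're reporting.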

My real beef

I think thin provisioning is not-a-good-thing at a philosophical level.  It has a role, but I'd recommend using it very carefully, if at all.

First, I would argue that organizations should be a bit more adept at storage management ITIL processes, and not have to resort to "fake out" technology to solve what could be construed as an organizational problem.

Second, I think it masks the real problem, which is that no one has a good idea what all that data is actually being used for.  Is it really needed?  Could it be archived?  Could it be de-duped? 

We do a lot of file system assessments as part of our normal day-to-day business at EMC.  And the stories of what we find are nothing short of amazing. 

Want to save 50-70% on your file system storage costs?  Move the junk off.  It's that simple.  The technology to automate this is pretty straightforward. 

No fake-out technology needed, and it's moving you in the right direction.

Conclusion

So, thin provisioning will join the ever-growing arsenal of tools people have for managing their storage.  I'm sure there will be success stories, and more than a few horror stories.

Over time, people will realize that -- like any tool -- there are places where it makes sense, and places where it doesn't.

But -- please -- let's cool it with the breathless hype.  It's not doing anyone any good.

Comments

Chuck, I agree with the hype comment; however, there are and were times when thin provisioning can be useful. I first used it in 1996 on mainframe systems with STK Iceberg. For volumes such as paging disks where we didn't want more than a small file on there, it was perfect. The issue was concurrent access to a volume (now solved with PAVs) so allocating more smaller used volumes was a simple way to solve the problem.

The Iceberg implementation wasn't without problems, most notably the reporting didn't cope well with snapshots and could underreport the amount of used storage.

Chuck:

I'm a storage VAR in NJ, and I find it refreshing to see that a storage industry exec sees things the same way I do on the topic of thin provisioning. I see this very much as a solution that was invented (more likely discovered) without a valid problem to attach itself to.

Would you build a house on a foundation that was 'thin-provisioned', not knowing if the builder really put enough concrete into it, only to find out once your house crumbles to the ground?

Seems like total folly to me, and with the ability to both grow and shrink virtual volumes these days, it's largely unnecessary, and I'm usually able to talk my clients out of trying it.

Great article, IMHO.

Glenn Dekhayser
Voyant Strategies, Inc.

I totally agree with what you said. I think the small players are using this more as a selling tool than as something that really benefits the end user. How much space are you going to save when disks keep getting bigger and cheaper? It's like spending $10 to save $12 of disk space: good job, but you have to absorb all the performance and management issues over time.


Chuck Hollis


  • Chuck Hollis
    VP -- Global Marketing CTO
    EMC Corporation

    Chuck has been with EMC for 13 years, most of them pretty good.

    He enjoys speaking to customer and industry audiences about a variety of technology topics, and -- of course -- enjoys blogging.

    He lives in Holliston, MA with his wife, three kids and three dogs when he's not travelling. Chuck enjoys piano, mountain biking, boating and skiing -- in that order.

    Warning: do not buy him a drink when there is a piano nearby.
