Well, by now just about every major storage player offers some form of thin provisioning (including EMC).
And to read the marketing hype from all the smaller vendors, you'd think this is really big stuff.
In this post, I'll try to give some context on why this is so interesting to some, and -- more importantly -- on the dark side that deserves a bit of light.
Context
In the never-ending quest to improve storage utilization, people hit on the idea of "thin provisioning". Very simply, you allocate what looks like a large amount of storage to the server or application, but it doesn't physically get used until something gets written.
User sees 100 GB. Disk array allocates 10 GB. At least, initially (!)
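To make the mechanics concrete, here's a minimal sketch of allocate-on-write in Python. It's a toy model, not how any particular array (EMC's included) does it: the `ThinPool` and `ThinVolume` names, the block-level granularity, and the zero-filled reads are all assumptions for illustration.

```python
class ThinPool:
    """Shared pool of physical blocks (hypothetical toy model)."""

    def __init__(self, physical_blocks):
        self.free_blocks = physical_blocks

    def take_block(self):
        # The failure case: someone writes and there is nothing left to give.
        if self.free_blocks == 0:
            raise RuntimeError("pool exhausted: no physical blocks left")
        self.free_blocks -= 1


class ThinVolume:
    """A volume that advertises virtual_blocks but maps them only on write."""

    def __init__(self, pool, virtual_blocks):
        self.pool = pool
        self.virtual_blocks = virtual_blocks
        self.mapped = {}  # virtual block number -> data

    def write(self, block_no, data):
        if not 0 <= block_no < self.virtual_blocks:
            raise IndexError("write beyond the advertised volume size")
        if block_no not in self.mapped:
            # First write to this block: physical space is consumed now.
            self.pool.take_block()
        self.mapped[block_no] = data

    def read(self, block_no):
        if not 0 <= block_no < self.virtual_blocks:
            raise IndexError("read beyond the advertised volume size")
        # Never-written blocks read back as zeros and cost nothing (assumption).
        return self.mapped.get(block_no, b"\x00")

    def physical_used(self):
        return len(self.mapped)


pool = ThinPool(physical_blocks=10)
volume = ThinVolume(pool, virtual_blocks=100)  # user "sees" 100 blocks
volume.write(5, b"x")                          # only now is a block consumed
```

After that one write, the volume still advertises 100 blocks, but the pool has handed out exactly one. The eleventh distinct block written against this 10-block pool raises the error.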
So why is this such a big deal?
Well, it turns out that storage management discipline isn't so great in many shops. Users tend to -- ahem -- inflate their requirements when talking to the storage guys, so a lot of storage goes wasted.
I would offer the opinion that if reasonable storage management policies were in place, there wouldn't be quite so much interest, but that's another story.
Another argument is that when a file system fills up, it's a hassle to extend it, so just allocate a ginormous virtual volume so you don't have to do it again for a while. Again, most modern filesystems allow dynamic extensions, so that isn't much of an argument, IMHO.
The dark side
Now, one of the things you give up with this approach is defining exactly where the physical storage lives. Most approaches have a pool; when a volume needs more physical storage, the array goes to the pool and gets some more.
Makes sense for moderate service level environments like file systems, where performance may not be a concern, but I definitely would be leery of using this approach with something like a banging Oracle environment, or something that makes the disk drives light up like Exchange.
In high performance environments, you want to know exactly what spindle your data will be ending up on. Scattering it all around will make it nigh-impossible to optimize performance, should you need to.
Another thing you're signing up for is additional watchfulness from the storage administrator. Like the guy who writes checks for more money than he has in the bank, you don't want to bounce a check. And, in this environment, we're usually talking a hard crash when someone does a write and the storage environment says "sorry, no can do". Not a lot of graceful recovery logic around that one.
Now, EMC has taken the step of trying to minimize some of this with a system of pools and thresholds and alerts to warn you of impending doom if you're running out of physical storage, but it assumes that someone, somewhere, is paying attention.
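For a feel of what that kind of watchfulness boils down to, here's a rough sketch of a pool check. The function names and the 70% / 85% thresholds are made up for illustration; they are not EMC's actual settings or anyone's API.

```python
def oversubscription(virtual_sizes_gb, physical_gb):
    """How many times over the pool has been promised (illustrative)."""
    if physical_gb <= 0:
        raise ValueError("physical_gb must be positive")
    return sum(virtual_sizes_gb) / physical_gb


def check_pool(allocated_gb, physical_gb, warn_at=0.70, critical_at=0.85):
    """Classify pool fullness; the default thresholds are assumptions."""
    if physical_gb <= 0:
        raise ValueError("physical_gb must be positive")
    fraction_used = allocated_gb / physical_gb
    if fraction_used >= critical_at:
        return "critical"
    if fraction_used >= warn_at:
        return "warning"
    return "ok"


# Three users each promised 100 GB out of a 100 GB pool.
ratio = oversubscription([100, 100, 100], physical_gb=100)
status = check_pool(allocated_gb=75, physical_gb=100)
```

The check itself is trivial. The hard part is that "warning" has to land in front of someone who can actually get more disk before "critical" turns into a failed write.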
Now, picture this. You've overprovisioned your filesystems to all your users. They're happy. Things start filling up. You realize you'd better get some more storage. Oops, no budget, or it's stuck in some other tangled-up corporate process.
Meanwhile, your users are merrily storing data not realizing that impending doom is upon them. You can't go back to the users and say "hey, we're running out of storage, cool it!". They look at their filesystems, and say "I don't see a problem" and keep going. It's all coming out of the same shared pool.
And then -- one day -- everything crashes to the ground -- all at once -- because all the physical storage is gone.
Don't write a check you can't cash.
And then there's the interesting moral question of charge-back: do you charge back for the physical utilization, or the virtual monster you've given them? And do your tools understand the difference?
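The gap between the two answers is easy to put numbers on. A quick sketch, using the 100 GB / 10 GB example from earlier and a made-up rate of 2 per GB (the `chargeback` function and the rate are hypothetical):

```python
def chargeback(virtual_gb, physical_gb, rate_per_gb):
    """Return the bill under both policies so the difference is visible."""
    return {
        "by_virtual": virtual_gb * rate_per_gb,    # what the user was promised
        "by_physical": physical_gb * rate_per_gb,  # what the user actually consumed
    }


bill = chargeback(virtual_gb=100, physical_gb=10, rate_per_gb=2)
```

Same user, same data, and the bill is either 200 or 20 depending on which number your tools report.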
My real beef
I think thin provisioning is not-a-good-thing at a philosophical level. It has a role, but I'd recommend using it very carefully, if at all.
First, I would argue that organizations should be a bit more adept at storage management ITIL processes, and not have to resort to "fake out" technology to solve what could potentially be construed as an organizational problem.
Second, I think it masks the real problem, which is that no one has a good idea what all that data is actually being used for. Is it really needed? Could it be archived? Could it be de-duped?
We do a lot of file system assessments as part of our normal day-to-day business at EMC. And the stories of what we find are nothing short of amazing.
Want to save 50-70% on your file system storage costs? Move the junk off. It's that simple. The technology to automate this is pretty straightforward.
No fake-out technology needed, and it's moving you in the right direction.
Conclusion
So, thin provisioning will join the ever-growing arsenal of tools people have for managing their storage. I'm sure there will be success stories, and more than a few horror stories.
Over time, people will realize -- like any tool -- there are places where it makes sense, and places where it doesn't.
But -- please -- let's cool it with the breathless hype. It's not doing anyone any good.
Chuck, I agree with the hype comment; however, there are and were times when thin provisioning can be useful. I first used it in 1996 on mainframe systems with STK Iceberg. For volumes such as paging disks where we didn't want more than a small file on there, it was perfect. The issue was concurrent access to a volume (now solved with PAVs) so allocating more smaller used volumes was a simple way to solve the problem.
The Iceberg implementation wasn't without problems, most notably the reporting didn't cope well with snapshots and could underreport the amount of used storage.
Posted by: Chris M Evans | December 20, 2006 at 05:57 PM
Chuck:
I'm a storage VAR in NJ, and I find it refreshing to see that a storage industry exec sees things the same way I do on the topic of thin provisioning. I see this very much as a solution that was invented (more likely discovered) without a valid problem to attach itself to.
Would you build a house on a foundation that was 'thin-provisioned', not knowing if the builder really put enough concrete into it, only to find out once your house crumbles to the ground?
Seems like total folly to me, and with the ability to both grow and shrink virtual volumes these days, it's largely unnecessary, and I'm usually able to talk my clients out of trying it.
Great article, IMHO.
Glenn Dekhayser
Voyant Strategies, Inc.
Posted by: Glenn Dekhayser | June 15, 2007 at 10:20 PM
I totally agree with what you said. I think the small players are using this as a selling tool more than as something that really benefits the end user. How much space are you going to save when disks keep getting bigger and cheaper? It's like spending $10 to save $12 of disk space: good job, but you have to absorb all the performance and management issues over time.
Posted by: tc | September 16, 2007 at 01:22 AM