Well, by now just about every major storage player offers some form of thin provisioning (including EMC).
And to read the marketing hype from all the smaller vendors, you'd think this is really big stuff.
In this post, I'll try to give some context on why this is so interesting to some, and -- more importantly -- shine a bit of light on the dark side.
In the never-ending quest to improve storage utilization, people hit on the idea of "thin provisioning". Very simply, you allocate what looks like a large amount of storage to the server or application, but it doesn't physically get used until something gets written.
User sees 100 GB. Disk array allocates 10 GB. At least, initially (!)
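To make the mechanics concrete, here's a toy sketch in Python. It assumes a 1 GB allocation unit and a made-up `ThinVolume` class; real arrays use their own extent sizes and metadata, so treat this as an illustration of allocate-on-write and nothing more.

```python
# Hypothetical model of a thin-provisioned volume; not any vendor's implementation.
EXTENT_GB = 1  # assumed allocation unit


class ThinVolume:
    def __init__(self, virtual_gb):
        self.virtual_gb = virtual_gb   # what the server is told it has
        self.extents = set()           # extents actually backed by physical disk

    def write(self, offset_gb):
        """Back an extent with physical storage the first time it is written."""
        if not 0 <= offset_gb < self.virtual_gb:
            raise ValueError("write outside the virtual volume")
        self.extents.add(int(offset_gb // EXTENT_GB))

    @property
    def physical_gb(self):
        return len(self.extents) * EXTENT_GB


vol = ThinVolume(virtual_gb=100)
for offset in range(10):      # the application touches the first 10 GB
    vol.write(offset)
vol.write(3)                  # rewriting an existing extent costs nothing new

print(vol.virtual_gb, vol.physical_gb)  # prints: 100 10
```

The server is told about all 100 GB up front, while the physical number only moves when a new extent is written for the first time.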
So why is this such a big deal?
Well, it turns out that storage management discipline isn't so great in many shops. Users tend to -- ahem -- inflate their requirements when talking to the storage guys, so a lot of storage goes wasted.
I would offer the opinion that if reasonable storage management policies were in place, there wouldn't be quite so much interest, but that's another story.
Another argument is that when a file system fills up, it's a hassle to extend it, so just allocate a ginormous virtual volume so you don't have to do it again for a while. Again, most modern filesystems allow dynamic extension, so that isn't much of an argument, IMHO.
The dark side
Now, one of the things you give up with this approach is defining exactly where the physical storage lives. Most approaches have a pool; when more physical storage is needed, it goes to the pool and gets some more.
Makes sense for moderate service level environments like file systems, where performance may not be a concern, but I definitely would be leery of using this approach with something like a banging Oracle environment, or something that makes the disk drives light up like Exchange.
In high performance environments, you want to know exactly what spindle your data will be ending up on. Scattering it all around will make it nigh-impossible to optimize performance, should you need to.
Another thing you're signing up for is additional watchfulness from the storage administrator. Like the guy who writes checks for more money than he has in the bank, you don't want to bounce one. And, in this environment, we're usually talking about a hard crash when someone does a write and the storage environment says "sorry, no can do". Not a lot of graceful recovery logic around that one.
Now, EMC has taken the step of trying to minimize some of this with a system of pools and thresholds and alerts to warn you of impending doom if you're running out of physical storage, but it assumes that someone, somewhere, is paying attention.
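A threshold check of this general kind might look something like the sketch below. The 70% and 85% levels and the `pool_status` function are assumptions made up for illustration; they are not EMC's actual settings or API.

```python
# Hypothetical thresholds; real arrays expose their own settings and alert channels.
WARN_PCT = 70
CRITICAL_PCT = 85


def pool_status(used_gb, capacity_gb):
    """Classify a shared pool by how much of its physical capacity is consumed."""
    if capacity_gb <= 0:
        raise ValueError("capacity must be positive")
    pct = 100 * used_gb / capacity_gb
    if pct >= CRITICAL_PCT:
        return "critical"
    if pct >= WARN_PCT:
        return "warning"
    return "ok"


for used in (500, 720, 900):
    print(used, pool_status(used, capacity_gb=1000))
# prints:
# 500 ok
# 720 warning
# 900 critical
```

The code is the easy part. The alert only helps if it lands in front of someone who can actually go get more disk.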
Now, picture this. You've overprovisioned your filesystems to all your users. They're happy. Things start filling up. You realize you'd better get some more storage. Oops, no budget, or it's stuck in some other tangled-up corporate process.
Meanwhile, your users are merrily storing data not realizing that impending doom is upon them. You can't go back to the users and say "hey, we're running out of storage, cool it!". They look at their filesystems, and say "I don't see a problem" and keep going. It's all coming out of the same shared pool.
And then -- one day -- everything crashes to the ground -- all at once -- because all the physical storage is gone.
Don't write a check you can't cash.
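If you want to know how big a check you've written, the arithmetic is simple enough. Here's a back-of-the-envelope sketch; the thirty users, the 1,000 GB pool, and the 5 GB/day growth rate are invented numbers, and the linear growth assumption is almost certainly too kind.

```python
def overcommit_ratio(virtual_gb_per_volume, pool_capacity_gb):
    """Total storage promised to users divided by what physically exists."""
    return sum(virtual_gb_per_volume) / pool_capacity_gb


def days_until_empty(pool_capacity_gb, used_gb, growth_gb_per_day):
    """Runway left in the pool, assuming (optimistically) linear growth."""
    if growth_gb_per_day <= 0:
        return float("inf")
    return max(pool_capacity_gb - used_gb, 0) / growth_gb_per_day


volumes = [100] * 30        # thirty users, each promised 100 GB
ratio = overcommit_ratio(volumes, pool_capacity_gb=1000)
runway = days_until_empty(1000, used_gb=800, growth_gb_per_day=5)

print(ratio, runway)  # prints: 3.0 40.0
```

In this made-up case you've promised three times what you own and have about forty days to get a purchase order through.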
And then there's the interesting moral question of charge-back: do you charge back for the physical utilization, or the virtual monster you've given them? And do your tools understand the difference?
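To see how far apart the two answers can be, here's a small sketch. The rate of $0.50 per GB per month and the `chargeback` helper are hypothetical, picked only to make the gap visible.

```python
RATE_PER_GB = 0.50  # assumed monthly rate, purely illustrative


def chargeback(virtual_gb, physical_gb, rate=RATE_PER_GB):
    """Return the monthly bill under both accounting models."""
    return {
        "by_virtual": virtual_gb * rate,    # bill for what was promised
        "by_physical": physical_gb * rate,  # bill for what is actually consumed
    }


bill = chargeback(virtual_gb=100, physical_gb=10)
print(bill)  # prints: {'by_virtual': 50.0, 'by_physical': 5.0}
```

That's a tenfold difference for the same user, depending on which number your tools happen to report.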
My real beef
I think thin provisioning is not-a-good-thing at a philosophical level. It has a role, but I'd recommend using it very carefully, if at all.
First, I would argue that organizations should be a bit more adept at ITIL-style storage management processes, and not have to resort to "fake out" technology to solve what could be construed as an organizational problem.
Second, I think it masks the real problem, which is that no one has a good idea what all that data is actually being used for. Is it really needed? Could it be archived? Could it be de-duped?
We do a lot of file system assessments as part of our normal day-to-day business at EMC. And the stories of what we find are nothing short of amazing.
Want to save 50-70% on your file system storage costs? Move the junk off. It's that simple. The technology to automate this is pretty straightforward.
No fake-out technology needed, and it's moving you in the right direction.
So, thin provisioning will join the ever-growing arsenal of tools people have for managing their storage. I'm sure there will be success stories, and more than a few horror stories.
Over time, people will realize that -- like any tool -- there are places where it makes sense, and places where it doesn't.
But -- please -- let's cool it with the breathless hype. It's not doing anyone any good.