I've always been amazed at the different ways you can slice the storage market: by access method (DAS, SAN, NAS, CAS, etc.), by architecture (single controller, dual controller, multi-controller, RAIN, scale-out, clusters, etc.), even by consumption model (e.g. traditional vs. storage-as-a-service).
To this growing list of taxonomies, I think we're going to have to add another: pre-integrated storage vs. do-it-yourself.
And, strangely, I think that there will be certain places where this is going to be popular. But most organizations will probably never consider it seriously.
Here's why ...
So, What Prompted This?
A recent announcement by Sun around "open source storage" (???), basically offering people their open-sourced ZFS running on a Sun server, presumably a Thumper.
Now, I'm not going to comment on the true genealogy of ZFS (that's one for the lawyers), or whether Sun can be successful with this sort of business model (I'm dubious).
What I find interesting is the growing number of offerings in this relatively new category of "just add your favorite server hardware" storage: DataCore, LeftHand and probably a bunch of others I forgot.
So, what's going on here?
The Siren Song Plays Again
There seems to be a basic pitch to this newer category that I see repeated over and over:
- Get the storage functionality you want (e.g. NAS, SAN, replication, etc.) via software
- Use any server hardware of your choosing
- Save money
- Avoid "vendor lock in"
Storage functionality? Sure, you can get a decent subset of the best that the array vendors offer via a software-only model -- I'll grant you that. And, if that subset is what you'll need for the foreseeable future, check this box and move on.
Any server hardware? Yes, sort of. Most of the vendors support qualified configurations (rather than just anything) in an effort to provide a better customer experience. Sun, specifically, is steering people toward its own server hardware (duh), but I guess -- theoretically -- you have a lot of options here.
Save money? Here's where I turn a bit skeptical.
First, let's look at hardware costs. In our business, parts are parts. Disk drives cost pretty much the same for everyone, as do the processors and RAM we all use, and so on. Server vendors and storage vendors largely draw from the same parts bins, so -- based on everything I've ever seen -- there's no real cost advantage for servers in a like-for-like comparison.
One problem is getting like-for-like. As an example, if you use server-based technology to implement dual redundant controllers, RAID 5 protection and the like, you'll end up with a server that -- well -- starts costing the same as (or usually more than!) a low-end storage array.
OK, maybe you don't need all that protection and redundancy with your storage. Fine. But, before you get too excited about this approach, do a clear-eyed cost comparison around usable, protected storage, and you may be surprised as to what's the low-cost option.
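To make that concrete, here's a minimal back-of-the-envelope sketch. Every price, drive count and capacity in it is a made-up placeholder, not a quote from any vendor -- substitute your own numbers before drawing conclusions.

```python
# Back-of-the-envelope $/usable-TB check. All figures below are
# hypothetical placeholders -- plug in real quotes for your environment.

DRIVE_TB = 1.0        # capacity per drive, in TB
DRIVE_COST = 250      # cost per drive
DRIVES = 12           # drives per enclosure
RAID5_PARITY = 1      # one drive's worth of parity per RAID 5 group

usable_tb = (DRIVES - RAID5_PARITY) * DRIVE_TB

# Option 1: two commodity servers, data mirrored between them to get
# controller redundancy in software (so you buy the drives twice).
SERVER_COST = 3500
diy_total = 2 * (SERVER_COST + DRIVES * DRIVE_COST)

# Option 2: an entry-level dual-controller array, one set of drives.
ARRAY_CHASSIS = 9000
array_total = ARRAY_CHASSIS + DRIVES * DRIVE_COST

print(f"Usable, protected capacity: {usable_tb:.0f} TB")
print(f"DIY servers : ${diy_total:,} total, ${diy_total / usable_tb:,.0f} per usable TB")
print(f"Entry array : ${array_total:,} total, ${array_total / usable_tb:,.0f} per usable TB")
```

The exact answer depends entirely on the numbers you plug in -- which is the point: once you add the redundancy and protection back, the gap gets a lot smaller than the pitch suggests.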
The other problem is scale. With most server designs, there are only so many drives that can go behind a motherboard. Fine for entry and mid-level, but if you're talking hundreds -- or thousands -- of drives, you'll end up wasting a lot of sheet metal, power supplies, RAM, processors, etc.
Fine, don't believe me -- just work the numbers for a truly large configuration, and you'll notice the effect.
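Here's a rough sketch of that exercise, with purely hypothetical round numbers for drives-per-chassis and per-server overhead, just to show how the non-drive hardware piles up at scale.

```python
import math

# All figures are illustrative round numbers, not real configurations.
DRIVES_NEEDED = 1000       # a "truly large" configuration
DRIVES_PER_SERVER = 16     # roughly how many drives fit behind one motherboard
SERVER_OVERHEAD = 2500     # motherboard, CPUs, RAM, power supplies, sheet metal
DRIVES_PER_SHELF = 60      # a dense, purpose-built drive shelf

servers = math.ceil(DRIVES_NEEDED / DRIVES_PER_SERVER)
shelves = math.ceil(DRIVES_NEEDED / DRIVES_PER_SHELF)
wasted = servers * SERVER_OVERHEAD

print(f"{DRIVES_NEEDED} drives the server way : {servers} chassis, "
      f"~${wasted:,} of non-drive hardware")
print(f"{DRIVES_NEEDED} drives in dense shelves: {shelves} shelves behind "
      f"a handful of controllers")
```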
Next, let's look at software costs. As an example, EMC, NetApp, HDS, IBM et al. charge for their software, either as a specific line item or as part of the hardware cost, since the hardware isn't really usable without the software.
People like DataCore and LeftHand charge explicitly for their software, since there's no hardware in their primary models.
And Sun has decided to give the software away. OK, I'll grant you that's pretty cheap. In the world of open-source storage software, clearly software costs are much lower, aren't they?
Now, let's talk about support costs. I'm talking about things like qualification and interoperability testing (think EMC's eLab), performance and use-case testing -- all of which would ideally be done before the customer gets involved. Not usually the case for open source software, is it?
And, let's not forget, we have to be very clear about customer support if and when you've got a problem. For traditional storage (e.g. EMC, HDS, IBM, NetApp, et al.) the support model is pretty clear -- sure, we could argue whose support is better, but at least the model is pretty consistent.
The software-only vendors can control what goes on with their software, but don't have as much control with the hardware they're running on. Despite everyone's best intentions, there's an opportunity for a bit of vendor crossfire between the server vendors and the storage software vendor. I see that as a "cost" that has to be accounted for.
And the open source model? I'm not sure who you're gonna call when you're having a bad storage day, or how responsive they'll be once you get someone on the phone. Sure, there are a few organizations who have very strong technical bench strength on these topics, and are willing to invest some of their cycles making stuff work, or fixing it when it's broken.
But, from where I sit, that's more the exception than the rule.
Finally, let's talk about avoiding "vendor lock-in." I really, really struggle with this concept, I do. Here's why:
Anything I use and get comfortable with -- well, I'm "locked in" to a certain degree. If I use a lot of storage software X, well, I'm sorta locked in, aren't I? Or, if I put my servers-as-storage on a three-year lease, I'm kind of locked in, aren't I?
All storage solutions support relatively standard interfaces and protocols. Unless you use certain advanced features, it's pretty easy (data migration aside!) to move from one to the other. And, in the storage array business, customers swap vendors all the time if they feel the need.
Now, imagine I write some custom scripting or interfaces against something like ZFS -- well, I'm locked in to a certain degree, aren't I?
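To illustrate, here's a tiny, hypothetical "replicate last night's snapshot" helper. The pool, dataset and target host names are all invented; the point is simply that once your operational glue speaks ZFS snapshots and send/receive streams, moving to another platform means rewriting all of it.

```python
import subprocess
from datetime import date

DATASET = "tank/projects"       # hypothetical ZFS dataset
TARGET_HOST = "backup-host"     # hypothetical replication target

def replicate_today():
    """Snapshot a ZFS dataset and ship it to another ZFS box."""
    snap = f"{DATASET}@{date.today():%Y%m%d}"
    subprocess.run(["zfs", "snapshot", snap], check=True)

    # Pipe a ZFS send stream into zfs receive on the remote host. Every bit
    # of this logic assumes ZFS semantics -- snapshots, send streams, dataset
    # names -- and would have to be rewritten for any other platform.
    send = subprocess.Popen(["zfs", "send", snap], stdout=subprocess.PIPE)
    subprocess.run(
        ["ssh", TARGET_HOST, "zfs", "receive", "-F", "tank/projects-copy"],
        stdin=send.stdout, check=True)
    send.stdout.close()
    send.wait()

if __name__ == "__main__":
    replicate_today()
```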
It just strikes me as posturing with little -- if any -- basis in reality.
A Personal Example
I had been casting around for a home storage sharing device for a few years. I fooled around a bit with various Linux combos, and even did the Microsoft thing. Way too fiddly for me, and I was spending more time making things work than enjoying what the platform could do.
I mentioned earlier that I got one of those LifeLine-based Intel devices, plugged it in, and got on to actually using the darn thing, rather than tinkering around.
Did I end up spending more than if I got very creative with eBay and SourceForge? Of course I did.
But I had better things to do in my spare time.
So, What Does This Mean For The Storage Industry?
We're seeing a new category forming rapidly: do-it-yourself storage. Nothing wrong with that. And I can imagine a few situations where that'd be very interesting to someone.
But, at the same time, I don't think it's going to change much out there. True costs are true costs -- hardware, software and support -- no matter how you re-arrange the buckets. Take something out of one bucket, and it often ends up in another. Or you end up worse off than where you started.
So, Why Am I Cringing?
Because we're probably going to hear even more nonsensical blather about this stuff in the coming months.
This particular topic is perfect for people trying to crash the party with a "new idea" (smaller vendors, certain industry pundits and curmudgeons, and so on) that really isn't a new idea at all.
If you think about it, people have been using servers as storage for more than a decade -- think Windows CIFS and NFS as simple examples. So there's nothing really new here -- customers have always had the option to press their servers into duty as shared storage devices.
Maybe they didn't like the performance, or the functionality, or the cost model, or the support model, so -- by and large -- they've been moving away from this approach for quite a while.
Which is why -- by and large -- storage arrays are so popular. They're built and designed to do a specific job, and do it well.
And I don't see this changing anytime soon.