
April 29, 2008

Do-It-Yourself Storage

I've always been amazed at the different ways you can slice the storage market: by access method (DAS, SAN, NAS, CAS, etc.), by architecture (single controller, dual controller, multi-controller, RAIN, scale-out, clusters, etc.), even by consumption model (e.g. traditional vs. storage-as-a-service).

To this growing list of taxonomies, I think we're going to have to add another: pre-integrated storage vs. do-it-yourself.

And, strangely, I think that there will be certain places where this is going to be popular.  But most organizations will probably never consider it seriously.

Here's why ...

So, What Prompted This?

A recent announcement by Sun around "open source storage" (???), basically offering people their open-sourced ZFS running on a Sun server, presumably a Thumper.

Now, I'm not going to comment as to the true genealogy of ZFS (that's one for the lawyers), or whether Sun can be successful in this sort of business model (I'm dubious).

What I find interesting is the growing number of offerings in this relatively new category of "just add your favorite server hardware" storage: DataCore, LeftHand and probably a bunch of others I forgot.

So, what's going on here?

The Siren Song Plays Again

There seems to be a basic pitch to this newer category that I see repeated:

-  Get the storage functionality you want (e.g. NAS, SAN, replication, etc.) via software
-  Use any server hardware of your choosing
-  Save money
-  Avoid "vendor lock in"

Storage functionality?  Sure, you can get a decent subset of the best that the array vendors offer via a software-only model -- I'll grant you that.  And, if that subset is what you'll need for the foreseeable future, check this box and move on.

Any server hardware?  Yes, sort of.  Most of the vendors support qualified configurations (rather than just anything) in an effort to provide a better customer experience.  Sun specifically is steering people toward their own server hardware (duh), but I guess -- theoretically -- you have a lot of options here.

Save Money?

Here's where I turn a bit skeptical.

First, let's look at hardware costs.  In our business, parts are parts.  Disk drives cost pretty much the same for everyone, as do the processors and RAM we all use, and so on.  Server vendors and storage vendors largely draw from the exact same parts bins, so -- based on everything I've ever seen -- there's no real cost advantage for servers in like-for-like.

One problem is getting like-for-like.  As an example, if you use server-based technology to implement dual redundant controllers, RAID 5 protection and the like, you'll end up with a server that -- well -- starts costing the same as (or usually more than!) a low-end storage array.

OK, maybe you don't need all that protection and redundancy with your storage.  Fine.  But, before you get too excited about this approach, do a clear-eyed cost comparison around usable, protected storage, and you may be surprised as to what's the low-cost option.
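One rough way to run that comparison is to divide the total price by the capacity left over after spares and RAID parity are taken out.  The sketch below is only an illustration: the function names, prices and drive layouts are made-up placeholders, not figures from any vendor, so plug in real quotes before drawing conclusions.

```python
def usable_tb(drives, drive_tb, raid_group_size, parity_per_group, hot_spares=0):
    """Usable capacity after removing hot spares and per-group parity drives.

    Drives that do not fill a complete RAID group are ignored.
    """
    data_drives = drives - hot_spares
    groups = data_drives // raid_group_size
    return groups * (raid_group_size - parity_per_group) * drive_tb


def cost_per_usable_tb(total_price, **layout):
    """Total price (hardware + software + support) per usable, protected TB."""
    capacity = usable_tb(**layout)
    if capacity <= 0:
        raise ValueError("layout yields no usable capacity")
    return total_price / capacity


if __name__ == "__main__":
    # Hypothetical configurations, for illustration only.
    server_based = cost_per_usable_tb(
        30000, drives=24, drive_tb=1.0, raid_group_size=6, parity_per_group=1
    )
    small_array = cost_per_usable_tb(
        28000, drives=16, drive_tb=1.0, raid_group_size=8, parity_per_group=1
    )
    print(f"server-based: ${server_based:,.0f} per usable TB")
    print(f"small array:  ${small_array:,.0f} per usable TB")
```

The point of the exercise is the denominator: raw terabytes make almost anything look cheap, so compare only what is left once protection is in place.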

The other problem is scale.  With most server designs, there are only so many drives that can go behind a motherboard.  Fine for entry and mid-level, but if you're talking hundreds -- or thousands -- of drives, you'll end up wasting a lot of sheet metal, power supplies, RAM, processors, etc. 

Fine, don't believe me, just work the numbers for a truly large configuration, and you'll notice the effect.
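Working the numbers can be as simple as counting enclosures.  Here is a minimal sketch, assuming each enclosure carries a fixed non-drive overhead (sheet metal, power supplies, processors, RAM); the drives-per-enclosure figures and dollar amounts are hypothetical placeholders rather than real product data.

```python
def enclosures_needed(total_drives, drives_per_enclosure):
    """Number of enclosures required to hold total_drives (rounded up)."""
    if drives_per_enclosure <= 0:
        raise ValueError("drives_per_enclosure must be positive")
    return -(-total_drives // drives_per_enclosure)  # integer ceiling division


def overhead_cost(total_drives, drives_per_enclosure, overhead_per_enclosure):
    """Cost of everything that is not a disk drive, summed over all enclosures."""
    return enclosures_needed(total_drives, drives_per_enclosure) * overhead_per_enclosure


if __name__ == "__main__":
    drives = 1000
    # Hypothetical: a full server (CPU, RAM, motherboard) in front of every 12 drives.
    print("servers as storage:", overhead_cost(drives, 12, 3000))
    # Hypothetical: simple expansion shelves of 15 drives behind shared controllers.
    print("expansion shelves: ", overhead_cost(drives, 15, 1000))
```

Whatever real numbers you substitute, the thing to watch is how the non-drive overhead grows as the drive count climbs into the hundreds or thousands.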

Next, let's look at software costs.  As an example, EMC, NetApp, HDS, IBM et al. charge for their software, either as a specific line item, or as part of the hardware cost, as the hardware isn't really usable without the software.

People like DataCore and LeftHand charge explicitly for their software, since there's no hardware in their primary models. 

And Sun has decided to give the software away.  OK, I'll grant you that's pretty cheap.  In the world of open-source storage software, clearly software costs are much lower, aren't they?

Now, let's talk about support costs.  I'm talking things like qualifications and interoperability testing (think EMC's eLab), performance and use case testing -- all of which would ideally be done before the customer gets involved.   Not usually the case for open source software, is it?

And, let's not forget, we have to be very clear about customer support if and when you've got a problem.  For traditional storage (e.g. EMC, HDS, IBM, NetApp, et al.) the support model is pretty clear -- sure, we could argue whose support is better, but at least the model is pretty consistent.

The software-only vendors can control what goes on with their software, but don't have as much control with the hardware they're running on.  Despite everyone's best intentions, there's an opportunity for a bit of vendor crossfire between the server vendors and the storage software vendor.  I see that as a "cost" that has to be accounted for.

And the open source model?  I'm not sure who you're gonna call when you're having a bad storage day, or how responsive they'll be once you get someone on the phone.  Sure, there are a few organizations who have very strong technical bench strength on these topics, and are willing to invest some of their cycles making stuff work, or fixing it when it's broken.

But, from where I sit, that's more the exception than the rule.

Vendor Lock-In?

I really, really struggle with this concept, I do.  Here's why:

Anything I use and get comfortable with -- well, I'm "locked in" to a certain degree.  If I use a lot of storage software X; well, I'm sorta locked in, aren't I?  Or, if I put my servers-as-storage on a three-year lease, I'm kind of locked in, aren't I?

All storage solutions support relatively standard interfaces and protocols.  Unless you use certain advanced features, it's pretty easy (data migration aside!) to move from one to the other.  And, in the storage array business, customers swap vendors all the time if they feel the need.

Now, imagine I write some custom scripting or interfaces into something like ZFS.  Well -- I'm locked in to a certain degree, aren't I?

It just strikes me as posturing with little -- if any -- basis in reality.

A Personal Example

I had been casting around for a home storage sharing device for a few years.  I fooled around a bit with various Linux combos, even did the Microsoft thing.  Way too fiddly for me, and I was spending more time making things work rather than enjoying what the platform could do.

I mentioned earlier that I got one of those LifeLine-based Intel devices, plugged it in, and got on to actually using the darn thing, rather than tinkering around. 

Did I end up spending more than if I got very creative with eBay and SourceForge?  Of course I did. 

But I had better things to do in my spare time.

So, What Does This Mean For The Storage Industry?

We're seeing a new category forming rapidly: do-it-yourself storage.  Nothing wrong with that.  And I can imagine a few situations where that'd be very interesting to someone.

But, at the same time, I don't think it's going to change much out there.  True costs are true costs -- hardware, software and support -- no matter how you re-arrange the buckets.  Take something out of one bucket, it often ends up in another bucket.  Or you end up worse off than where you started.

So, Why Am I Cringing?

Because we're probably going to hear even more nonsensical blather about this stuff in the coming months. 

This particular topic is perfect for people trying to crash the party with a "new idea" (smaller vendors, certain industry pundits and curmudgeons, and so on) that really isn't a new idea at all.

If you think about it, people have been using servers as storage for more than a decade -- think Windows CIFS and NFS as simple examples.  So there's nothing really new here -- customers have always had the option to press their servers into duty as shared storage devices.

Maybe they didn't like the performance, or the functionality, or the cost model, or the support model, so -- by and large -- they've been moving away from this approach for quite a while.

Which is why -- by and large -- storage arrays are so popular.  They're built and designed to do a specific job, and do it well.

And I don't see this changing anytime soon.


Comments

I think you are missing the point. I actually did a price comparison. Building your storage solution (in my case several hundred TBs) from cheap disks using x86 servers + ZFS + Open Solaris + Solaris Cluster, where all software is not only open sourced but also free, *does* make a huge difference. We actually started building our solution on EMC Symmetrix (great box) and EMC Celerra years ago and ended up on really cheap storage + ZFS as a replacement and a way to move forward. Additionally all features like snapshots, cloning, end-to-end checksumming, remote replication, built-in compression, built-in cryptography, NFS, CIFS, iSCSI, ... are also free. Better - they work exactly the same regardless of what cheap storage or server we put underneath.

What ZFS brings to the market is the open sourced and free Google like approach to storage - how to cheaply build reliable storage from small to large scale installations.

Sure, especially for the SMB market, what is needed is an easy GUI built on top of Solaris + ZFS. I'm sure you will see one sooner or later.

Thanks for the perspective?

Questions:

Are you configured for HA (e.g. redundant paths to data, RAID, multiple controllers), or are you just looking for capacity?

What do you do for support? Do you call your vendor when you've got an issue, or do you do most of the support and problem resolution yourself?

Finally -- and most important -- what are you using the storage for? Are we talking file sharing, or supporting an OLTP environment?

Any additional insights would be appreciated -- thanks!

Good post, Chuck; as usual you put a lot out there to discuss.

To me, the fact that this is Sun and not HP, IBM, Microsoft or anybody else makes this interesting. I sort of doubt they know where this is going, but they seem determined to let it play out. This seems to fit their culture and skills - although the business is questionable.

My hunch is that the support end of this will choke Sun's zeal as implementations increase. Support could come from other vendors who implement the technology in their products, but only if the licensing can work.

Chuck - Here's the simple point to Open Storage that Robert is making above: Free software + Market priced disk = Huge savings

Then add Sun Service (which services the same types of customers as EMC) PLUS 3,000 storage community members for support.

Even more, developers and Web 2.0 companies now have an entire storage software stack to use - they don't have to build their own. They can focus on developing their own software on top of a solid platform.

Wow! Your exuberance is almost contagious ...

Look, I'm not arguing that a do-it-yourself strategy can't save money.

It's true for many things: selling your house, getting legal work done, perhaps a bit of minor surgery.

Especially if you're not too concerned about the outcome, or happen to have some unique expertise in what you're considering.

My argument is that I don't see too many sane people actually doing it. And, yes, there will be certain use cases where people will look at this and say "makes sense for me".

I wish them the very best!

One minor note: you're claiming that Sun will offer the same level of mission critical support for consumers of this open source stack as those who purchase, say, your HDS-rebadged array?

I find that very, very hard to believe.

Serious support skills cost real money. The idea that Sun would generate sufficient income from open source software is an economic perpetual motion machine. But that won't keep Sun and others from trying.
http://en.wikipedia.org/wiki/History_of_perpetual_motion_machines

Amazon and Google wouldn't have a business if they didn't do storage themselves - also talk to any Web 2.0 start up trying to differentiate their business through IT...

Well, certainly Google has a business (they sell advertising), but it's not clear if Amazon's S3 really is a business, or a charity.

As far as Web 2.0 startups -- interesting segment, to be sure -- but not exactly heavy investors in serious IT infrastructure, are they?

I think customers will decide -- one way or another -- how they're going to approach IT.

But -- please -- explain to me -- how does Sun make money by giving away software and selling commoditized servers?

Maybe they make it up on volume?

The Sun open source storage strategy is discussed here: http://www.blocksandfiles.com/article/4975

My view is that it is not a disruptive storage technology but it might become one.
Chris.

ha ha ha - make it up on volume!

Chuck - it depends on the solution I deployed. Quite often I use multi-pathing with MPxIO, which is delivered for free with Solaris (and yes, it even works with EMC storage); sometimes we are talking about Sun's Thumpers (aka x4500).

When it comes to support - HW support is easy, I just buy support from the vendor. Software support - well, you buy support from Sun for Solaris, which covers ZFS, MPxIO, etc. and is relatively cheap.
At the same time, yes, you do need specific skills and to be able to support it yourself to some degree, especially with bigger environments. But that is also true if you go with the traditional approach (I remember how much time it took me to come up with a viable backup approach for Symmetrix + Celerra with a lot of small files - other than replicating to another pair of Symmetrix/Celerra - I ended up with an in-house approach, unsupported by EMC, which actually worked well).

Now, you're asking about OLTP - of course, I've deployed many MySQL + Solaris x86 + cheap JBODs or arrays + ZFS setups - except for hardware, everything else is free and works really well.

I'm not saying that approach is the best one in all cases - of course it is not. But in many cases it is, and it is a big money saver.

Well, last time I checked there was money in helping customers build better, more affordable storage systems - at least that's what Sun is betting on ;-)

And I wouldn't be so quick to discredit Web 2.0 companies. Today's upstarts can be tomorrow's key players - a recent Forrester survey also found 1/3 of traditional companies are deploying Web 2.0 applications (in fact, we're using one).

And if you think they are not heavy IT investors, well I guess that's the whole point of open storage - they can't afford to deploy traditional storage architectures, so they need to leverage better storage economics...

Come to think of it, maybe that's a good message for traditional customers as well?

Hope springs eternal at Sun, doesn't it? I guess having an optimistic outlook really helps.

But the people I'm talking to have some serious unanswered questions around all of this, which you've declined to answer:

1 -- how is Sun's offering different than other open source storage offerings?

2 -- is there a fully delineated model regarding how support responsibilities differ in an open source approach than a traditional approach? Do I get enterprise class storage support, or I get to replace failed drives using FedEx, or do I throw myself on the mercy of the "community" when I'm up a creek without a paddle? It's not clear to anyone I talk to.

3 -- What happens when your "open source" software becomes the target of an IP suit? More specifically, where can that leave customers? And, Taylor, as you know, this isn't a hypothetical situation these days.

Best of luck -- and all credit -- for trying something new over at Sun. If nothing else, you guys are very creative.

Now you just have to be successful!

Hi Chuck,
Very Well Explained. Creativity Vs Successful.
EMC, NetApp, IBM systems with support issues, they got guys to fix it onsite.

How will the Sun support engineer troubleshoot open storages?
Sorry Mr.Customer ... DO-IT-YOURSELF

In fact, Sun did not do anything new. Even DAS (direct-attached storage) is do-it-yourself.
Also remember SAMBA - Open NAS.

Sorry Sun. Innovate something spectacular!

Chuck - great questions.

Our open source offerings are different in a couple ways:

1. Our scope - we have opened the entire stack, from drivers to higher-level apps like snapshots and mirroring. We offer high level apps like SAM (HSM) and honeycomb (Object archive), as well as COMSTAR unified target code which turns a server into a block storage device.

2. Our quality and functionality - read the testimonials from DigiTar and Nexenta about Solaris and ZFS as a storage platform.

3. The entire package - unique Servers, Storage, Service, Solaris, ZFS and OpenSolaris.

Open source doesn't always require new support models as well. EMC's Centera ships with open source Linux on every node - you're not leaving them up the creek without a paddle, right?

Also don't discount support from a community - here is what DigiTar (a Linux shop btw) said about Open Storage in their blog - "you’ll also find an community around OpenSolaris that is by far the friendliest and most mature open source group of folks you’ve ever dealt with."

SunSpectrum customers can find support for OpenSolaris and customers can buy RTUs of open source projects if they desire traditional support - the market has worked out service for open systems and open source. It's a long subject - I'll blog more on this later, but this is a great topic and a great place to differentiate as well.

I'm glad you asked about IP suits as well. Sun indemnifies customers who use Sun open source. This is unique. Keep in mind that IP suits hit traditional software as well. What happens to EMC customers when EMC is the target of an IP lawsuit? Do you indemnify them?

Clarification: My apologies, Sun offers indemnification for the commercial version of our open source platform, i.e. Solaris. See Jonathan's blog mentioning it first here: http://blogs.sun.com/jonathan/entry/an_open_letter_to_sam1

"From Federal Express to Verisign, SAP and Oracle to Siebel, Veritas and BEA - from across the globe and marketplace - there is tremendous demand and support. They love that we're open sourcing Solaris, and that we'll be the first open source vendor to offer a commercial version of our product with indemnification against intellectual property lawsuits."

Hi Taylor

There comes a point when someone goes from simply disagreeing, to being disagreeable. I think you're crossing that line here.

Of course, EMC indemnifies customers regarding IP issues. And, of course, EMC provides 24x7 enterprise-class support, regardless of how the product is constructed.

I think Job #1 at Sun is becoming profitable again, especially with regards to storage.

Since the key distinction that you're drawing here is essentially an economic one (that is, you don't seem to be making a technical argument against ZFS), what makes you think that Sun can't take these components and integrate and support them? And even if you discount Sun (for lots of good reasons, I might add), what is to stop a start-up from doing it?

Have to agree with Chuck here -- I can't see how you are going to achieve 5 or even 3 nines reliability unless you perform full DVT testing (and bug fixing) against a revision controlled HW/SW platform. This means locking down the HW rev of your logic board, all instances of microcode such as what runs inside the Fibre HBA chips as well as BIOS, every device driver, and even revisions of disk drives and their firmware. You are part of the way there if you select an off-the-shelf server from a tier-1 player, but even they are pretty lax when it comes to testing hard-core Fibre Channel applications. My company operates in both worlds -- we make RAID hardware but we also have products based on commodity servers. From my experience, it's very challenging from an operational standpoint to acquire suitable commodity servers and keep the vendors from changing anything, but luckily we have enough experience to know the right questions to ask and have the right tools to perform enough testing. To give a sense of scale to this issue, more than half the commodity components we test fail to achieve our standards of reliability, and we only test stuff from Tier-1 suppliers.

Excellent point, Bryan -- there is absolutely nothing preventing anyone (including Sun) from assembling the components, testing and delivering them in a highly supported configuration for customers who might want a more traditional approach to how they consume storage.

Ditto for someone doing the same with Windows, or Linux, or any other commodity-software-stack-meets-commodity-server-hardware combination.

I do happen to think ZFS is cool. And I do think there's a potential of a business model to sell it and deliver it in a more traditional way, e.g. as a platform product rather than a set of do-it-yourself components.

I don't know if Sun's up for that, though ...

Thanks for writing!

Hi Gary -- philosophically, I agree with you.

In general though, the constraints you mention are somewhat less demanding for NAS environments (TCP/IP can be a very patient protocol, unlike FC), but the point is still valid.

In my mind, the only potential market is people who (a) are looking for 1 or 2 "nines", and can't justify higher availability, and (b) are willing to do a bit of roll-your-own before, during and after the deployment.

I haven't met a lot of people like that -- have you?

Folks,

This is the same 'ol argument that we have over open source time and again now extended to storage. It boils down to risk. How many people are willing to risk running their mission critical ERP software on Linux, for example? Some, yes, but not many. What most folks do is run their mission critical apps on AIX, HPUX, Windows, etc. where they can get support. Where Linux is making inroads is on other applications that people are willing to take a small risk on in order to save money.

What I question in nearly every case is what are the real savings with open source? Sure, you don't have to pay any license fees for your software, but are you ending up spending more on internal support costs? I've done the math on a couple of occasions, and when you consider the 3 year TCO of open source, you rarely end up coming out ahead.

So I suspect that open source storage will end up the same way. Those who want to save every nickel they can on the front end will go open source and pay the ongoing internal support costs; those who are risk averse will pay for the relative "safety" of going with a major storage vendor and getting a partner who will help them through the tough times. I think that what we will see is open source storage taking the same slow path into the data center that Linux did and in the end having about the same share as Linux, or maybe a little less.

So here's an interesting question: if it's all about cheap storage, what about products like EMC's Hulk, which I believe is nothing more than commodity components bundled together and sold dirt cheap? Does something like that address those among us who are more concerned with upfront costs?

--joerg


Chuck Hollis


  • Chuck Hollis has been with EMC for 12 years, and is Vice President of Technology Alliances at EMC. He frequently speaks to customer audiences about a variety of technology topics, and can usually be counted on for an interesting point of view. He lives in Holliston, MA with his wife, three kids and two dogs when he's not travelling. Chuck enjoys piano, mountain biking and skiing -- in that order.
