Some of you blog readers enjoy a good dust-up between industry bloggers. Well, so do I.
When Oracle announced they were getting into the storage biz with their Exadata storage server, I wrote a post expressing my skepticism. So did many others.
And now, as a result, it looks like Kevin and I might be exchanging words in the near future :-)
What's This All About?
Well, it's about Oracle getting into the storage business, for one thing. And perhaps a continual problem they've been having with DW performance: losing market share to more specialized players.
Rather than focusing solely on software, they've gone the route of offering a pre-configured behemoth from HP running (according to Kevin) a customized version of Oracle you can't run anywhere else, making like-for-like comparisons somewhat difficult.
In his last post, it's pretty obvious -- based on his condescending and somewhat nasty tone -- that I got under his skin a bit. I've usually found that the vitriol in a response is in direct correlation to the sensitivity of the issue I've raised.
So it must have been a very sensitive issue indeed.
Rather than back off, I now (somewhat strangely) feel compelled to bore in a bit, and explore this in more depth.
Come, join me on an interesting journey here ...
So, Let's Dig In
My first general argument was -- what does this particular hardware bring to the picture, other than a cosy marketing relationship?
From a purely hardware perspective, we've got "storage nodes" (ostensibly generic HP x64 server kit with a $449 P400 RAID controller) running at a ratio of 12 SAS disks to 1 server. The disks are mirrored, with no support for any space-saving RAID options -- strange, for such a large machine.
The "storage nodes" are interconnected to "database nodes" via Infiniband, and I questioned (based on our work in this environment) whether this was actually a bottleneck being addressed, or whether it was a bit of marketing flash in a world where multiple 1Gb ethernet ports seem to do as well.
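For anyone who wants to check the interconnect question with their own numbers, here is a minimal sketch of the arithmetic. The only figure taken from this post is the 12 disks per storage node; the per-disk throughput and link speeds are placeholders you would have to measure or look up yourself, and the function names are made up for illustration. It ignores protocol overhead, so treat the result as a rough upper bound rather than a benchmark.

```python
# Back-of-the-envelope check: can the disks in one storage node
# push more data than the node's interconnect can carry?
# All throughput values passed in are ASSUMPTIONS supplied by the caller.

def link_mb_per_s(gbit_per_s: float, ports: int = 1) -> float:
    """Raw link capacity in MB/s (decimal units, no protocol overhead)."""
    return gbit_per_s * 1000 / 8 * ports

def limiting_factor(disks: int, mb_per_s_per_disk: float, link_mb: float) -> str:
    """Return which side saturates first for a purely sequential scan."""
    disk_mb = disks * mb_per_s_per_disk
    return "interconnect" if link_mb < disk_mb else "disks"

DISKS_PER_NODE = 12  # from the post

# Hypothetical example: four bonded 1Gb ethernet ports.
four_gbe = link_mb_per_s(1, ports=4)
print(four_gbe)
print(limiting_factor(DISKS_PER_NODE, 30, four_gbe))  # 30 MB/s per disk is a placeholder
```

Plug in a measured per-disk scan rate and the real link speed, and the answer tells you which component to argue about.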
Kevin seemed to take issue with my characterization of the storage subsystem as JBOD, and not "smart". He's right about that: technically speaking, that would make it a DAS (direct attached storage) configuration, as opposed to SAN, NAS or other topologies.
However, I don't think too many storage people would look at a $449 SAS RAID controller with 512MB of RAM and an "optional battery backup unit for cache" as excessively "smart".
Kevin seemed to agree (it's not clear, though) that the Infiniband didn't bring much to the party. We'll leave that one open for now, pending further clarification by Kevin.
And, with regard to disk hardware, he didn't try to justify the RAID 1 on either performance or availability grounds (debatable, though), but did seem to state that I had a certain lack of imagination as to what might be possible in the future.
Being a storage guy, I know that the real issue isn't the disk, it's being able to get to your data if one of the "storage nodes" fails. And since this type of architecture doesn't know how to share storage, you're forced to put all of your data in two places, in case one node fails. It's unlikely that we'll see something more space-efficient in the foreseeable future.
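To put a rough number on "space efficient", here is a small sketch comparing the usable fraction of raw capacity under two-way mirroring with a single- or dual-parity layout across the same 12 disks. The parity group sizes are illustrative assumptions, not anything Oracle or HP has described, and the helper names are hypothetical.

```python
# Usable capacity as a fraction of raw capacity.
# Mirroring keeps N full copies; parity RAID gives up a fixed
# number of disks per group. Group sizes below are ASSUMPTIONS.

def usable_fraction_mirror(copies: int = 2) -> float:
    """Fraction of raw capacity left when every block is stored `copies` times."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return 1.0 / copies

def usable_fraction_parity(disks_per_group: int, parity_disks: int = 1) -> float:
    """Fraction of raw capacity left in a parity group (RAID 5/6 style)."""
    if disks_per_group <= parity_disks:
        raise ValueError("group must be larger than its parity count")
    return (disks_per_group - parity_disks) / disks_per_group

DISKS_PER_NODE = 12  # from the post

mirror = usable_fraction_mirror(2)                              # two places for all data
single_parity = usable_fraction_parity(DISKS_PER_NODE, 1)       # hypothetical 11+1 group
dual_parity = usable_fraction_parity(DISKS_PER_NODE, 2)         # hypothetical 10+2 group

for name, frac in [("mirror", mirror),
                   ("single parity", single_parity),
                   ("dual parity", dual_parity)]:
    print(f"{name}: {frac:.0%} of raw capacity usable")
```

Note that parity within a node only protects against disk failure; it does nothing for the node-failure case above, which is exactly why an architecture that can't share storage ends up keeping two full copies.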
And Then There's The Open Software Question
Not "open" in the sense of open source code et al., but open in the sense of "I can run this software on any reasonable choice of server and storage".
Kevin is pretty clear that this particular version of Oracle is available in one place and one place only -- the hardware that Oracle sells.
Now, we could debate the pros and cons of this (as I'm sure will be debated in the future), but it's a clear departure from Oracle's past "runs on anything" strategy.
And you just have to ask yourself the question -- why is this?
I'm guessing that all Oracle sees is a version of Linux. Probably not Oracle's version, since I understand that HP has its own versions that it prefers, but I could be incorrect on this. It doesn't see the "smart" RAID controller. It doesn't see the Infiniband, that's abstracted as well.
So it appears to be a "business choice", rather than a technical requirement.
From a purely customer perspective, it makes it hard to see how much value comes from the hardware, and how much comes from the software. We'll never see side-by-side comparisons of this particular software running on potentially faster/cheaper/better servers and storage, will we?
Alternative Approaches To Scale-Out
Scaling out a DW environment horizontally is nothing really new, not even for Oracle environments. Indeed, EMC and Oracle (along with Dell) have done scale-outs with moderate-sized arrays (not big honkin' ones as Kevin suggests), moderate processors and standard-grade 1Gb ethernet connections.
We get pretty good cost-effective DW performance this way, not only with Oracle, but with DATAllegro, Vertica, SQL Server, UDB and a bunch of others. And, taking this approach, there are great answers for things like backup, business continuity, security, storage management and every other joy that comes along with having dozens of terabytes of important data in a DW/BI environment.
From what we can tell about Oracle's standard pricing (exclusive of the steep discounts they're currently offering to get people to try this stuff), it looks like a very, very expensive solution by comparison. [Warning: the power that these machines consume is not free ... if I get a moment, I'll get someone to run a power usage comparison. I'm guessing it'll be eye-opening, given what they're doing on the hardware architecture]
If it ran faster than other alternatives, at least we'd have a basis for comparison.
But we're not going to get that anytime soon, are we?
And, An Apology, Sort Of
Throughout his post, Kevin takes me to task for not researching white papers, his previous posts, etc., and thus coming to some incorrect conclusions, particularly in regard to the nature of the software that Oracle is promoting as part of this bundle.
Sorry, Kevin, I could have done a bit more homework in this regard -- thanks for clarifying.
This Should Be Interesting
The real focus here should be software, not hardware.
As an example, HP appears to be selling the same kit with other environments, including their own. And we see plenty of aggressive startups delivering stellar performance and functionality while still, somehow, allowing customers to separate the hardware and software choice.
Well, if Oracle wants into the hardware business, they're going to have to earn it.
And I'd suggest the best way to do this would be to have the hardware stand on its own two feet.
Courteous comments welcome, as always!
Hello Chuck,
Is EMC planning to incorporate some of DWH processing on the Clariion itself, ie something closer to a DWH appliance/node?
regards
sudhir.brahma@gmail.com
Posted by: Sudhir Brahma | October 31, 2008 at 11:46 AM
Sudhir, you ask very good questions!
The answer is that we've looked at this on and off for many years at EMC. So far, we haven't been convinced that this would be something that people would want.
One aspect is the observation that the FC channel isn't the bottleneck in DW. We can, with a bit of forethought, feed more data into a DW server than it can reasonably process. Moving some of that logic from one side of the FC channel to the other doesn't buy any performance or cost advantage in most cases.
Another aspect is the propensity for DW to consume large amounts of both CPU and memory. So, if you built a storage array optimized for the task, you'd end up with some very interesting-looking storage arrays that had big-server specs rather than storage specs.
Since you seem to follow this blog, you're probably aware that I and many others are incredibly skeptical of Oracle's move into this space with their HP-built Exadata server.
There's no evidence that it's any faster than a traditional approach (e.g. servers and SAN), and it's obviously more expensive, with questionable manageability, and so on.
Now, that's the story so far. But, as you know, things have a way of changing in this industry over time, so never say never!
-- Chuck
Posted by: Chuck Hollis | October 31, 2008 at 04:55 PM
Sudhir/Chuck,
PADB (from ParAccel) already has a solution on Clariion for extreme performance. PADB is an MPP database built for DW and analytics. www.paraccel.com
Good luck
Posted by: DW Appliance | November 07, 2008 at 08:14 AM
Just wanted to give the reference to the Clariion appliance. It's called SAA
http://www.paraccel.com/pages/solutions/scalableanalyticappliance.php
BTW, Chuck makes excellent points about Oracle Exadata. At the end of the day it's nothing but a RAC. Even though the storage sends less data to the hosts, the SMP hosts are still a bottleneck. It's not an MPP shared-nothing architecture either. All the redo logs etc. are still there and shared. Indexes still need to be tuned. So, overall it has nothing different to offer than traditional Oracle.
Also, what is a 10X improvement claim over existing Oracle systems? Nothing to boast about when companies like ParAccel with no indexes and no tuning can be extremely fast. PADB is a simple load and go environment, no tuning and no indexes. Check it out!!
Posted by: Anonymous | November 07, 2008 at 09:44 AM
So while Oracle turns to appliance marketing (my metaphor here is "You want a steering wheel with that car?"), Teradata is now running on Linux, although I don't think they're marketing it heavily. Teradata does use the steering wheel approach for backups and now a new appliance for database management. The latter is really bad, as DBMS management is obviously software only. Why do I need your PC to run it on??
This, however, is minuscule compared to requiring hardware for the DBMS. Why did Teradata abandon that idea? Is it inevitable that Oracle will as well and if so, after how long?
Posted by: Kevin | December 16, 2008 at 12:43 PM
I agree, Kevin
I think the real challenge for Teradata and similar is the business model: they've built their business on selling, supporting and controlling all aspects of the solution, and the premise of unbundling their value-add scares the cr*p out of them.
But Oracle is different -- I think it is a toxic mixture of a well-known problem (e.g. Oracle performance in DW environments), new and nimble competitors (insert long list here) along with a healthy dose of typical Oracle behaviors.
Time will tell. So far, not exactly a big uptake in the market :-)
Posted by: Chuck Hollis | December 16, 2008 at 03:43 PM
"So far, not exactly a big uptake in the market"
Where do you get the sales figures for Exadata? Just interested!
I have a feeling that Exadata is not going to sell big time, but am always willing to be proven wrong.
Posted by: Richard | February 11, 2009 at 08:52 AM
Hi Richard -- it's strictly anecdotal so far.
These tend to get proposed in large enterprise accounts where EMC typically has good coverage, so we're guessing we hear about the vast majority of these.
So far, we have yet to confirm that one has actually been paid for.
-- Chuck
Posted by: Chuck Hollis | February 11, 2009 at 09:47 AM
This Exadata is for DW/BI environments only? Or will it also work with regular OLTP environments?
Posted by: okwui agada | April 02, 2009 at 08:08 AM