Well, it's been discussed for a while, but the news is out in the open: Oracle will be selling a data warehousing "appliance" (server and storage) built by HP.
In typical understated Oracle fashion, they bill this as "The World's Fastest Database Machine".
But, all posturing aside, it's an interesting example of the different trends going through the industry right now.
If you take a closer look at the hardware, it's pretty cool.
It looks like "storage nodes" (a server and direct-attached storage), lashed up to a larger database engine using InfiniBand. The idea is that Oracle breaks a large query into smaller chunks, pushes them down to the individual nodes, and aggregates the results at a higher level.
Performance scaling is achieved by adding more storage nodes and (presumably) breaking queries into a progressively larger number of roughly equal-sized pieces.
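To make the idea concrete, here's a toy sketch of that scatter/gather pattern. To be clear, this is my own illustration and not Oracle's code: the function names, the row layout, and the use of threads to stand in for storage nodes are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def node_partial_sum(rows, min_amount):
    """Work done on one (hypothetical) storage node.

    Filters its local rows and returns a small partial result
    (sum, count) instead of shipping raw rows upstream.
    """
    matching = [r["amount"] for r in rows if r["amount"] >= min_amount]
    return sum(matching), len(matching)


def partition(rows, n_nodes):
    """Spread rows round-robin so each node gets a roughly equal share."""
    return [rows[i::n_nodes] for i in range(n_nodes)]


def scatter_gather(partitions, min_amount):
    """Push the filter down to every node, then combine the partials."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(
            pool.map(lambda p: node_partial_sum(p, min_amount), partitions)
        )
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total, count


# Made-up data: 100 rows with amounts 1..100.
rows = [{"amount": a} for a in range(1, 101)]

# Same query, different node counts: the answer must not change,
# only the size of the piece each node handles.
result_4 = scatter_gather(partition(rows, 4), min_amount=50)
result_8 = scatter_gather(partition(rows, 8), min_amount=50)
print(result_4, result_8)
```

Note that only the (sum, count) pairs travel back to the aggregating layer, which is the whole point of pushing the work down to the nodes.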
Of course, there's little in the way of performance comparisons to help us evaluate just how fast this beast might go. All we get is "Up To 10x Faster", which smells a bit optimistic -- never mind that it's Oracle comparing itself with itself, rather than with other data warehousing appliances.
From a pure storage perspective, not much to talk about here other than their use of disk mirroring throughout, if I'm reading the data sheet correctly.
Ummm ... that can get really expensive, folks, especially in larger configurations. We're not only talking disks and power here, we're talking nice HP server-based storage nodes to put them in, which don't look especially cheap to me. If it were me, I'd want a RAID 5 (or 6) option ...
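For a rough feel of the difference, here's some back-of-the-envelope arithmetic. The 100 TB target, the 1 TB disks and the 8-disk RAID groups are numbers I picked for illustration, not anything from the data sheet, and the sketch ignores hot spares and formatting overhead.

```python
import math


def raw_disks_needed(usable_tb, disk_tb, scheme, group_size=8):
    """Rough count of physical disks needed for a usable capacity.

    scheme: "mirror" (every disk has a copy), "raid5" (one parity
    disk per group) or "raid6" (two parity disks per group).
    group_size is the total number of disks per RAID group and is
    an illustrative assumption.
    """
    data_disks = math.ceil(usable_tb / disk_tb)
    if scheme == "mirror":
        return 2 * data_disks
    parity = {"raid5": 1, "raid6": 2}.get(scheme)
    if parity is None:
        raise ValueError(f"unknown scheme: {scheme}")
    data_per_group = group_size - parity
    groups = math.ceil(data_disks / data_per_group)
    return groups * group_size


# Hypothetical target: 100 TB usable on 1 TB disks.
for scheme in ("mirror", "raid5", "raid6"):
    print(scheme, raw_disks_needed(100, 1.0, scheme))
```

With those assumptions, mirroring needs 200 disks, while RAID 5 and RAID 6 come in at 120 and 136. Every one of those extra disks also needs a slot in a storage node, which is where the cost really adds up.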
I think the plan is that Oracle sales reps will sell this stuff, and HP will do the installation, support etc. That's certainly an interesting model. It's not one that I'd attempt, but -- hey -- maybe they'll find a way to make it work.
I still have questions about backup, remote replication, etc. since many of these DWs are becoming business-critical systems, but I'm sure there are answers for all of this.
So, Here's What's Really Happening
As I understand it, there's a consensus in the industry that Oracle is under serious pressure from purpose-built tools in the DW/BI marketplace.
Whether it's traditional players like Teradata and Netezza, or newer players like DATAllegro (acquired by Microsoft), Vertica or ParAccel -- these specialized platforms can come in and make Oracle look very bad indeed from a pure price/performance perspective.
I think Oracle had to do something about it, and do it quickly.
One option would be to acquire and/or build software optimized for DW/BI. From what I've heard, these specialized players can do some amazing things.
Another option would be to work with a server vendor to build some Really Fast Hardware to run a version of their existing code.
Without hard comparisons, though, I have to say I'm skeptical. First, Oracle has always been able to break queries into smaller pieces, push them down to other servers, and aggregate the results.
Nothing really new here.
Second, array-based storage technology is not the bottleneck; our work with Oracle and other DW/BI environments routinely shows that we can feed data to a server just as fast as it can take it.
"Queries run closer to the data" ??? Sounds like some creative marketing here. From a purely technical perspective, there's nothing really new here either.
Maybe they picked up some benefit from the server-to-server connection being InfiniBand rather than something like RDMA over Ethernet, but I'm dubious about that as well. You'd have to show me that the ~200MB/sec bandwidth offered by a pair of GigE connections was slowing things down. Remember, this traffic is ostensibly reduced query results, not raw data.
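For anyone wondering where that ~200MB/sec comes from, here's the arithmetic as a quick sketch. The 80% efficiency factor is my own rough allowance for protocol overhead, and the 500 MB result set is a made-up example, not a measurement.

```python
def usable_mb_per_sec(links, gbit_per_link, efficiency):
    """Approximate usable bandwidth in MB/sec for a set of links.

    efficiency is an assumed fraction of raw line rate left over
    after protocol overhead (illustrative, not measured).
    """
    raw_mb_per_sec = links * gbit_per_link * 1000 / 8  # Gbit/s -> MB/s
    return raw_mb_per_sec * efficiency


def transfer_seconds(result_mb, mb_per_sec):
    """Time to move a reduced result set of result_mb megabytes."""
    return result_mb / mb_per_sec


# A pair of GigE links: 250 MB/sec raw, about 200 MB/sec usable.
gige_pair = usable_mb_per_sec(links=2, gbit_per_link=1, efficiency=0.8)

# Hypothetical 500 MB of already-reduced query results.
seconds = transfer_seconds(500, gige_pair)
print(round(gige_pair), round(seconds, 1))
```

If the nodes really are sending back reduced results rather than raw table data, a result set has to get pretty big before a couple of seconds on the wire dominates the query time.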
Now, to be fair, you can pick up some serious performance benefits by hitting the "sweet spot" of balancing CPU and I/O -- something we've been doing with Oracle and Dell for a while, using the CLARiiON product family, but I don't think the claim is "10x" or similar.
And Then There's The Anti-Appliance Argument
There seems to be a line of tension between business users of DW/BI, and the data center infrastructure people over the whole question of appliances.
Of course, the DW/BI users would like something optimized for the job at hand. But the data center infrastructure people try to limit diversity as much as possible: consistent choices for servers, operating systems, network, storage, backup, replication, etc.
If this behemoth is going into an existing HP/Oracle shop, that's one thing. But if it's going into a shop that's trying to standardize on other choices, well -- that's another thing.
And, Of Course, I'm A Bit Cynical ...
Every year at Oracle OpenWorld, we hear about many "new initiatives" from Oracle. Well, not to be harsh here, but it's my impression that very few of them get talked about at the next year's Oracle OpenWorld. I routinely dig up announcements from previous years, and it's a relatively consistent pattern. I think it's fair to ask the question -- just how serious is Oracle about all of this?
And, if they really want to solve a performance problem, we've got the answer .... :-)
Seriously, though, I've seen more than a few environments where the performance problem wasn't so much getting the data off of disk, it was the I/O storm generated by the subsequent data reduction and analysis. And it looks like a small amount of EFD might be very helpful in some cases.
I think we'll have to wait a while to see whether Oracle (and HP) are going to be successful with any of this.
Time will tell.
[Update on Oct 20 -- I've been responded to! Go see the latest here]
Courteous comments welcome as always ...