The advent of lower-cost, higher-performing flash storage is quickly restructuring the familiar storage business.
The advent of flash storage (or SSD) actually impacts two familiar storage technologies.
Compared to spinning disks, it's far faster and thus cheaper per unit of performance delivered.
Compared to server-style DRAM, it's somewhat slower but far less expensive per unit of capacity.
Industry pundits are quite right in asserting that the storage business is changing as a result.
Unfortunately, they often point to any number of small storage startups as examples of new thinking, and may occasionally dismiss the more established players such as EMC.
Personally, I think we've got an advantage here -- you see, we've seen this movie before.
It's The Mid-1980s
A small company is formed in Natick, Massachusetts to take advantage of an interesting opportunity. The minicomputer era is in full swing, and vendors are charging exorbitant prices for their proprietary memory controllers.
Dick Egan and a few others create compatible memory controllers using industry-standard DRAM parts. The customer pitch is simple: trade in one of the vendor's proprietary memory controllers, and EMC will give you two of its compatible controllers in return. Business is good as a result.
At the same time, 5.25" SCSI disk technology is starting to mature -- the first 1GB drives are coming to market, and they look to be an attractive alternative to the 8" SMD monsters that are so prevalent.
A bright engineering team, led by Moshe Yanai, comes up with the idea of creating a fully-cached IBM-compatible storage array -- faster and cheaper than anything in the marketplace.
The first Symmetrix is born. In the process, a new buzzword is born -- ICDA, for Integrated Cached Disk Array.
The rest, as they say, is history.
The Secret Sauce Then …
What was the primary technical goodness behind the success of the Symmetrix?
That's debatable, but one element was clearly the use of non-volatile cache to greatly accelerate I/O. Behind that, the true magic came in the form of unique algorithms that were smart at spotting known I/O patterns, and moving the right information back and forth between cache and spinning disks.
I joined EMC at the end of 1994, just as they were bringing their mainframe storage magic to the then-nascent UNIX and open systems world. During 1995 and 1996, we'd go into large UNIX shops, have them plug in one of our cached disk arrays, and then routinely watch as their jaws dropped at the level of performance they'd see.
In essence, we were using smart software to capitalize on two important storage transitions: one in memory (DRAM), the other in disks (5.25" SCSI disk drives). Understanding how storage caching algorithms *really* work isn't for the faint of heart; the devil is in understanding the details -- and lots of them.
Smart Storage Caching Is Harder Than It Might Look
Any sort of storage cache is a comparatively expensive and scarce resource compared to the storage it caches. Use it wisely, and you come out ahead on what you paid for it. Use it poorly, and you've simply thrown hardware at a problem without seeing any real benefit.
If you aren't familiar with the basics of storage caching, I wrote a simple primer a while back; it's still largely relevant to this day.
Of all the forms of storage caching, read caching is perhaps the easiest for vendors to master, simply because you don't have to deal with messy details like guaranteeing data is written and is recoverable, even if there's a failure.
Indeed, you'll find no shortage of flash-based "storage accelerators" out there that are basically large read caches either in the server, in the storage network, or in the storage device itself. Most of them use pretty simplistic algorithms (e.g. LRU, or least recently used) to keep frequently used blocks in cache once they've been read for the first time.
A handful are smart enough to pre-fetch data, or attempt to do read-aheads on sequential accesses. Most of them, though, are simply big dumb read caches -- the same sorts of things we've had on big UNIX servers for years.
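To make the distinction concrete, here's a minimal Python sketch of what one of these "big dumb read caches" amounts to: LRU eviction plus a naive sequential read-ahead. The block-reading callable, capacity numbers, and read-ahead depth are stand-ins I've invented for illustration, not any vendor's actual implementation.

```python
from collections import OrderedDict

class ReadCache:
    """Minimal read-only block cache: LRU eviction plus naive sequential
    read-ahead. The backing store is any callable that reads one block."""

    def __init__(self, read_block, capacity_blocks=1024, readahead=4):
        self.read_block = read_block       # e.g. lambda lba: disk.read(lba)
        self.capacity = capacity_blocks
        self.readahead = readahead
        self.cache = OrderedDict()         # lba -> block data, in LRU order
        self.last_lba = None

    def get(self, lba):
        if lba in self.cache:
            self.cache.move_to_end(lba)    # mark as most recently used
            data = self.cache[lba]
        else:
            data = self._fill(lba)
        # Naive sequential detection: if this read follows the previous one,
        # speculatively pull the next few blocks into cache.
        if self.last_lba is not None and lba == self.last_lba + 1:
            for ahead in range(lba + 1, lba + 1 + self.readahead):
                if ahead not in self.cache:
                    self._fill(ahead)
        self.last_lba = lba
        return data

    def _fill(self, lba):
        data = self.read_block(lba)
        self.cache[lba] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block
        return data
```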
None of these read cache approaches attempt to handle written data, because it's -- well -- harder.
That's fine if you're never changing data; less optimal if you're capturing changed data you'd like to preserve for future use. Just to be clear, there are plenty of narrow use cases where all you care about is read speed, and data isn't changing enough to be a concern. And I can make an argument that read-only caching can be of value in those situations.
For the rest of us, though, life isn't so simple.
Most applications are mixes of reads and writes. Individual application I/O profiles can change rapidly and unpredictably.
And we'd like to have a standardized storage infrastructure that does a decent job on *all* I/O profiles, and not just a few hand-crafted and well-understood exceptions.
So how do we take the advent of ever-improving flash storage technology, and apply it to this world?
… And The Secret Sauce Now
As I mentioned before, the new generation of flash storage can replace both disks and memory.
Take FAST VP as found in both the VMAX and VNX. Smart software automatically moves popular data to speedy flash drives, and less frequently used data to larger "data tub" disks. The exact mechanics vary from platform to platform, but the basic idea is the same.
A little bit of flash storage used as a replacement for spinning disk (plus some smart software) yields compelling performance and cost-savings differentials.
Just to be clear, the flash storage used in this way is protected in much the same way disks are, so it can safely be used for writes in addition to reads.
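For illustration only, here's a toy Python pass at the kind of decision an auto-tiering engine makes: rank extents by recent I/O activity and keep the hottest ones on the limited flash tier. The extent names and counters are invented; actual FAST VP policies weigh far more than a raw I/O count.

```python
import heapq

def plan_moves(extent_stats, flash_capacity_extents):
    """Toy tiering pass: given {extent_id: io_count_over_window} and a flash
    tier that holds N extents, decide which extents belong on flash and which
    belong on capacity disk."""
    hottest = heapq.nlargest(flash_capacity_extents,
                             extent_stats.items(),
                             key=lambda kv: kv[1])
    flash_set = {extent for extent, _ in hottest}
    return {"promote_to_flash": sorted(flash_set),
            "demote_to_disk": sorted(set(extent_stats) - flash_set)}

# Example: four extents, room for only two of them on flash
stats = {"ext-01": 9500, "ext-02": 120, "ext-03": 4300, "ext-04": 15}
print(plan_moves(stats, flash_capacity_extents=2))
# -> ext-01 and ext-03 go to flash; the rest stay on capacity disk
```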
Now take flash as a replacement (or, more correctly, an augmentation) for the server-style DRAM used as caches in most storage arrays.
The VNX FAST Cache is an excellent example -- the same storage media used as a replacement for disk can be configured as "storage cache augmentation" for the array itself -- not quite as fast as the DRAM it augments, but a whole lot bigger and a whole lot cheaper.
Like the non-volatile DRAM-based storage cache it augments, the flash here is protected, and thus can be safely used for both reads and writes -- a big difference from the read-only caches that are much easier to implement.
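As a rough sketch of what "protected" buys you, consider a cache that won't acknowledge a write until two independent devices hold it, and destages to disk in the background. This is illustrative Python with simplified stand-ins, not how FAST Cache is actually built internally.

```python
class ProtectedWriteCache:
    """Sketch of a protected write cache: each write lands on two independent
    cache devices before it is acknowledged, then is destaged to disk later.
    All structures here are simplified stand-ins for illustration."""

    def __init__(self):
        self.primary = {}   # stand-in for one flash cache device (lba -> data)
        self.mirror = {}    # stand-in for its mirror partner
        self.disk = {}      # stand-in for the spinning-disk backing store
        self.dirty = set()  # blocks written to cache but not yet destaged

    def write(self, lba, data):
        # Two copies on independent devices before the host sees an ack,
        # so a single cache failure can't lose an acknowledged write.
        self.primary[lba] = data
        self.mirror[lba] = data
        self.dirty.add(lba)
        return "ack"

    def read(self, lba):
        # Serve from cache if present, otherwise fall back to disk.
        return self.primary.get(lba, self.disk.get(lba))

    def destage(self):
        # Background flush of dirty blocks down to disk; the cache copies are
        # kept around for read hits, simply marked clean.
        for lba in list(self.dirty):
            self.disk[lba] = self.primary[lba]
            self.dirty.discard(lba)
```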
Once again, a bit of flash storage technology (plus some really smart software) means huge performance improvements and cost savings over other alternatives.
Now let's consider the server-side flash storage that's starting to augment server DRAM in performance-sensitive environments. Although I'm sure some of the vendors will debate the point, my perspective is that it's volatile cache.
Write to it, and there's no guarantee it'll be recoverable if, say, your server fries -- unless you take special measures to make sure it's simultaneously written to two distinct locations.
Server-side flash caches can be made smarter through software. The cache can be pre-fed data that's known from past experience to be popular, and -- in some cases -- data can be pre-fetched during sequential reads, going beyond the simple LRU (least recently used) algorithms that are the norm here.
Where software really can help is in handling writes. Not all writes need to be preserved outside of the server enclosure, but -- for those that do -- smart software can pass the I/O to an external storage array (hopefully with its own flash!) much in the way servers handle I/O writes today.
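A hedged sketch of that idea: treat the server-side flash as a write-through cache that serves reads locally where it can, while pushing every write that needs to survive a server failure to the external array, which remains the system of record. The interfaces below are simplified stand-ins I've invented for illustration.

```python
class ServerSideWriteThroughCache:
    """Sketch of a server-side flash cache treated as volatile: reads are
    served locally when possible, but durable writes go synchronously to the
    protected external array."""

    def __init__(self, array):
        self.local = {}     # stand-in for the PCIe flash card (lba -> data)
        self.array = array  # stand-in for the external array (a dict here)

    def read(self, lba):
        if lba in self.local:              # local hit: no trip over the wire
            return self.local[lba]
        data = self.array.get(lba)         # miss: fetch from the array
        if data is not None:
            self.local[lba] = data         # warm the local cache for next time
        return data

    def write(self, lba, data):
        self.array[lba] = data             # write-through: array copy is durable
        self.local[lba] = data             # local copy just accelerates re-reads
```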
But there's more. Imagine dozens of servers with hundreds of virtual machines, all with shifting I/O profiles. Imagine the servers have flash storage cards, and assume the array has both flash storage as a cache, and flash storage as a disk replacement.
At any given time, what's the right I/O optimization policy across all of these activities? How do you get the best bang-for-the-buck across the flash in the servers, the flash being used as array cache, and the flash being used as persistent storage? And how do you do that automatically and in a way that's simple to understand and control from an administrative perspective?
You need some very smart software indeed. And that, in a nutshell, is the secret sauce behind Project Lightning.
At this week's Oracle Open World, Pat Gelsinger announced that EMC had started shipping betas of the new environment to a handful of customers. He held up a 320 GB flash card.
Maybe he should have held up a DVD with the software that makes it all magical :)
Redefining Storage Boundaries
In one sense, this effort is about redefining the traditional boundaries between storage and server. In a world of server-based flash storage, the managed and optimized storage domain *has* to extend into the server environment to be effective.
The two storage domains ideally would work together as a continuum, and not in isolation.
Conversely, I believe that the vendors of PCIe flash products will find this to be true from the opposite perspective: the storage domain *has* to extend from the server out to external devices and storage arrays to be truly effective. Going back in history, there was a time when most storage was captive inside the server; now we live in an era of external storage arrays.
Indeed, it probably won't be long before we hear about more partnerships forming between the server-side flash storage folks and the more traditional array vendors.
I also think it's safe to say that any such partnership is unlikely to deliver the same benefits as a purpose-built and fully-integrated end-to-end solution from a single vendor. We've seen this before; we'll likely see it again.
Reversing The Flow
Things get more interesting when you consider going the other direction. If you think about it, the discussion up to now has been about getting important data closer to the server through various forms of intelligent caching, both volatile and non-volatile.
But what about moving the compute workload closer to the storage? If a task is data-intensive but doesn't require a ton of compute (thinking of certain aspects of big-data processing as an example), wouldn't it make sense to move the task closer to the storage, vs. trying to move terabytes of data across a wire?
That's exactly what we've been showing, first at EMC World, and now again at Oracle Open World. The target here is the Isilon scale-out NAS device. We now can show virtual machines "VMotioning" to the array, and back again.
Is your application compute-intensive, but doesn't consume a lot of data? Intelligently move the data closer to the server as needed. Is your application data-intensive, but doesn't consume a lot of compute? Move it to the array as needed, and back again.
And do so automatically in very large environments, and with the visibility and control that administrators will need.
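If you wanted to reason about that trade-off crudely, it comes down to comparing the cost of moving the data against the cost of the compute. A toy Python heuristic, with numbers and thresholds invented purely for illustration, might look like this:

```python
def placement(estimated_compute_seconds, data_touched_gb, wire_gbps=10.0):
    """Toy heuristic for the 'reverse the flow' idea: if shipping the working
    set across the wire would take longer than the compute itself, move the
    workload to the array instead of moving the data."""
    transfer_seconds = (data_touched_gb * 8) / wire_gbps
    if transfer_seconds > estimated_compute_seconds:
        return "move workload to the array (data-intensive)"
    return "move/cache data near the server (compute-intensive)"

print(placement(estimated_compute_seconds=300, data_touched_gb=2000))
# 2 TB over a 10 Gb link is ~1600 s of transfer vs 300 s of compute:
# better to run the job next to the storage.
```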
Blink once, and you end up with a very different infrastructure landscape than what we have today: servers taking on the properties of storage arrays, and storage arrays taking on the properties of servers.
Thankfully, they're all made of essentially the same components :)
Packaging Simplicity?
Anyone who's spent time with an Isilon array realizes that its approach completely redefines what it means to be simple at scale.
As your environment grows, you simply rack up more modules -- they auto-configure, auto-balance and auto-protect.
I found it fun to watch it do its storage magic the first few times (sort of like watching a disk defragmenter program), but you quickly move on to more interesting things.
It just works, case closed.
Just the thing for scale-out NAS.
But what about scale-out compute, especially if we're considering lots of smaller VMs?
It's not hard to imagine the same model being applied to virtual machines *and* storage using the exact same scale-out paradigm.
Combined storage and server capacity comes in small bricks built of familiar commodity technologies; and through the magic of smart software, you simply rack up the ones you need, and everything auto-configures and auto-balances.
Not for every IT environment, but potentially useful for many I'd think.
Where Does That Leave Us?
I think the IT pundits are quite correct in asserting that there are powerful waves of change and disruption percolating through our entire IT industry -- from cloud to big data to even smaller domains such as storage, shifts are happening.
And, like the IT pundits, we at EMC keep a close eye on all the small startup companies that have latched onto one idea or another. Trust me, we're just as interested in them as anyone would be; perhaps even more so.
But what I think many people might miss is that larger companies such as EMC can often accelerate and capitalize on these fundamental shifts -- if they have the culture, motivation and leadership to do so.
Especially if we've seen the movie before :)
Chuck, Great article... you pretty much describe Nutanix to the tee in your last few paragraphs!
Posted by: Rick Westrate | October 06, 2011 at 10:59 PM
Violin Memory is someone who comes to mind with a redundant array (all flash), smart software and cost per performance/capacity that competes with spindle based array's on price and reliability. Great article BTW
Posted by: Broke_Da Mouth | October 12, 2011 at 12:31 AM