As part of the EMC World festivities, EMC's Isilon group is announcing a few new features available today -- as well as previewing their next release, dubbed Waikiki.
Even with my obvious EMC bias, I can make a strong argument that OneFS is now clearly in a class of its own: architecture, functionality, robustness, performance, efficiency, etc. You could teach an advanced course in file system design and use OneFS as a perfect example.
The gap between OneFS and everything else shows every sign of widening over time. The Isilon team now uses a fast-cadence development model, and we should expect regular tick-tock drops of functionality every six months going forward.
Join me for a quick recap of "what's new" in the OneFS world -- there's a lot to like.
The Basics
The name "OneFS" is aptly chosen -- it delivers a single, real-deal scale-out filesystem (up to 20PB and 144 nodes). It is not an aggregation of file systems, nor is it an adaptation of dusty legacy code.
It auto-scales, auto-balances and auto-manages. And does so on largely commodity hardware.
People who have only known the traditional world of NAS filers express more than a little bit of incredulity as they go through a OneFS presentation -- but it's all true, and more.
Now as part of EMC, Isilon technology is cruising in high gear: more resources, more integration and -- of course -- the EMC organization that sells and supports it everywhere.
In a somewhat lukewarm storage landscape, Isilon is a bright spot indeed. 64% year-over-year revenue growth -- now past $1B run rate.
That's not a typo -- the single largest capacity storage transaction in EMC's history (41 PB) was an Isilon deal. No, we can't say who that is.
The last major OneFS release (Mavericks) was quite successful -- Isilon users adopted it quickly, and put it to work.
That's important as anyone in the storage biz knows: it's one thing to build a new feature, it's another thing entirely to get your customers to use it.
And there are plenty of storage industry examples of new software versions that weren't exactly embraced by customers.
Also remarkable -- in a rare moment of agreement, both Gartner and IDC have gone beyond the vague "a leader" statements and have outright anointed Isilon and OneFS as being in a class of their own when it comes to scale-out NAS.
All of this couldn't come at a better time:
- data volumes are exploding,
- more organizations are embracing true scale-out architectures to cope with the onslaught,
- and now the first wave of big data applications is finding its way into the data center.
It would be fair to say there's a lot of momentum here.
What's New Today
Two little things -- and a potentially big thing.
I've talked about OneFS's native HDFS capability before, and now it supports both HDFS 1.0 and HDFS 2.0 -- simultaneously, and against the same data sets.
HDFS is best thought of as more of a protocol than a standard. It tends to be defined by whatever ends up in the Apache distro.
While it's an increasingly popular data access mechanism, by modern standards the stock implementation of HDFS is a very simplistic file system with much room for improvement: too many copies of data, no snaps or replication, no tiering or archiving functions, no modern administration tools, no multi-protocol access, etc.
While the standard distros might be sufficient for those first few pilots and prototypes, more than a few people have started to realize the limitations of standard distro HDFS. They go looking for something better.
OneFS implements HDFS over a real file system, hardens the name node by distributing across the OneFS cluster, and -- most importantly -- doesn't insist that you move terabytes of data into (and out of) your HDFS environment.
Capture on NFS, process using HDFS, publish using CIFS if you'd like. This results in a *huge* improvement in workflow cycle time: no copying of data.
To this day, I haven't seen anyone else doing this.
I've also talked about how Syncplicity (EMC's very cool enterprise sync-n-share application) works with Isilon.
While Syncplicity certainly allows external cloud storage as an option, the vast majority of enterprises want their data in their data center, and Isilon's scale-out properties are turning out to be a perfect fit.
Yes, these sync-n-share environments can get very, very big. Better and better management integration is happening quickly, and that's what's new here.
Do You Need A UFO?
The big feature (available now) is a native REST object interface into OneFS.
The "UFO" reference comes from "unified file and object". Same data, different access methods, common management and administration, resources, etc. -- one file system.
The even bigger idea is being able to change the representation of data without the need for either a gateway or a separate storage stack.
We've seen this idea surface in the just-announced ViPR as a layered data service over existing storage, and here it is again as an integral part of OneFS.
Having cut my teeth on several object storage implementations (e.g. EMC Centera, Atmos, etc.), I like this approach for so many of the use cases I see in enterprises.
For example, while creating and leveraging object metadata is obviously powerful, not all applications support the model. Customers can use OneFS to support a mix of file-based and object-based applications and move ahead as it makes sense for them.
There are so many use cases where you'd like to get at all the data in your object repository (backup, virus scan, sequential processing, etc.) and throwing a bazillion GET calls probably isn't the best way to do it. Simply access the data as a filesystem when and where you need to.
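To make the "access it as a filesystem" point concrete, here's a minimal Python sketch of a sequential pass (a checksum of every item) done by walking a directory tree rather than issuing one GET per object. Everything here is an assumption for illustration: the mount point is simulated with a temporary directory, and the `container/obj1` layout is hypothetical, not a statement of how OneFS maps objects to paths.

```python
import hashlib
import os
import tempfile


def checksum_tree(mount_point):
    """Walk a directory tree and return {relative_path: sha256 hex digest}."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(mount_point):
        for name in sorted(filenames):
            full_path = os.path.join(dirpath, name)
            hasher = hashlib.sha256()
            with open(full_path, "rb") as fh:
                # Read in 1 MiB chunks so large files are never held in memory whole.
                for chunk in iter(lambda: fh.read(1024 * 1024), b""):
                    hasher.update(chunk)
            digests[os.path.relpath(full_path, mount_point)] = hasher.hexdigest()
    return digests


def demo():
    """Simulate a mounted export (hypothetical layout) and checksum its contents."""
    with tempfile.TemporaryDirectory() as mount_point:
        container = os.path.join(mount_point, "container")
        os.makedirs(container)
        for name, payload in [("obj1", b"hello"), ("obj2", b"world")]:
            with open(os.path.join(container, name), "wb") as fh:
                fh.write(payload)
        return checksum_tree(mount_point)


if __name__ == "__main__":
    for path, digest in sorted(demo().items()):
        print(path, digest)
```

The same loop would work for a backup or virus-scan style pass: one tree walk, sequential reads, no per-object request overhead.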
Lastly, we should talk about performance. Even a modest Isilon cluster can post some pretty impressive specs. Your objects will now perform the same way as files do.
The first object model being supported by OneFS is Swift (part of OpenStack) but I don't think it will be too long before we see support for all the usual suspects.
Love it!
A Waikiki Preview
I think it was about this time last year when we started talking about Mavericks, which was delivered at the end of last year. Consider this a preview of Waikiki, with anticipated delivery before the end of this year.
While there's a long list of new functionality being slated, the team is only teasing a few key features at this time.
The first one is -- yep -- deduplication. Not exactly a surprise, but certainly appreciated.
The OneFS implementation will be post-process (i.e. after the data is stored). It will use an 8K block size, but -- importantly -- it will deduplicate over the entire OneFS cluster.
That means that if a given 8K block is to be found anywhere on your 144 scale-out nodes, it will be deduplicated.
From an administrative perspective, there's the ability to select directories that participate and don't, as well as a "pre-processing estimator" that will help you gauge the impact of deduplication before you turn it on.
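For a feel of what a pre-processing estimator might do, here's a rough Python sketch: hash every fixed 8K block across a chosen set of directories and compare total blocks to unique blocks. To be clear, this is my own illustration and not the OneFS implementation; the function name, the use of SHA-256, and the in-memory set of hashes are all assumptions.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 8 * 1024  # 8K blocks, as described in the post


def estimate_dedup(directories, block_size=BLOCK_SIZE):
    """Return (total_blocks, unique_blocks) across all files under the given directories."""
    seen = set()  # hashes of every distinct block seen so far, across all directories
    total = 0
    for root_dir in directories:
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            for name in filenames:
                with open(os.path.join(dirpath, name), "rb") as fh:
                    while True:
                        block = fh.read(block_size)
                        if not block:
                            break
                        total += 1
                        seen.add(hashlib.sha256(block).digest())
    return total, len(seen)


def demo():
    """Two participating directories that share one identical 8K block."""
    block_x = b"x" * BLOCK_SIZE
    block_y = b"y" * BLOCK_SIZE
    block_z = b"z" * BLOCK_SIZE
    with tempfile.TemporaryDirectory() as tmp:
        dir_a = os.path.join(tmp, "a")
        dir_b = os.path.join(tmp, "b")
        os.mkdir(dir_a)
        os.mkdir(dir_b)
        with open(os.path.join(dir_a, "file1"), "wb") as fh:
            fh.write(block_x + block_y)
        with open(os.path.join(dir_b, "file2"), "wb") as fh:
            fh.write(block_x + block_z)
        return estimate_dedup([dir_a, dir_b])


if __name__ == "__main__":
    total, unique = demo()
    print(f"{total} blocks, {unique} unique, estimated savings {1 - unique / total:.0%}")
```

Because the set of hashes is shared across every directory passed in, a block that appears in two different directories only counts once, which is the "deduplicate over the entire cluster" idea in miniature. Leaving a directory out of the list is the toy equivalent of excluding it from deduplication.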
Today, OneFS is perhaps the most efficient file system around: raw-to-usable efficiency is usually 80% or better. No, Brand N doesn't want to talk about this, thank you. Throw in deduplication, and you've got even more storage efficiency. As you might expect, actual capacity savings will be highly dependent on your situation, but the team is guesstimating around 30% based on what they've seen so far.
In the minor-but-important category, OneFS Waikiki will support compliance lock-down and full audit control, including emitting events to a variety of external auditing platforms (such as Varonis).
A *lot* of OneFS is now finding a home in financial services and health care settings, and these audit controls are simply table stakes.
Also part of this release: full integration between OneFS and the forthcoming ViPR software-defined storage product. And, finally, full integration with OpenStack in addition to the Swift API and Cinder adaptors that are already available.
Personally, I'm waiting for the next turn-of-the-crank in cluster size. While today's 20PB and 144 linearly scaling nodes aren't exactly "meh", the next logical "qualification horizon" would be 40PB and 288 nodes with today's disk drives. Remember, that would be 40PB as a single self-managing and self-optimizing file system.
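A quick back-of-the-envelope check in Python shows why "with today's disk drives" fits: doubling both capacity and node count leaves the average capacity per node unchanged. This assumes decimal units (1 PB = 1000 TB) and an even spread of capacity across nodes, neither of which is stated above.

```python
import math


def tb_per_node(capacity_pb, nodes):
    """Average capacity per node in TB, assuming 1 PB = 1000 TB."""
    return capacity_pb * 1000 / nodes


today = tb_per_node(20, 144)      # current maximum cited above
next_step = tb_per_node(40, 288)  # speculated next qualification horizon

if __name__ == "__main__":
    print(f"today: {today:.1f} TB per node")
    print(f"next:  {next_step:.1f} TB per node")
    print("same per-node capacity:", math.isclose(today, next_step))
```

Both work out to roughly 139 TB per node, so the bigger cluster is a matter of qualifying more nodes rather than waiting for denser drives.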
There's more coming in Waikiki (of course) but those details will have to wait for the formal announcement :)
OneFS -- A Modern File System For A Big Data World?
As any storage pro will tell you, we live in a big data world. And an awful lot of that data ends up in file systems of one sort or another.
File systems that were designed 10-20 years ago could not have anticipated the world we live in today. Some vendors are clearly trying to adapt old technology to the new world.
It's not pretty.
Others are investing in brand-new filesystem architectures that are designed for the world at hand, and not the way it used to be.
Isilon's OneFS is clearly an example of the latter -- perhaps the best one out there.
Stay tuned for more :)
Hi Janus
I've unpublished your comment (rant?) as it reads very much like it came from a disgruntled competitor.
If you would care to clearly identify yourself and your affiliations, I'll republish it. My house, my rules.
Thanks
-- Chuck
Posted by: Chuck Hollis | May 29, 2013 at 03:16 PM
Hi Chuck, do companies that have requirements around Check 21 and similar image capture functionality fall into the category of a good use case for Isilon taking into consideration compliance and regulatory assumptions?
Posted by: Curious Catherine | September 26, 2013 at 01:45 PM