
August 27, 2009

Comments

kostadis roussos

Chuck,

Good post.

The problem I have with your thesis is the definition of object store vs file system.

It seems you're saying that if the data is not organized in a hierarchical namespace, is accessible via web-based protocols through a unique identifier, and has extensible, searchable metadata, then it must be an object store.

You then make the interesting leap that this represents the future of data organization.

So I have to step in and caution your enthusiasm. And let's be clear it's not because I happen to work at a company that sells NAS devices.

The reason the file system metaphor has been so durable is that organizing metadata in a searchable format is a hard problem.

If you need to store billions of objects, and you need to search them, then you need an index. And it turns out that the directory tree happens to be the most efficient mechanism for representing the searchable index. This isn't that surprising. A directory tree allows you to partition the objects you're searching for into smaller, manageable chunks that can in turn be searched. In the last 40+ years of computing there has yet to be a better way to do things.

Ultimately any object service will need some kind of name lookup service, and that name lookup service to scale to billions of objects will rely on some kind of hierarchical scheme which will end up looking like a directory.

I'll concede that the minute I said X was impossible, 2 minutes later there was someone who demonstrated that it wasn't, but ... I am fairly confident that hierarchical information organization is still necessary.

The only remaining question is whether the name lookup is necessary, that is, whether you can directly access the object without doing a "READDIR". I suspect the answer is going to be yes. Even today, an NFS client does not repeat a LOOKUP once the filehandle is acquired; instead it accesses the underlying object (file) directly through a globally unique id.
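A minimal sketch of that lookup-once, then-use-the-handle pattern. This is a toy in-memory model, not NFS itself: the class `TinyFileServer`, its methods, and the example path are all hypothetical names chosen for illustration.

```python
class TinyFileServer:
    """Toy model: directories map names to handles; handles map to data."""

    def __init__(self):
        self.lookups = 0          # counts per-component name resolutions
        self._root = {}           # nested dicts stand in for directories
        self._objects = {}        # handle -> bytes
        self._next_handle = 1

    def create(self, path, data):
        """Store data under a path and return its opaque handle."""
        *dirs, name = path.strip("/").split("/")
        node = self._root
        for d in dirs:
            node = node.setdefault(d, {})
        handle = self._next_handle
        self._next_handle += 1
        node[name] = handle
        self._objects[handle] = data
        return handle

    def lookup(self, path):
        """Walk the tree one component at a time, counting each step."""
        node = self._root
        for part in path.strip("/").split("/"):
            self.lookups += 1
            node = node[part]
        return node

    def read(self, handle):
        """Access the object directly by handle; no name resolution."""
        return self._objects[handle]


server = TinyFileServer()
server.create("/home/kostadis/notes.txt", b"hello")

handle = server.lookup("/home/kostadis/notes.txt")  # three components resolved
for _ in range(100):
    server.read(handle)                             # no further lookups

print(server.lookups)  # 3
```

The hierarchy is consulted only to obtain the handle; every later read goes straight to the object.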

If the directory as an organization format remains durable, then what we're arguing over is whether the interface of the future is "open", "read", "write", "close" and whether the meta data is stored within the file system or in a bag on the side.

I suspect that file systems will evolve to support extended metadata within the file system, and that adding additional interfaces to manipulate data will also emerge. I expect that in 10 years the NAS device of the future will support protocols that are web based interfaces.

So if I do agree on one point it's that the file system of the future will have additional capabilities, will be accessible using different protocols, but that the central principles of a hierarchical namespace, with the ability to read and write into the namespace, will remain.

cheers,
kostadis

Chuck Hollis

@kostadis

I think you missed the point. It may be the case that hierarchical file systems search faster in some situations. And we can both think of situations where that would not be the case.

Separating object from presentation (or search method, in your comment) still has unique merits. Object identifiers can be organized in traditional hierarchy if needed, a relational database if needed, or any other schema without impinging on object properties.
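A rough sketch of that separation, assuming a flat in-memory store keyed by opaque identifiers. The names `ObjectStore`, `path_index`, `tag_index`, and `find_by_tags` are hypothetical and do not describe any particular product.

```python
import uuid


class ObjectStore:
    """Flat store: objects are reachable only by an opaque identifier."""

    def __init__(self):
        self._objects = {}

    def put(self, data):
        oid = uuid.uuid4().hex    # identifier carries no location or name
        self._objects[oid] = data
        return oid

    def get(self, oid):
        return self._objects[oid]


store = ObjectStore()
oid = store.put(b"quarterly report")

# Two independent ways of organizing the same identifier.
# Neither one is part of the object itself.
path_index = {"/finance/2009/q2.pdf": oid}               # hierarchy
tag_index = {("dept", "finance"): {oid},                  # metadata
             ("year", "2009"): {oid}}


def find_by_tags(index, **tags):
    """Return the identifiers that carry every requested tag."""
    sets = [index.get(item, set()) for item in tags.items()]
    return set.intersection(*sets) if sets else set()


by_path = store.get(path_index["/finance/2009/q2.pdf"])
by_tags = find_by_tags(tag_index, dept="finance", year="2009")
```

Either index can be dropped, rebuilt, or replaced by a relational table without touching the stored object or its identifier.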

I disagree with your assertion that "ultimately any object service will need some kind of name lookup service, and that name lookup service to scale to billions of objects will rely on some kind of hierarchical scheme which will end up looking like a directory."

Categorically not true -- many successful counterexamples exist in the industry today.

I'm glad you somewhat agree with my point around metadata. It does not belong in a bag on the side.

Why do we have to wait 10 years for NAS devices to catch up with what's already available and proven in the market? That doesn't sound right, does it?

Thanks for writing ...

-- Chuck

Chuck Hollis

This is from Chuck

I've been watching Val over at NetApp blustering on about clouds, and I've resisted the temptation to "help educate" him on what is cloud and what is not-cloud.

Given that his motivations seem to be marketing-oriented rather than having an intelligent discussion, I'll leave him to his new role in marketing.

However, I do believe that it would be useful to offer up the observation that, in many cloud models, traditional filesystems strike many of us as woefully inadequate for the richer semantics of a cloud model.

So, maybe the title of the original post should have been "The Cloud Doesn't Have A File System".

Thoughts?

-- Chuck

kostadis roussos

@chuck

We'll just have to agree to disagree.

kostadis

Chuck Hollis

@kostadis

Opinions and long-held views take a long time to change.

I just had an ugly flashback to my first "real" programming job (in high school!) where I was schlepping COBOL code at the time.

My wizened supervisor insisted that the optimal way to represent all forms of information was 80 characters of EBCDIC as it was the "most efficient".

I went a few rounds with him at the time, and we agreed to disagree -- as long as I did it his way.

Glad that things have changed!

-- Chuck

kostadis roussos

@chuck

Hmm...

I am being compared to a dinosaur... Someone who is stuck in the mud ... Lacking any foresight ...

Hmm...

cheers,
kostadis

Chuck Hollis

Kostadis -- no -- you're the guy who's thinking out-of-the-box, remember?

-- Chuck

Alex McDonald

As the "competitive blogger over at NetApp" I can safely say: my vested interest is in working out what you mean. Not disagreeing with you, or agreeing with you, to suit some agenda I may have; just cutting through the medium and getting to the message.

I'm struck by the fact that you (and Paul Carpentier from what I can see) are interested in throwing away what you see as excess baggage from a hierarchical name & address space view of the world. Your assertion that

[quote]There's also the fact that information objects will usually want to move their physical location during their lifetime -- not an architectural problem for object-based information stores.
[/quote]

and your reply to Kostadis

[quote]
I disagree with your assertion that "ultimately any object service will need some kind of name lookup service, and that name lookup service to scale to billions of objects will rely on some kind of hierarchical scheme which will end up looking like a directory."
[/quote]

needs an example, because I'm struggling -- really struggling -- to come up with the counterexamples that you claim exist in the industry. Or how you might do this without the aid of DNS, a hierarchical namespace if there ever was one...

Glad, btw, that you eventually got round to mentioning REST, and glad of your conversion to the notion that real LUNs aren't necessarily better than fake ones. We'll make a real virtualised storage convert and WAFL fan out of you yet!

Paul Carpentier

@Chuck:

'So, maybe the title of the original post should have been "The Cloud Doesn't Have A File System".'?

Well, maybe it should have been "The File System Doesn't Have a Clue"! as testified by untold junkyards of hierarchical pathnames pathetically (pun intended) trying to impersonate real metadata. ;-)

1. Hierarchical file systems will always get in the way of robust, automated, massively parallel scaling of storage

2. Hierarchical file systems are unable to provide the fundamentally unique references to immutable content or metadata capabilities required for long term storage

3. Hierarchical file systems are almost always more a hurdle than a help in modern application architecture based on RDBMSes where full path names need to be maintained rather than permanent unique identifiers that need to be stored once, at creation time.

4. Virtual, "projected" hierarchical views on top of object storage for legacy applications and some user populations may make sense in the short and medium term. Eventually, apps will go "object native" because it's simpler, more robust and much more scalable at lower cost, while end users are largely adopting objects (attachments) found through timeline and metadata (messages) search as their daily storage vehicle: the information contained in email systems like Outlook, GMail, Hotmail is already more important than the hierarchical folders on the desktop.
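Points 2 and 4 can be illustrated with a small sketch, assuming the unique reference is derived from a SHA-256 digest of the content. That choice, the class `ContentStore`, and the example paths are assumptions made for illustration, not a description of any shipping system.

```python
import hashlib


class ContentStore:
    """Immutable blobs addressed by a digest of their own content."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data   # same content always maps to same address
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


store = ContentStore()
address = store.put(b"signed contract")

# A "projected" hierarchical view for legacy applications: just a mapping
# from path names to addresses, kept outside the store.
projected_view = {"/legal/contracts/acme.pdf": address}

# Reorganizing the view renames nothing in the store; the address that an
# application or database saved at creation time remains valid.
projected_view["/archive/2009/acme.pdf"] = projected_view.pop(
    "/legal/contracts/acme.pdf"
)
```

The application stores `address` once; the path is a convenience layered on top and can change freely.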

So yes Alex, you have read correctly that I am interested in throwing away excess baggage; hell, I just **love** to do exactly that. There simply is no role for infrastructure based on hierarchical file systems in the ubiquitous, always-on, connected world we're quickly becoming. They've made the lives of too many IT operatives miserable, they've overstayed their welcome, it's really time for them to go.

Ten years ago I mostly generated blank stares when preaching this. Now, I'm really glad that there are more people every single day that see the light and share this opinion. In the interest of a soundly competitive market, I sincerely hope that NetApp will join their ranks in time to make a difference.

