I think almost every storage professional saw the news around the proposed FCoE standard.
If you missed it, simply google "FCoE ethernet" and there'll be plenty of reading for you. Or read this.
What's going on here?
And why does this proposed standard offer hopes for the stalled iSCSI marketplace?
Let's take a look ...
The Ups and Downs of iSCSI
Most storage people have seen the potential benefits of using Ethernet infrastructure for storage connectivity. It's not a new discussion.
At a macro level, Ethernet components have an order-of-magnitude advantage in economic scale compared to FC components. As an example, think about how much you pay for an FC host bus adapter versus a decent Ethernet NIC. Or a switch port.
No comparison.
Going a bit farther out, there's the hope of common infrastructure, common management and common services on some sort of converged data center fabric.
Well, that's the hope anyway.
And the industry's first try -- iSCSI -- only achieved partial success. The good news is that everyone ended up supporting it (including a few OS heavyweights like Microsoft who actually drove adoption).
And iSCSI has found a nice market home in new, smaller SANs where no FC is present.
Yet, at the same time, compared to the aggregate storage market, it's still a very small fraction. Proponents pointed to fast iSCSI growth rates, but -- according to IDC -- that growth has slowed as well.
And if you work in large enterprises that have made investments in FC infrastructure, you've probably noticed that they're just not interested in talking about iSCSI -- period.
I've written about the ups and downs of iSCSI before -- and gotten flamed in the process.
So, is FCoE going to be better than iSCSI? Will it enjoy broader success?
Too soon to tell, but the potential is tantalizing.
The Problem with TCP/IP (for storage guys, anyway)
Now, I'm not a deep technologist, I just pretend from time to time, but let me wade in here.
One core issue with iSCSI appears to be that it uses IP transport protocols.
Yes, this means that traffic is routable, and IP network guys can set up SANs and the IP ecosystem generally works.
But it also inherits some of the problems of the IP stack.
As an example, FC is lossless, and offers near-guaranteed latency. No such guarantees with IP stacks. If you're a big shop with performance and predictability concerns, that's an issue.
iSCSI manages like an IP network. Which means it doesn't manage like an FC SAN. Which means if you build and manage FC SANs, that's an issue for you.
And there are significant parts of the extended FC SAN feature set that never really made it into the iSCSI world. Generally speaking, these are the features that larger enterprise customers want.
And I don't want to wade into performance debates, but adding an IP protocol layer that the FC use case sees as mostly useless overhead is an issue as well.
Put it all together, and enterprise storage guys had plenty of reasons to pass on iSCSI in larger shops.
Enter FCoE
Fibre Channel over Ethernet -- in its simplest form -- tries to rectify some of these concerns.
First, there's a direct one-to-one mapping of FC frame to Ethernet frame. No IP protocols. No extra stuff. As close to a bare-metal protocol as you're likely to see. This means fast and simple.
Second, there are extended mechanisms (as part of the proposed standard) to keep the link lossless, so frames don't get dropped and retransmitted -- that's the "pause" mechanism you read about. FC behaves that way natively; FCoE mimics it.
Third, the design point is to look, smell, behave, and manage like an FC network that just happens to use Ethernet as its base protocol. It doesn't try to get too fancy.
There's more stuff in there that I thought I saw (congestion management, creating "hard" subchannels for different kinds of traffic, and so on), which all looks cool, but it's more like frosting on the cake [more on that in a subsequent post].
But at its simplest level, it's a straightforward attempt to bring the economics of Ethernet hardware (in this case 10gE) to the enormous world of FC SANs by not trying to do too much.
Simple is good.
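If you want a picture of just how simple, here's a toy sketch in Python -- purely illustrative, with my own simplified field layouts and EtherType value, not the proposed standard's exact on-the-wire format -- of the layering difference between FCoE and iSCSI:

```python
import struct

def ethernet_header(dst_mac: bytes, src_mac: bytes, ethertype: int) -> bytes:
    """Standard 14-byte Ethernet II header: dst MAC, src MAC, EtherType."""
    return dst_mac + src_mac + struct.pack("!H", ethertype)

def fcoe_encapsulate(dst_mac: bytes, src_mac: bytes, fc_frame: bytes) -> bytes:
    """FCoE-style encapsulation (sketch): the whole FC frame rides directly
    in the Ethernet payload under a dedicated EtherType. No IP, no TCP."""
    FCOE_ETHERTYPE = 0x8906   # dedicated EtherType (illustrative), not IP's 0x0800
    return ethernet_header(dst_mac, src_mac, FCOE_ETHERTYPE) + fc_frame

def iscsi_layering(scsi_command: bytes) -> list:
    """For contrast, the nesting an iSCSI command passes through
    (names only -- building real IP/TCP headers is beside the point)."""
    return ["Ethernet", "IP", "TCP", "iSCSI PDU", scsi_command]

if __name__ == "__main__":
    fake_fc_frame = b"\x00" * 24 + b"...payload..."   # stand-in, not a real FC frame
    wire = fcoe_encapsulate(b"\xff" * 6, b"\xaa" * 6, fake_fc_frame)
    print(len(wire), "bytes on the wire; the FC frame starts right after byte 14")
```

The point isn't the code; it's that FCoE puts one fixed header in front of the FC frame, while iSCSI drags the whole TCP/IP stack along for the ride.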
What Does This Mean?
So, a bit of a reality check.
- There are no FCoE products in the marketplace today.
- 10gE hardware is still pretty expensive.
- And even if something existed today, there'd be a predictably long slog to get customers to evaluate it and then implement it.
So nothing is going to happen tomorrow, or the next day.
But I bet that there are lots of vendors taking a hard look at FCoE. I know that EMC is.
It's safe to bet that 10gE hardware will come down in price, as 1Gb did before it, and 100baseT before that. I'm still waiting for the first motherboard that has 10gE onboard rather than as an add-on.
Such is the way of things.
And if the price gets low enough, and the performance is there, and the vendor ecosystem is there, and if it's simply an extended variation of what people already know today (FC) in terms of behavior and management -- well, I would offer that it has a helluva better chance of showing significant enterprise adoption as compared to its predecessor, iSCSI.
At least, from where I sit.
Now, before I get flamed by the iSCSI Fanboy Club (again!), let's balance out the discussion a bit.
iSCSI does very well in certain market segments. That's not going to change. Especially where your storage needs are not advanced enough to warrant a separate FC infrastructure and all that it entails.
And there are lots and lots of people who fall into that category. So, if you're a vendor, and your business model is based on iSCSI, there's still lots of people to sell to.
And there's no reason whatsoever that I can see why both protocols (iSCSI and FCoE) couldn't live on the same physical Ethernet infrastructure, which means some shops might consider FCoE if, for some reason, iSCSI doesn't meet their needs.
So, I'm not seeing this as a winner-takes-all competitive battle between two protocols, although I'm sure I'm going to read at least one article along those lines. And there will be some who feel threatened by this new standard.
Ultimately, in the FC vs. Ethernet discussion, many of us suspect what the end of the movie should look like.
Ethernet wins.
Is this the way forward?
FCoE is very much like AoE (ATA over Ethernet); for more info, see: http://en.wikipedia.org/wiki/ATA-over-Ethernet
What's more, AoE initiators and targets have been baked into the Linux kernel for some time now, and drivers are also available for Windows and Mac OS X.
Like FCoE, AoE operates at the MAC level, so there is no IP routing overhead; in fact, by using separate Ethernet switches, AoE may be used as a poor man's SAN.
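To illustrate what "operates at the MAC level" means, here's a minimal sketch (Linux-only, needs root, and purely illustrative -- the payload is a dummy, not a real AoE header) of pushing a raw Ethernet frame with the AoE EtherType, with no IP or TCP layer anywhere in the path:

```python
import socket
import struct

AOE_ETHERTYPE = 0x88A2   # EtherType registered for ATA-over-Ethernet
IFACE = "eth0"           # assumption: change to your actual NIC name

def send_raw_aoe_frame(dst_mac: bytes, src_mac: bytes, payload: bytes) -> None:
    """Send one raw Ethernet frame carrying an AoE payload.
    No IP header, no routing: the frame only reaches hosts on the
    same Layer 2 segment -- hence the 'poor man's SAN' idea."""
    # AF_PACKET exposes the raw link layer; Linux-only and needs root.
    s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW)
    s.bind((IFACE, 0))
    frame = dst_mac + src_mac + struct.pack("!H", AOE_ETHERTYPE) + payload
    s.send(frame)
    s.close()

if __name__ == "__main__":
    broadcast = b"\xff" * 6
    src = b"\x00\x11\x22\x33\x44\x55"                 # placeholder source MAC
    send_raw_aoe_frame(broadcast, src, b"\x00" * 32)  # dummy payload, not a real AoE header
```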
Yet market take-up has been very slow. It seems that until the big players offer certified implementations of AoE or FCoE, the take-up will remain slow.
With link bonding, we should be able to build InfiniBand-like scalable throughput. So there's no need to wait for 10G Ethernet.
Posted by: Fred-san | April 11, 2007 at 09:06 AM
Excellent angle, Fred-san, that I was aware of, but did not include.
Yes, AoE is part of the spectrum. And, from what I've seen, it has some interesting use cases, not the least of which is addressing my growing sprawl of servers and storage at home.
I would offer that I can see all three working across the spectrum: AoE as entry-level storage networking, iSCSI as mid-range storage networking, and (hopefully) FCoE for the enterprise crowd.
In this scenario, even though we've got different protocols (and storage mgmt models), ethernet wins, which was the rationale for all of this in the first place!
Thanks for reading, and thanks for commenting!
Posted by: Chuck Hollis | April 12, 2007 at 01:24 PM
There are so many things to contest here, but I'll focus on one of them for now. The last time I checked (which was a few years ago), almost all FC SAN traffic between systems and disk subsystems was Class 3 - datagram service - with no latency guarantees whatsoever. Flow control for FC is not even close to what is built into the TCP layer of the Ethernet/TCP/IP stack. The fact is, there are very few people around who actually know how FC works well enough to describe its real-world operations and pathology in detail.
Instead, FC proponents tend to make vague, hand-waving arguments about network latencies without actually explaining what that means - or more importantly what it does not mean. The fact is, network latencies in FC SANs are dwarfed by the latencies introduced by the end nodes - which for SANs turns out to be the speed of the storage controllers most of the time. That's why FC switch vendors stopped making a big deal about the microsecond frame forwarding performance of their switches - the switch latency numbers don't really matter.
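To put rough numbers on that (ballpark figures of my own, purely for illustration -- not measurements of any particular product):

```python
# Ballpark, illustrative numbers only -- not benchmarks of any product.
switch_latency_us = 2        # per-hop frame forwarding time, microseconds
controller_latency_us = 500  # array controller service time (cache hit)
disk_latency_us = 5000       # a random disk I/O when you miss cache

total_us = switch_latency_us + controller_latency_us + disk_latency_us
print(f"switch share of end-to-end latency: {switch_latency_us / total_us:.2%}")
# ~0.04% -- shaving switch microseconds buys you almost nothing
```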
That brings us to the performance of network nodes. iSCSI does use TCP/IP, which does take more host CPU cycles to process. It's worth noting that this is practically meaningless on the subsystem end, where it's no problem to design multiple processors to handle the load. Subsystem latencies have much more to do with the controller architecture, including the design of cache, disk operations and queueing mechanisms. The BIG FUD argument that FC mongers like to wave around is that TOE cards are needed in host systems to deal with latency-sensitive applications. OK, so use TOE then. Big deal. I think people are going to see that today's inexpensive multi-core servers have plenty of excess processing capacity. Looking forward, it's not like Intel, AMD and everybody else making CPUs are going to stop making them faster. The window has practically closed on the "IP is too slow" argument - although the FC priests will continue to sermonize to the faithful.
FC fans are trying to use the fact that very few people understand FC as an advantage over Ethernet/TCP/IP, where the operating limitations are common knowledge. That is so weird. We could divide the markets by size, but that would be doing them a disservice. For me, it's not so much an argument that iSCSI is going to crush Fibre Channel; it's an issue of people actually being able to understand what they have. Clarity of understanding is a great thing.
See, I didn't even write about the problems of managing changes in large enterprise FC SANs and dealing with archaic zoning, LUN masking, and WWN physical addresses!
Posted by: marc farley | April 13, 2007 at 04:16 AM
Hi Marc -- you and I are destined to disagree, aren't we?
Let's step back from the technology here for just a second.
The first thing that jumps out (supported by multiple, independent data sources) is that any enterprise that has made an investment in FC technology is usually not interested in iSCSI technology.
We can argue about the moral wrongness of this, but it is a fact. In the enterprise, iSCSI has stalled, despite everyone's best efforts.
And that audience includes some very smart IT people who look beyond any sort of marketing smokescreen or handwaving.
And when you go talk to those people, and you ask the question "so why aren't you interested in iSCSI?" they will say that it doesn't behave like the FC they know and love.
They buy in to the economics argument. They know they don't need TOEs. These are smart people, Marc.
They just don't buy into the "differentness" of it. And what follows is a long list of concerns that are valid -- by definition -- because they are coming from an educated customer who actually has to use the stuff.
So, rather than convince them that they're wrong, why not give them what they want?
How about the cost structure of ethernet, with the behavior model of FC?
Sounds like a winner to me.
Sounds like a winner for current FC customers.
And it should sound like a winner to you, too ...
But, ultimately, neither you nor I will decide this issue.
The market will.
Best regards, and keep up the blogging!
Posted by: Chuck Hollis | April 13, 2007 at 02:57 PM
Chuck -
Nice blog on FCoE. You make a very good case.
I agree with everything you've said except for one point, and that has to do with 10GE prices being too high. While in general that statement is true, there are cases where enterprise-class Layer 2-3 10GE switches are available for less than $500/port -- a revolutionary price point.
[product/company plug redacted by editor ...]
Posted by: Vikram Mehta | April 19, 2007 at 01:57 PM
I find FCoE very intriguing. I agree that much of the marketplace is driven by perception, and we could argue all day about FC vs. iSCSI, but the fact of the matter is that FC has won the PR war in the enterprise.
However, I love the idea of introducing a new standard that is going to try to leverage Ethernet infrastructure, strip out the IP overhead, and provide what will (hopefully) be perceived as native FC performance. If the PR campaign for FCoE goes well, I could see this as the tipping point for convergence onto a single Ethernet fabric for both.
I also can't wait to see what happens if FCoE becomes a reality and starts to gain market acceptance. For a long time vendors in the FC switch space have been seen as a separate entity from traditional Ethernet vendors. That's changed in the last few years with Cisco, but what next? Every switch and router today knows how to handle IP over Ethernet. If in the future the same holds true for FC, this opens the flood gates to a whole new list of FC switch providers. Brocade is going to have a lot more to worry about than just Cisco. And Cisco is going to have a lot more to worry about than just Brocade.
Posted by: Josh | April 23, 2007 at 05:06 PM
Hi Josh
You and I see many of the same things, don't we?
Thanks for the comment ...
Posted by: Chuck Hollis | April 23, 2007 at 05:56 PM
What a piece of nostalgia :-)
Around 1997, when a team at IBM Research (Haifa and Almaden) started looking at connecting storage to servers using the "regular network" (the ubiquitous LAN), we considered many alternatives (another team even had a look at ATM - still a computer network candidate at the time). I won't walk you through all of our rationale (and we went over some of it again at the end of 1999 with a team from Cisco before we convened the first IETF BOF in 2000 at Adelaide that resulted in iSCSI and all the rest), but some of the reasons we chose to drop Fibre Channel over raw Ethernet were these:
• Fibre Channel Protocol (SCSI over the Fibre Channel link) is "mildly" effective because:
• it implements endpoints in a dedicated engine (Offload)
• it has no transport layer (recovery is done at the application layer under the assumption that the error rate will be very low)
• the network is limited in physical span and logical span (number of switches)
• flow-control/congestion control is achieved with a mechanism adequate for a limited-span network (credits); the packet loss rate is almost nil, and that allows FCP to avoid using a transport (end-to-end) layer (see the toy sketch after this list)
• FCP switches are simple (addresses are local and the memory requirements can be limited through the credit mechanism)
• However FCP endpoints are inherently costlier than simple NICs – the cost argument (initiators are more expensive)
• The credit mechanism is highly unstable for large networks (check switch vendors' planning docs for the network diameter limits) – the scaling argument
• The assumption of low losses due to errors might radically change when moving from 1 to 10 Gb/s – the scaling argument
• Ethernet has no credit mechanism and any mechanism with a similar effect increases the end point cost. Building a transport layer in the protocol stack has always been the preferred choice of the networking community – the community argument
• The "performance penalty" of a complete protocol stack has always been overstated (and overrated). Advances in protocol stack implementation and finer tuning of the congestion control mechanisms make conventional TCP/IP perform well even at 10 Gb/s and over. Moreover, the multicore processors that have become dominant on the computing scene have enough compute cycles available to make any "offloading" possible as a mere code restructuring exercise (see the stack reports from Intel, IBM etc.)
• Building on a complete stack makes available a wealth of operational and management mechanisms built over the years by the networking community (routing, provisioning, security, service location etc.) – the community argument
• Higher level storage access over an IP network is widely available and having both block and file served over the same connection with the same support and management structure is compelling – the community argument
• Highly efficient networks are easy to build over IP with optimal (shortest path) routing, while Layer 2 networks use bridging and are limited by the logical tree structure that bridges must follow. The effort to combine routers and bridges (rbridges) promises to change that, but it will take some time to finalize (and we don't know exactly how it will operate). Until then, the scale of Layer 2 networks is going to be seriously limited – the scaling argument
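To make the credit argument concrete, here is a toy model of buffer-to-buffer credits (greatly simplified -- the real FC machinery, with R_RDY primitives and login-negotiated counts, is considerably more involved). The sender spends one credit per frame and stalls at zero, so nothing is dropped; but a longer or faster link needs proportionally more buffers to stay full, which is exactly the scaling problem:

```python
from collections import deque

class CreditLink:
    """Toy model of buffer-to-buffer credit flow control (FC-style).
    The sender starts with credits equal to the receiver's buffer count,
    spends one per frame, and gets one back when the receiver drains a buffer.
    Nothing is ever dropped -- the sender simply stalls at zero credits."""

    def __init__(self, receiver_buffers: int):
        self.credits = receiver_buffers
        self.rx_queue = deque()
        self.stalls = 0

    def try_send(self, frame: str) -> bool:
        if self.credits == 0:          # no credit -> wait, don't drop
            self.stalls += 1
            return False
        self.credits -= 1
        self.rx_queue.append(frame)
        return True

    def receiver_drain(self) -> None:
        if self.rx_queue:
            self.rx_queue.popleft()    # buffer freed ...
            self.credits += 1          # ... credit returned to the sender

if __name__ == "__main__":
    # A longer or faster link (more frames in flight) needs more buffers
    # and credits to keep the pipe full -- the scaling problem for big fabrics.
    link = CreditLink(receiver_buffers=4)
    for i in range(20):
        link.try_send(f"frame-{i}")
        if i % 3 == 0:                 # receiver drains slower than sender offers
            link.receiver_drain()
    print(f"sender stalled {link.stalls} times waiting for credits")
```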
As a side argument – a performance comparison made in 1998 showed SCSI over TCP (a predecessor of the later iSCSI) performing better than FCP at 1 Gb/s for block sizes typical of OLTP (4-8KB). That was what convinced us to take the path that led to iSCSI – and we used plain-vanilla x86 servers with plain-vanilla NICs and Linux (with similar measurements conducted on Windows).
The networking and storage community acknowledged those arguments and developed iSCSI and the companion protocols for service discovery, boot etc.
The community also acknowledged the need to support existing infrastructure and extend it in a reasonable fashion, and developed two protocols: iFCP (to support hosts with FCP drivers and IP connections to connect to storage by a simple conversion from FCP to TCP packets) and FCIP (to extend the reach of FCP through IP by connecting FCP islands through TCP links). Both have been implemented and their foundation is solid.
The current attempt at developing a "new-age" FCP over an Ethernet link goes against most of the arguments that gave us iSCSI etc.
It ignores networking layering practice, builds an application protocol directly above a link (and thus limits scaling), mandates elements at the link and application layers that make implementations more expensive, and leaves aside the whole "ecosystem" that accompanies TCP/IP (and not Ethernet).
In a related effort (and at one point also while developing iSCSI) we considered moving away from SCSI (as some non-standardized, but popular in some circles, software did – e.g., NBP), but decided against it. SCSI is a mature and well-understood access architecture for block storage and is implemented by many device vendors. Moving away from it would not have been justified at the time.
Posted by: Julian Satran | April 24, 2007 at 02:54 PM
Hi Julian
One thing I've noticed about IBM people is that we're always treated to a history lesson on any topic that comes up.
Don't know if they train you to do this, but it seems consistent. In this case, it's definitely interesting, although I got a slightly different retelling of events from some other folks.
And I always have a tough time creating a relevancy bridge between the history lesson and where affairs are today. This is no exception.
I think IBM had the right idea with early adoption of iSCSI. Maybe you were ahead of the market with the first products. Too bad the customers weren't there, and you had to withdraw your early offerings.
As I perceive it today, IBM has very little in the way of iSCSI offerings outside of its reseller agreement with NetApp. So much for history and theory.
My main philosophical debating point would be the need for a rich networking protocol stack (e.g., routing, et al.) to solve a fairly straightforward storage connectivity problem.
Much ado has been made about FC (and FCoE) lack of routing over extended links.
My consensus view (speaking albeit from an EMC and large enterprise perspective) is that routing and similar capabilities do not need to be an integral part of any storage connectivity approach.
Historically, they seem to have caused more problems than they have solved.
And when I look at how customers are using FCIP and related approaches, there doesn't seem to be much concern about routing being an accessory rather than a fundamental attribute.
The great part about this debate is that neither you nor I will decide this.
Much like iSCSI, it will be decided in the marketplace by paying customers, and I for one can't wait to see how this one plays out.
Well constructed post, Julian, so thanks!
Posted by: Chuck Hollis | April 24, 2007 at 08:30 PM
Hi,
This is a very interesting blog, thanks! I'm just wondering something: does FCoE adhere to common FC practices such as WWN initiator/target zoning? If so, how will my current Ethernet switches be able to implement this? Will I need to purchase new switch models that can do FC zoning as well as carry standard TCP/IP network traffic?
Posted by: George Parker | October 15, 2007 at 08:05 PM
Hi George, thanks for reading and commenting.
I am not an expert here, but I will share what I think I know.
The goal will be to have FCoE look and smell just like FC, except using raw Ethernet as a transport. I would fully expect all common FC practices to be supported in a transparent manner, including WWN initiator/target zoning and the like. If it doesn't, then it won't meet market requirements, and it'll be DOA.
As far as specific switch hardware support for FCoE, I think that's up to the vendor you're using. I would not be surprised if we're talking New Stuff all around here. At least, you can make sure there's some sort of upgrade path.
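For readers less familiar with zoning, the underlying idea is just an access-control table keyed by WWNs. Here's a toy sketch of the concept (the WWNs and zone names are invented, and no switch vendor implements it literally this way):

```python
# Toy model of WWN-based zoning: a zone is a named set of WWNs, and an
# initiator may only talk to a target if they share at least one zone.
# (Illustrative only -- the WWNs and zone names here are invented.)

zones = {
    "oracle_prod": {"10:00:00:00:c9:11:22:33",   # host HBA (initiator)
                    "50:06:01:60:44:55:66:77"},  # array port (target)
    "backup":      {"10:00:00:00:c9:aa:bb:cc",
                    "50:06:01:61:44:55:66:77"},
}

def can_communicate(wwn_a: str, wwn_b: str) -> bool:
    """True if the two WWNs are members of at least one common zone."""
    return any(wwn_a in members and wwn_b in members
               for members in zones.values())

print(can_communicate("10:00:00:00:c9:11:22:33",
                      "50:06:01:60:44:55:66:77"))   # True  -- same zone
print(can_communicate("10:00:00:00:c9:11:22:33",
                      "50:06:01:61:44:55:66:77"))   # False -- no shared zone
```

The appeal of FCoE is that this model would carry over unchanged; whether a given switch can enforce it is, again, a vendor question.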
Posted by: Chuck Hollis | October 16, 2007 at 11:50 AM
Hi Chuck, Thanks for sharing with us your view about FCoE.
You must have already seen the FCoE demo by QLogic, NetApp, and Nuova (Cisco) at the SNW conferences.
I also saw Intel's recent news release saying their latest 10G Ethernet chip will fully support FCoE (http://www.intel.com/pressroom/archive/releases/20070928fact.htm?iid=search).
Intel is one of the two major NIC vendors (the other is Broadcom), and the vendors involved in the SNW FCoE demo mentioned above are also big names. It gives me the impression that the big players are acting very fast to push FCoE technologies, considering the fact that the FCoE draft was just proposed in April.
Posted by: Evelyn | October 19, 2007 at 03:35 PM
Hey Chuck,
What do you think of Cisco's new NEXUS platform and the announcement of future support for FCoE? What is EMC's positioning on supporting the NEXUS? How will this change the Support Matrix concept when the network is no longer FC, etc.?
Keen to get your and EMC's read on this new product.
Cheers,
Greg.
Posted by: CIsco NEXUS Platform | February 11, 2008 at 08:58 PM
Hi Greg, how are you?
I am personally a fan of the Nexus platform on several levels -- there's a lot of good strategic thinking that went into it, and I could go on for quite a while.
You raise some good points around FCoE and qualification activities.
First, if you're a regular reader of this blog, you'll know that I'm a big fan of FCoE in large enterprises. It just works for me on so many different levels. I keep a close eye on what EMC is doing in the space because I think it's important.
As far as the qualifications go, we'd like to get to a "just works like ethernet" model, but we may not get there in the first go-around. It's very possible that, as with any early standard, there's a fair amount of qual/interop work to be done at the outset, and far less as the standard (and implementations) mature.
Personally, I'd like to see a world where storage network interop was more like data network interop, e.g. it just works. FCoE looks like it'll get us closer to that than FC, iSCSI, etc.
Thanks for writing!
Posted by: Chuck Hollis | February 12, 2008 at 12:53 PM
Hi Chuck,
We currently do Fibre Channel but are looking to change the way we do storage (iSCSI). We are looking at bringing in the Cisco Nexus switch; do you think FCoE will start arriving?
Thanks,
Paul
Posted by: Paul | February 20, 2008 at 03:57 PM
Many FCoE products were announced at today's SNW. I think this may be a sign that FCoE is ready.
Posted by: evelyn | April 08, 2008 at 09:58 PM
Chuck,
Something that isn't mentioned here is IHV adoption. You're right to say that FCoE HBAs on the motherboard are the key to driving down price. But the question is, are the IHVs ready to put an FCoE HBA on their motherboard versus a plain-vanilla NIC? If the answer is no, the game is up. Standup FCoE HBAs will never be at the price points that LOM will be.
Even if FCoE HBAs do make it to the motherboard, will they ever be as cheap as a plain vanilla LOM? We all know the answer to that (no).
So, in the end, the IHV -- who have slim margins on their product -- will find a way to cut costs by putting a cheaper NIC on their motherboard.
I won't go into the likelihood of vendors shipping LOM (I'll post in another blog), but look at the investment the current IHVs have in iSCSI (Dell-Equallogic, HP - MSA2000i). That's capital not easily thrown away.
Posted by: Drue Reeves | April 10, 2008 at 07:01 PM