
August 28, 2008

Comments

Stephen Foskett

Hooo boy, Chuck, you do like walking into minefields, don't you? I want this kind of apples-to-apples comparison to work, but it'll never be fair... See my thoughts on the topic:
http://blog.fosketts.net/2008/08/28/grapples-and-tangelos-why-its-impossible-to-compare-fairly/

Chuck Hollis

Hi Steve, I wanted to leave a comment on your blog, but didn't feel like signing up to do so.

You brought up some interesting points, as follows ...

Steve writes: Does EMC really support using the five vault reserve disks on a CLARiiON to hold production data?

Answer: yes, we do.

Steve writes: Would EMC really suggest 8+1 RAID 5 for a production Exchange and SQL Server environment?

Answer: yes -- and we've got the test data to back it up. The performance and availability characterizations are publicly available, I think.

Steve writes: Is one hot spare per two DAEs (30 drives) really sufficient for a whole pile of 9-disk RAID 5 sets that are maxed-out with production data? I’d feel much more comfortable with a few more spares with such large RAID 5 sets.

Answer: the configuration has 4 global hot spares, so it's not useful to think of it that way. Also, the proactive sparing algorithm means we usually get to a drive before it totally fails. CLARiiON engineering considers this recommendation "conservative".

However, please add as many drives as you need to feel comfortable. We can always make more :-)

Steve writes: There is no way 14+2 RAID DP is equivalent to 4+1 RAID 5, let alone 8+1. It’s in a different league of reliability.

Answer: although we'd disagree with that statement in the presence of global hot spares, we just went with what each vendor recommended.

Steve writes: Yeah, NetApp’s space reserve recommendation stinks. But you probably won’t need 100% in production - the real amount is something one would work out when testing and piloting and is probably substantially less than this.

Answer: gee, NetApp's pitch is all about simplicity. You mean I have to run a bunch of trials to get to the optimum setting? That wasn't in the marketing deck I saw!

Also, let's not forget that the penalty for getting things wrong is a catastrophic application crash, which is pretty severe ...

Steve, we simply went with what each vendor recommended, and was willing to support. There are no value judgments here on "right" or "wrong".

I know you think I'm putting my neck out there, but I think this is a good discussion to have.

The differences in approaches are striking, wouldn't you agree?

-- Chuck

ripvan

Chuck,
if this is really how you would suggest that I, as a customer, should set up my CX array for production, please configure an array in this manner, subject it to benchmark testing, and post the results to

http://www.storageperformance.org

Yes - benchmarks are not perfect, but they are certainly the best source for getting some comparative information about available arrays on the market, instead of listening to your local disk peddler.

Chuck Hollis

I think you might be misunderstanding what the SPC is all about.

The goal is for vendors to configure their arrays for optimum results on the SPC test, disregarding things like availability, recoverability, real-world application profiles, manageability, cost-effectiveness, etc. It's all about the number, baby ...

Oh so many problems with the SPC. I really don't want to cover all that again. Go look a bit closer, and then we'll chat in detail.

BTW, check out the latest SPC flimflammery from IBM here: http://news.cnet.com/8301-13924_3-10028216-64.html?tag=mncol

Thanks for writing.

Lee Razo

Ahhh thanks for that, Chuck. You made my day.

As one of NetApp's competitive analysts, I always look forward to checking in with you to find out what the latest variation of the EMC shell game is. You should seriously consider going into politics.

Why is it that whenever you guys talk about capacity, you manage to artfully avoid any detailed discussion of performance or resiliency? For that matter, the same seems to happen when you talk about performance. Or resiliency. Why is it that any talk of performance, resiliency or efficiency are never seen in the same room together at EMC?

In this case you even casually mentioned in passing that NetApp FAS comes standard-configured with high-resiliency, zero-penalty RAID 6 - as if CLARiiON RAID 5 was even in the same league (I won't even go into the snapshot performance impacts - the real reason why you guys don't like SPC-1)

Don't worry, though, Chuck. The good news is that the EVA is far worse than the CLARiiON by all measures. You were quite generous in your comparison even though the EVA configuration you described would be barely usable after the first snapshot due to EVA's own unique inefficiencies.

Seriously though, I'd be more than happy to give you a few pointers on how an EVA (or a NetApp FAS for that matter) works in the real world.

It'll be fun, we could compare it to the CLARiiON and make a drinking game out of it - you have to take a shot every time there is a caveat in your best practice guide.

--Lee

Chuck Hollis

Glad I made you happy. Go back and read the first third of the post again, and you'll get the answer to your first observation.

As I pointed out in the post, the idea behind this is that each vendor makes recommendations for Exchange environments that provide the best combination of performance, capacity, availability, recoverability and manageability.

EMC makes ours, HP makes theirs, NetApp makes yours. All three vendors publish recommendations for this particular use case, which I found interesting.

The purpose of this exercise was not to say that any one vendor's recommendations were "bad" or "good", just to look at each of them from the perspective of capacity efficiency.

And in block mode environments, y'all don't seem to stack up so well. Maybe you should stick to smaller filers storing tier 2 data.

Best regards!

Lee Razo

That all sounds plausible, but you actually need to choose current, and "like for like" best practices, not just cherry-pick the ones you like.

To name two:

1) If you're going to use RAID-DP on a NetApp then you either need to use RAID-1 on the CLARiiON (and sacrifice half the capacity) or RAID-6 (and sacrifice 60% of the write performance). No such sacrifice is needed on the NetApp.

2) Your NetApp snapshot information is outdated. You are not required to reserve 100% for block snapshots nor is it necessary to do any kind of "trial and error". We've supported the functionality to automatically manage snap reserve the same way the CLARiiON or EVA would for quite some time now.

Seriously, man. Get with the times! Technology marches on.

--Lee

Chuck Hollis

Sorry, nice try, but no cigar.

Like I said before, we went with each vendor's published recommendation. If you don't like EMC's and HP's use of RAID 5, you need to change NetApp's standard recommendation of RAID 6. Feel free to argue your case that it's "better", but that's not the point here.

If NetApp has updated its recommendations for Exchange environments in terms of snap reserve, please point me to the appropriate documentation, and I will be more than glad to update this piece.

Maybe technology marches on, but perhaps you need to update your Exchange recommendations?

Most of your customers default to what the vendor recommends, which costs them big bucks, which is exactly the point of this comparison.

Cheers!

TomT

The flaws in your analysis of the EVA are so blatant that reviewing just one of the documents you cited will make this abundantly clear. You have disregarded most of the key recommendations presented in "HP StorageWorks 4x00/6x00/8x00 Enterprise Virtual Array Configuration Best Practices White Paper". These recommendations are not easily missed - they are repeated multiple times in the text, listed in tables, set off from the main paragraphs and even highlighted in color.

While there are many, the oversight that invalidates your EVA analysis more than any other is your use of disk groups. Configuring 7 disk groups is contrary to one of the most fundamental best practices for the EVA. As clearly stated in the aforementioned document:

Best practice to minimize the cost of storage: Create as few disk groups as possible
Best practice to optimize performance: Configure as few disk groups as possible

Besides contradicting HP's recommendations, configuring an excessive number of disk groups is not representative of real world EVA implementations. The vast majority of EVA arrays ever implemented have been configured with one or two disk groups.

Your use (or misuse) of disk groups dramatically skews the capacity efficiency calculations to the point of being ridiculous.

My advice to you....next time RTFM.

Stephen Foskett

I've had EMC SE's tell me that the vault reserve disks should only be used for low-I/O static data or not at all to avoid system-wide performance impacts. Were they wrong?

I've also had them tell me that one should configure twice that many spares at least... Overly cautious?

I love the idea of trying to set up an apples-to-apples comparison based on usage, but think EMC's not the best person to make the judgment. How about we have a third party set up a scenario and let each vendor respond with their own proposed configuration, using their own knowledge of best practices and experience to decide how many drives to reserve for this and that?

Chuck Hollis

Hi TomT -- would you mind posting a public link to that HP document?

It'd be very helpful -- thanks!

As far as the number of disk groups we used, we saw multiple references to the need to isolate different workloads on the EVA for performance reasons.

At 120 (usable) disks, it's highly unlikely that it'd be all supporting a single application, wouldn't you agree?

If you go back to the post, examine the specific use case we outlined, and have a different number of HP-recommended disk groups, would you mind sharing that here?

Thanks for your comments.

And, trust us, we did RTFM -- in detail.

-- Cheers!

Chuck Hollis

Hi Stephen

Yes, that used to be the recommendation, but no more. My understanding is that these drives are used infrequently. I'll get someone more authoritative to back me up on this in a moment.

As far as over-configuring by field personnel in the interest of excessive caution, it's a phenomenon that's probably not limited to EMC. We publish engineering-supported recommendations for customers and field personnel alike. Sometimes people think they know better :-)

As far as having a third party do all of this -- yes, you have a point -- but no obvious candidate springs to mind.

And we don't want a replay of the SPC fiasco.

Thanks!

TomT

Chuck,

The Best Practices document to which I referred is the same one you listed under HP references....you know....the one you read...in detail.
http://h71028.www7.hp.com/ERC/downloads/4AA0-2787ENW.pdf

When I have some time I'll see about addressing your other questions....gotta take off.

TomT

Chuck Hollis

Hi Tom T:

Good, so we're talking about the exact same document!

So, do you want me to cut and paste the sections we read in it regarding the need for performance isolation between different disk groups (hence leading to our configuration of 7 disk groups for 120 usable disks, which HP users tell us is typical), or would you choose to pursue another angle on this?

Let me know ... thanks!

Jen LeClair

The first four drives of the vault contain the SP’s operating system. After the storage system is booted, the operating system files are only occasionally accessed. Vault drives may be used for moderate host I/O loads, such as general-purpose file serving. For these drives, restrict usage to no more than shown:

MAXIMUM HOST I/O LOADS FOR VAULT DRIVES

Vault HD Type     Max. IOPS    Max. Bandwidth (MB/s)
Fibre Channel     100          10
SAS               100          10
SATA              50           5
SSD               N/A          N/A

In general, when planning LUN provisioning, distribute busy LUNs equally over the available RAID groups. However, if LUNs are hosted on RAID groups composed of vault drives, assume when planning that a busy LUN has already been provisioned on those drives. This accounts for the CLARiiON's own vault drive utilization.

For example, assume five equally busy LUNs need to be provisioned, and that they need to be allocated to three RAID groups, one of which is necessarily composed of the vault drives. Place two of the new LUNs on each of the two non-vault-drive RAID groups (for a total of four LUNs), and place a single new LUN on the vault-drive RAID group. Why? Because, before the new LUNs are provisioned, the RAID group built from the vault drives already carries a busy LUN's worth of load from the CLARiiON itself.
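
A small sketch of that placement rule in Python, purely for illustration -- the group names and LUN count below are hypothetical, and the only point is that the vault-drive RAID group starts out "pre-loaded" with one busy LUN's worth of work:

```python
# Illustrative sketch of the placement rule above: when spreading equally busy
# LUNs across RAID groups, treat the vault-drive RAID group as if one busy LUN
# were already provisioned on it. Group names and LUN counts are hypothetical.

def place_luns(new_luns, raid_groups, vault_group):
    """Assign LUNs to RAID groups, keeping busy-LUN counts balanced."""
    load = {g: 0 for g in raid_groups}
    load[vault_group] = 1          # vault group already carries a busy LUN

    placement = {g: [] for g in raid_groups}
    for lun in new_luns:
        target = min(load, key=load.get)   # least-loaded group so far
        placement[target].append(lun)
        load[target] += 1
    return placement

if __name__ == "__main__":
    groups = ["RG0 (vault)", "RG1", "RG2"]
    luns = [f"LUN{i}" for i in range(5)]
    print(place_luns(luns, groups, vault_group="RG0 (vault)"))
    # Two new LUNs land on each non-vault group, one on the vault group.
```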

Recommendations for AX4-5 and earlier CLARiiON storage systems:

For AX4-5 and the earlier CLARiiONs, the CX4 considerations above apply. The reserve LUN pool, clone, and mirror LUNs should not be placed on the vault drives. Also, avoid placing any response-time sensitive data on the vault drives.

A vault drive failure will normally result in rebuild activity on these drives and a disabled write cache. The HA vault option in Navisphere should be cleared if response-time-sensitive access to data on these drives is needed under this condition. With the HA vault option disabled, the cache stays enabled during a vault drive rebuild, at the expense of a small amount of protection for cached data. This is especially important on systems using SATA drives in the vault, due to the long time needed for these drives to rebuild and for the cache to re-enable.


BrianS

I would be glad to post the link to the HP EVA best practice guide.

http://h71028.www7.hp.com/ERC/downloads/4AA0-2787ENW.pdf

As for the definitive answer on EVA disk groups: best practice is to always have as few disk groups as possible. Typically, heavy sequential and heavy random workloads are split, which usually means transaction logs get moved to a different disk group -- which is also an availability best practice. I have deployed dozens of customer EVAs, and typically no more than 1 or 2 disk groups are used or needed.

The magic of the EVA is the ability to stripe across all of the spindles in the disk group. This obviously reduces the RAID-group-specific hot spots that traditional arrays have to deal with, especially when workloads change unexpectedly. However, in very specific cases, requirements may dictate guaranteed service levels for a certain workload, and then a deviation from this practice may be warranted.

With Exchange 2003 this is evident in the lack of heavy database caching, due to the 32-bit nature of the code base. That requirement translates directly into requirements for the transaction response times the array must deliver to provide reasonable user response. Exchange 2007 greatly changes the I/O requirements with the move to 64-bit: the rule of thumb is that the IOPS-per-user requirement for Exchange 2007 went from 1.0 IOPS/user to as low as 0.3 IOPS/user. This represents a significant change in Exchange planning and relaxes the hard requirements for response times. Here are the EVA Exchange 2007 best practices: http://h71028.www7.hp.com/ERC/downloads/4AA1-4704ENW.pdf
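
A rough sketch of what that IOPS-per-user shift means for spindle counts. The per-drive IOPS figure and the user count below are assumptions chosen purely for illustration; real sizing also has to factor in RAID write penalties, read/write mix and latency targets:

```python
# Back-of-envelope spindle estimate from the IOPS-per-user rule of thumb above.
# The 180 IOPS-per-drive figure and the 5,000-user count are assumptions.
import math

def spindles_needed(users, iops_per_user, iops_per_drive=180):
    return math.ceil(users * iops_per_user / iops_per_drive)

if __name__ == "__main__":
    users = 5000
    print("Exchange 2003:", spindles_needed(users, 1.0), "drives (IOPS only)")
    print("Exchange 2007:", spindles_needed(users, 0.3), "drives (IOPS only)")
```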

In the end, with an EVA it comes down to IOPS. An EVA has one big bucket of IOPS for mixed workloads to fit into, rather than the several smaller buckets of a traditional array.

Chuck Hollis

Thanks Jen!

Chuck Hollis

Thanks, Brian, very useful.

All storage arrays are "big buckets of IOPS", right? And differences in I/O profiles between Exchange 2003 and 2007 pretty much play out for everyone the same way.

My impression is that most EVAs are somewhat smaller than the configuration discussed here, where your "1 or 2" comment might be appropriate.

As mentioned in the post, please consider a significantly larger configuration (e.g. 120 usable disks) where you have multiple performance-intensive applications and desire a degree of workload separation.

Would you offer up a different suggestion?

Thanks

John Ashton

Chuck,

Funny how you gloss over anything about a mixed workload requirement on the CX array. You simply state that there are best practices for other applications and foolishly claim no such practices exist for competitive products. Of course this claim can be easily dismissed by anybody with access to Google and half a brain.

Try plugging in "SAP EVA best practices" or even "SAP NetApp best practices" for example.

Way to increase your credibility Chuck! So now customers are going to think you are an authority on the competition?

I certainly appreciate the fact you decided to include multiple workloads on the arrays in question. After all this is the sort of thing most benchmarks overlook. They quaintly assume if I buy an array I only run one application at a time on it. Perhaps if my SAN had the same price point as those pesky servers but I digress...

So now we have three arrays running Oracle & SQL workloads.

Just one problem.

You didn't do the research to see what happens when you actually add these workloads.

It is so much more fun to extrapolate, isn't it? I mean, after all, the requirements for Exchange are identical to Oracle and SQL, right?

You don't apply EMC best practices for these additional workloads. Heck your own engineers would chastise you for doing this...

So at the end of the day your "efficiency math" is based on all disks allocated to Exchange for the CX.

But wait a second. What about the competition?

Can we penalize them? You betcha!

Watch as the "comparison" unfolds...

One glaringly obvious one I just have to ask a question about:
That lovely EVA multicolored chart has a "file share". Wait a second, is this a typo? After all, I could have sworn you wrote:

"Conversely, I don't think it's reasonable to extend this sort of comparison to file serving"

So some clarification would be nice...

So many questions Chuck...
How did you come up with the workload?
Are you applying EMC best practices to the additional workloads on the CX?
Why is there no mention of Oracle or SQL workloads (and that pesky file share) for either the NetApp or EMC arrays?
What sort of configuration are you optimizing for here? Availability? Performance? Cost?
Were these configuration goals communicated?
Were the same goals applied for each array?

As you can see: plenty of questions, an unhealthy amount of spin, and very little in the way of answers.

Your title is VERY accurate....your user capacity may vary...

Especially with:
* ambiguous goals
* a lack of transparency
* a fantasy marketing comparison v. real world configurations

But hey, I'm not a VP just one of the schmucks that has to make sure all of this stuff actually lives up to the hype spread by marketing types to clueless executives...

What's your excuse?

Chuck Hollis

Hi John

Although you might be making some valid points, it's very hard for me to get past the snide, insulting and sarcastic tone.

Is this the way you routinely speak to people you interact with?

Scott Fuhrman

It's important to note the noticeable difference in usable disk space on a CLARiiON vs. a Symmetrix.

Take two identical, 300GB FC drives.

On CX arrays, we get 288,195,870,721 bytes, or 268.40332 gigabytes usable per 300GB drive.

On DMX arrays, we get 585,937,499 blocks * 512 bytes/block, equaling 299,999,999,488 bytes, or 279.396772 usable gigabytes per drive.

So on a CX, we lose 11GB more per drive than we do on a Symm. That's nearly 4% right off the bat. It's important to include the details.
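
A quick check of that arithmetic (taking the byte counts above as given; the "gigabytes" here are binary GiB):

```python
# Verify the per-drive usable capacity difference quoted above.
cx_bytes  = 288_195_870_721            # CX usable bytes per "300GB" drive
dmx_bytes = 585_937_499 * 512          # DMX: 299,999,999,488 bytes

GiB = 2**30
print(f"CX usable : {cx_bytes / GiB:.3f} GiB")
print(f"DMX usable: {dmx_bytes / GiB:.3f} GiB")

diff = dmx_bytes - cx_bytes
print(f"Difference: {diff / GiB:.2f} GiB per drive "
      f"({diff / dmx_bytes:.1%} of the DMX figure)")
```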

Chuck Hollis

Ok John -- you deserve a response.

1. Your first point stems from a less-than-careful reading of what I wrote. I simply said many vendors don't supply this material. I did not claim that HP or NetApp was in this category.

Yes, I've seen the docs you reference. The HP one is good; the NetApp one for FC strikes me as fluffy.

2. EMC has characterized multiple workloads against all of our arrays, it's an essential and ongoing part of our business model. Many mountains of detail to share here if anyone is interested ...

We have not done the same for competitive arrays in a consistent fashion, though. For this capacity exercise, we focused on the need for workload isolation. In the cases of CX and FAS, there was no capacity impact. For EVA, alas, workload isolation tends to cost you in terms of capacity efficiency.

3. The error you point out in the pictures is my fault. I grabbed an earlier version of the GIF file (same numbers, different headers). I will go fix that -- thanks.

4. As far as consistency in workloads between the three, I could have been more clear here.

In terms of configuring for availability and protection, our Exchange recommendations are almost identical to our Oracle, SAP, et al. recommendations. Whether you use it all for Exchange, or for a mix of 8 different applications that all require decent performance and availability (as well as workload isolation), the numbers regarding capacity efficiency are virtually the same.

5. In terms of goals, I think the post is self-explanatory: configure 120 disks of usable capacity for a performance-sensitive and availability-sensitive environment (e.g. Exchange), and see how capacity efficiency stacks up. Since all three vendors have relatively current recommendations for Exchange, we have a rough standard between the three.

If it was all about performance, there'd be different recommendations from all three vendors. If it was all about cost, same thing. And if it was all about uber-availability, ditto.

The point I'm making is that each vendor chose their optimum saddle point between the three. You're free to argue that various vendors are making the wrong tradeoffs with their recommendations (e.g. Stephen Foskett's point), but that's a separate discussion, isn't it?

Please go re-read what I wrote, and, if you still have concerns, please let me know.

-- Chuck

Martin G

Okay, user here!!! We have just been through an exercise comparing various vendors, and one of the metrics was how efficiently each vendor used raw disk for capacity, so that we could work out how much it would cost to write a true terabyte of usable data to disk. Comparing raw disk costs is a mug's game, and although I would struggle to hit Chuck's figures for NetApp, we got it down to about 50% for NetApp (including WAFL overhead, a snapshot reserve of 10% I think -- I don't have my calculations with me -- hot spares, RAID-DP, etc.). For EMC, we were pretty close to Chuck's figures. To be honest, IBM's figures for the DS4xk weren't far off 60-odd percent either.
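
To show how those overheads stack up, here's a sketch of the multiplication involved. Every factor below is an assumption picked for illustration (parity ratio for 14+2 RAID-DP, a guess at spares, drive right-sizing, WAFL reserve, snapshot reserve), not the actual worksheet; the real figures vary by drive type and configuration:

```python
# Illustrative raw-to-usable calculation: each overhead multiplies the previous
# result, which is why the end figure drops toward 50-60%. All factors are
# assumed values for the example, not measured ones.
def usable_fraction(parity=14/16,        # 14 data drives per 16-drive RAID-DP group
                    spares=0.97,         # assumed fraction of drives not held as spares
                    rightsizing=0.82,    # assumed right-sizing (e.g. 146 GB -> ~119 GB)
                    wafl_reserve=0.10,   # assumed WAFL/aggregate overhead
                    snap_reserve=0.10):  # snapshot reserve as in the comment above
    return parity * spares * rightsizing * (1 - wafl_reserve) * (1 - snap_reserve)

print(f"Example usable fraction of raw: {usable_fraction():.0%}")
# With these assumptions, roughly 56% of raw ends up usable; tweak the inputs
# and it is easy to land anywhere around the ~50% figure quoted above.
```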

We have DMX, CX, FAS and DS all on the floor, and they all do a job. I'd always try to avoid using the FAS as a Fibre Channel array; it's not its sweet spot. And for NAS, we are leaning toward a V3170 in front of a CX4. So you both win ;-)

Chuck Hollis

Thank you Martin!!!

Now, if you could go back and check your FAS numbers and include all that RAID-DP stuff and other overhead? It'd be *very* useful -- thanks!

Chuck Hollis

Hi Scott -- interesting findings between DMX and CX.

Part of that (1.5%?) is due to the 520-byte sectors used on the CX (versus 512-byte sectors), I believe. Don't know where the rest of it is coming from, though. I'll go ask one of our superduper experts about this ... because I'm interested as well.

Thanks for writing!

Chuck Hollis

TomT:

Ah yes, I found the passage in the HP EVA documentation I was looking for.

According to HP documentation, there is no performance advantage to workload isolation on the EVA (something I would seriously doubt), but there is an availability advantage.

Page 13:
"Best practice to optimize availability: For critical database applications, consider placing data files and recovery files in separate disk groups."

"Best practice to optimize availability: Assign snapclones to a separate disk group."

Still, if I've got application A lighting up all the spindles, wouldn't I want to keep it separate from applications B, C and D?

Or is the design point of the EVA more egalitarian, e.g. no way to manage QoS on the device?

Thanks ...

John Ashton

1) No, I don't think I misread your content. You stated:

"Although many vendors don't publish recommendations for other high transactional-rate applications such as Oracle, SQLsever, SAP etc. (EMC does, though) I think it's reasonable to extend Exchange findings to these use cases."

As I pointed out you can find recommendations on the applications mentioned for the vendors compared.

2 & 5)

You stated:
"For this capacity exercise, we focused on the need for workload isolation. In the cases of CX and FAS, there was no capacity impact. For EVA, alas, workload isolation tends to cost you in terms of capacity efficiency."

Then you go on to state:
"If it was all about performance, there'd be different recommendations from all three vendors. If it was all about cost, same thing. And if it was all about uber-availability, ditto."

Workloads can be isolated in different ways. Using an EVA you could create separate disk groups (as you did) or vDisks which would be the equivalent to LUNs on an EMC or NetApp environment.

vDisks still provide isolation for workloads, and since all performance is striped across all disks, this provides a good balance of performance and isolation. If you were trying to achieve a good balance between isolation and performance, fewer disk groups would make sense. The point here is that the workload is assumed to fit within the performance boundaries of each array, which means building multiple disk groups is a choice - not a requirement.

So why not use vDisks instead of disk groups?

This would allow for a better "apples to apples" comparison. Of course I am assuming that was the point of this exercise and not simply maximizing your positioning...

You can argue that best practices should be updated, and everyone could probably stand some updates. After all, customers are moving to Exchange 2007. Still, is the point to compare disk arrays or best practices?

While it may not be optimized for absolute performance, you clearly state:

"In terms of goals, I think the post is self-explanatory: configure 120 disks of usable capacity for a performance-sensitive and availability-sensitive environment (e.g. Exchange)"

Which means ultimately you have multiple competing goals which require balance. If the true goal was just about performance you would have used RAID 0. If it was truly about availability you would have used RAID 1.

Chuck Hollis

Hi John

1. Point noted.

2. There seems to be some debate as to whether vDisks provide true workload isolation, e.g. IOs from application A will never interfere with IOs from application B. Your thoughts?

3. Maybe one could argue that we used too many disk groups on the HP, but there's an availability aspect to this as well, as noted in previous comments.

If it were you, and you looked at the brief description I provided for the environment above, how many disk groups would YOU configure in a 120-usable-disk EVA?

Thanks.

Stephen Foskett

A-ha! Jen LeClair supplies the missing info: "avoid placing any response-time sensitive data on the vault drives." That's what I was remembering, and probably what the EMC SE's were referencing.

Although this discussion is seriously flawed because it takes place on a vendor's blog, I think it's an incredibly valuable one to have. The fundamental point is correct: No two arrays are alike enough to compare on a numerical disks-and-IOPS basis. As is Chuck's comment that each vendor makes their own decisions regarding performance, cost, and reliability (pick two, right?)

The only valid comparison comes from adequately specifying your technical requirements and having each vendor suggest their own solution. EMC can never make NetApp or HP happy by suggesting how to use their systems, as evidenced by the predictable responses here.

Chuck Hollis

Stephen, now you're understanding my motive here -- I want knowledgeable users to ask informed questions from their vendors -- and demand intelligent responses -- nothing more.

Unfortunately, the only tool I have at my hands to do this is my (flawed) vendor blog.

-- Chuck

John Ashton

2. The architecture of the EVA allows you to spread all IO traffic across every spindle. The traffic is isolated at a physical level.

Further isolation can be had by creating disk groups. These groups essentially put a fence around a set of physical drives, allowing traffic to run across all drives in a group but not across the whole array.

Creating a vDisk simply abstracts the isolation from the physical boundary. Think of it like striping together multiple physical disks into a LUN.

Since most people don't know what to do in this model, carving an EVA up into multiple disk groups is the easiest choice.

The EVA 8000 actually maintains enough performance to meet the latency requirements of Exchange 2003 in a single 120-disk array group. If you wanted to create additional isolation you could create a different vDisk for logs rather than another disk group. It all depends on the log transaction rates, mailbox sizes, items per mailbox, etc.

Since you didn't provide any performance attributes for additional workloads I might build a disk group to isolate Oracle & SQL workloads with vDisks for data and log space in a disk group.

But then again all of this is theoretical...

Chuck Hollis

Thanks, John, I was starting to doubt my understanding of the EVA a bit there ...

If you want "hard" IO isolation, you create disk groups. Or additional availabilty, per HP's documentation.

If a given disk group is presumed to have enough IO and deliver short enough response times for all applications, carve it with vDisks for different apps. No IO isolation here, though ...

If not, additional disk groups are called for.

And that's where it gets subjective, no?

Thanks!

Bibble Chubber

Chuck, you make me laugh!

Joe

Hey Lee,
Based on the system I have, using defaults, when I create a 16-drive RAID-DP group using 500GB SATA drives I only get 4.07TB usable. Do the math. That's about 50% of "raw" that is usable, before even considering spares. That sucks. NetApp is pathetic.
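
For anyone wondering where that 4.07TB lands, here's a rough sketch of the arithmetic. The right-sized capacity and the reserve percentages are assumptions about typical defaults of that era, not vendor-published numbers for this exact box:

```python
# Rough sketch of where the capacity goes in a 16-drive RAID-DP aggregate of
# 500GB SATA drives. Right-sized capacity and reserves below are assumptions.
DRIVES, PARITY   = 16, 2
RIGHT_SIZED_GIB  = 413     # assumed right-sized capacity of a "500GB" SATA drive
WAFL_RESERVE     = 0.10    # assumed WAFL/aggregate reserve
VOL_SNAP_RESERVE = 0.20    # assumed default volume snapshot reserve

data_drives = DRIVES - PARITY
usable_gib = data_drives * RIGHT_SIZED_GIB * (1 - WAFL_RESERVE) * (1 - VOL_SNAP_RESERVE)
print(f"~{usable_gib / 1024:.2f} TiB usable from {DRIVES} x 500GB drives")
# With these assumptions: ~4.07 TiB, i.e. roughly half of the 8TB raw.
```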

As for "If you're going to use RAID-DP on a NetApp then you either need to use RAID-1 on the CLARiiON (and sacrifice half the capacity) or RAID-6 (and sacrifice 60% of the write performance)" are you joking? Seriously, NetApp NEEDS RAID DP because the drive rebuilds are SO long. How long does it take to rebuild a 1TB drive? on a NetApp box? Your best practices say to use RAID DP for any drive greater than 146GB. Are you kidding me? That's ridiculous. You are the ONLY company that demands "RAID 6" protection for FC drives. What's wrong with your architecture? Could it be that WAFL is REALLY old and just doesn't work well in a FC/iSCSI SAN?

And please don't reply with "We have Veritest reports showing how fast we are". Everybody knows they were baked.

John Ashton

You nailed it.

"If a given disk group is presumed to have enough IO and deliver short enough response times for all applications, carve it with vDisks for different apps"

Then again if you are meeting your target do you really need isolation? Think of it this way:

You have 120 146GB drives.
Select your sparing and RAID level to subtract the overhead.

Now you decide you need some IO isolation. How would you do this on a traditional array?

String together disks into a LUN. Let's say 16 146GB drives. Dedicate your spare disk in the LUN (leaving 15) and off you go.

Run into a performance problem and what happens?

You could always re-build your stripe to more closely match your traffic; after all, most people grossly over- or under-estimate performance. But hey, that's messy and takes a lot of time. In most cases we take the easy way out and add spindles.

Now can you add more disks to that single LUN?

Maybe, maybe not.

If you can, great. If not, now it's time for another LUN and some sort of concatenation software, presentation, laying down volumes and/or file systems, etc. Basically all the stuff we hate to do and hand off to server admins if we can help it...

Now consider this...

That single LUN has a performance boundary, right? 16 drives. It is only going to be as fast as those 16 drives spin - ever. And by the time you add all the pesky stuff mentioned above, you've managed to slow it down...

Well what if you didn't have a LUN limitation? What if you could make a LUN as big as your array? How fast would it be?

Using the same laws, it's just as fast as the physical disks. So if you have 120 disks, that is as fast as it will ever get.

Here's the bottom line:

Not using disk groups does not mean you don't have isolation. It simply means that isolation is linked to the array rather than groups of disks.
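
A trivial sketch of that ceiling, using an assumed round number for per-drive IOPS (the exact figure doesn't matter; the point is that the bound scales with the number of spindles behind the LUN):

```python
# The "performance boundary" point above: a LUN carved from a fixed 16-drive
# group can never exceed those 16 drives' worth of IOPS, while something
# striped across the whole pool is bounded by the whole pool.
PER_DRIVE_IOPS = 180   # assumed round number, not a measured figure

def iops_ceiling(drives, per_drive=PER_DRIVE_IOPS):
    return drives * per_drive

print("16-drive LUN ceiling  :", iops_ceiling(16), "IOPS")
print("120-drive pool ceiling:", iops_ceiling(120), "IOPS")
```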

Mike Shea

Chuck,

As a former EMC employee, and a current NetApp employee, I have to say that this post just has me wondering what has happened.

There was a day when, if a person 'in the know' (like Lee Razo above, on the topic of NetApp Snapshots and disk sizing) said something, you accepted the professional expertise and moved on.

That is what made you such a stellar EBC participant in the past. You used to stick with what was important and stay out of these crazy ratholes... or is it a minkhole now?

What happened to you sir?

Joe

"I want knowledgeable users to ask informed questions from their vendors"

OK.
-- What happens when a vault drive fails?
-- What happens in the case of a double drive failure (0,0 & 0,2 or 0,1 & 0,3)?
-- What happens if 0,0, 0,1 & 0,2 all die. Rare, but what happens?

Calvin Zito

Thanks Chuck for stimulating an examination of EVA Capacity Efficiency. We've posted a complete response on our blog at http://www.communities.hp.com/online/blogs/datastorage/archive/2008/08/29/emc-distortion-about-capacity-efficiency.aspx

Calvin

Ahmad

John,
I was a CE working with all the Digital and Compaq products. I had done a demo for a customer on an MA8000 (which I love) and he was extremely happy with the performance; when the customer bought an EVA, he flew into a rage...
Simply put: give me a decent configuration for a small EVA with 18-24 disks to run a small Oracle, a small Exchange and a small SQL Server.

The EVA is the right equipment for a shop that has no problem with downtime and enjoys easy configuration.

I remember that I had to bring an EVA8000 down to upgrade the disk drive firmware... The alternative was to remove each disk from the disk group and upgrade it separately (you know that this can take a month).

Please search your knowledge base and let me know how many times you lose your quorum data (which was kept in something like 16 copies when I was working with the EVA)...


BTW, I was ASE Storage

Ahmad

one more input in the subject,

Chuck's team was not right to choose 7 disk groups, as that is not recommended by HP training courses and best practice documents.

96-120 disks in a disk group is the optimum number (according to one of the HP fellows, if I recall correctly).

BUT, I am wondering how much time such a disk group needs to re-level after a failure or after adding new disks. How will this impact availability, which Chuck raised as a concern?

http://forums12.itrc.hp.com/service/forums/questionanswer.do?admit=109447627+1220062834862+28353475&threadId=990314

MarinaG

Hmmm... so it's OK for you to be snide, insulting and sarcastic, but you're insulted when posters respond in kind? If you make provocative statements, you should expect provocative responses.

As a VP I'd think you'd have a thicker skin. Or be less defensive.

Take the high road Chuckles. :-)

Cheers!


Chuck Hollis

MarinaG

There are different standards of conduct in the real world; for example, some people are completely socially inept.

The same holds true for online interaction.

I do not insult people, belittle them, call them names, or personally attack them.

I wish other people would do the same.

Chuck Hollis

So, everyone, regarding our HP numbers, the real issue is EMC's use of too many disk groups in the configuration.

We were led to believe from HP documentation that if you wanted performance isolation and availability isolation between multiple applications, you needed multiple disk groups to achieve this.

A few commenters agreed with this interpretation. HP does not, though.

If we've configured the EVA inappropriately, our numbers are wrong, and we need to go fix that.

I'll get back to you soon -- thanks!

Geert

Would you mind publishing links to the best practice guides you seem to have taken your data from? You happily ask others to provide these, but for some reason you've failed to do the same for your own (I'm not a native English speaker, so I don't remember that saying about the pot and the kettle... sorry).

And as for accurate reading (and presenting), here's a correction on your hot spare calculation for FAS (taken from the NetApp Storage Best Practices and Resiliency Guide you mention, to be found here: http://media.netapp.com/documents/tr-3437.pdf); it states the following:

- Maintain at least *one* hot spare for each type of disk drive in the storage system
- Maintain *two* hot spares for each type of disk drive in the storage system to take advantage of Maintenance Center.
- NetApp recommends using two spares per disk type for up to 100 disk drives. For each additional 84 disks above that, another [*not* two] hot standby disk should be allocated to the spare pool. In a 364-disk configuration this means ~5 hot spares, *not* the 10 you mention [check the table on page 13 of the guide; see also the quick sketch below].
- Aggregate snapshot creation can be disabled and the reserve lowered, unless a MetroCluster or SyncMirror is used, which is not the case in your example.
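
To make that spare-count rule concrete, here's the arithmetic as I read it (the guide's table on page 13 remains the authoritative source):

```python
# Spare-count rule as described above: two hot spares per disk type for up to
# 100 drives, plus one more for each additional 84 drives.
def hot_spares(total_disks):
    if total_disks <= 100:
        return 2
    return 2 + (total_disks - 100) // 84

for n in (100, 184, 364):
    print(f"{n} disks -> {hot_spares(n)} hot spares per disk type")
# 364 disks -> about 5 spares, not the 10 assumed in the original post.
```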

That's it for now. I won't even go down the path of dissecting all the other inaccuracies in your "independent" comparison, nor will I go into all the other space savings, utilization, performance and data protection functionality FAS systems bring to market; I'll save that for my chats with customers and prospects.

egenda99

Chuck,

I think the true answer with NetApp is more like 399 drives for 17.5 TB usable. Quick point - 144 GB NetApp drives deliver a right-sized 119 GB usable (some bloggers say up to 125 GB usable). I am pretty sure that the 100% snap reserve is applied AFTER the parity calculations and the aggregate and snapshot (10%) overhead. So you need to get to 35 TB "raw/usable" to get to 17.5 TB "actual/usable". It takes about 399 drives @ 119 GB with 14+2 parity and a spare every 84 drives to hit your 17.5 TB, which is 120 @ 146 GB. Oh yeah, you now have to manage 3 volumes/aggregates since you maxed out the 16 TB (RAW) limit.

pete mitchell

Chuck,

You're such a tool.

Chuck Hollis

Egenda 99 -- thanks for the insight, we'll correct things for the future.

Patrick

Hello Chuck,

mmhh... Do not trust any statistics you did not fake yourself. ;) Did you read the best practice guide for the EVA, or did you only post the link to it? An EVA with 7 disk groups is far away from reality. If you create a disk group for every application, you lay a traditional RAID scheme over a system that isn't designed for it. If you create more disk groups, it's clear that you waste a lot of space on disk protection. But this kind of disk protection is much better than traditional spare disks -- spare disks are a waste of space! If the EVA has enough space left in the disk group, it first uses the free space in the disk group, and then the reserved space (the space set aside by the disk protection setting). The 90% occupancy level is overly cautious; I often use 95% and it's safe. You should create as few disk groups as possible. It's no big deal to run Oracle and Exchange within one disk group, even in large environments. It is obvious that fewer disk groups with more disks are the best way: fewer disk groups means more disks in each disk group, which means more performance for the application and less space wasted on the protection level.

So next time you'd better read the docs, instead of just posting a link to them. ;)

Best regards.

Chuck Hollis

Hi Patrick

We now realize that 7 disk groups are perhaps excessive, but -- based on our interpretation of HP's recommendations, and realistic concerns about performance and availability isolation -- the right answer might be more like 3 or 4.

We'll be updating things shortly. Thanks.

Martin G

Just a quick post: I recently had some sizing done which was for pure capacity -- how many spindles I would need to give me 400 terabytes of usable space (base-2 terabytes) on a FAS. To give me 400 terabytes of usable disk using 1-terabyte drives would take 834 spindles, made up of

689 data spindles
116 parity spindles
29 hot spares

coming to 834 spindles

which was 60 shelves of 14 disks.

This sizing was done by the NetApp account team. It was a pure capacity exercise: it took no account of the underlying performance requirements of the application, no account of snapshots, no account of dedupe -- simply how much data I could get on an array.

I wonder if the other vendors following this could carry out a similar exercise. 400TB of data on 1 Terabyte drives and I have an availability requirement of five nines (so if you think RAID-5 on 1 Terabyte drives is going to cut it...I might look askance).
