
February 03, 2011

Comments

Mike Richardson

Chuck, I think you need to open your eyes a bit to see where the future of data protection is going. Full disclosure: I work for CommVault.

1) There are far more data failure scenarios than just array failure. The VAST majority of failures are user-induced (file deleted accidentally), application failures (database bug), or malicious attacks (viruses). Snapshots protect against all of these.

2) Array failures are extremely rare and are easily covered with replication. In my five years working as a Professional Services Consultant at NetApp before I joined CV, I never once saw an array failure that compromised data or snapshots. Perhaps EMC sees them more often.

3) Recoveries from snapshots are near instantaneous. How long would it take to rehydrate and recover 100TB from a DataDomain? What, no published restore speeds? Copy-back restores seem solid until you have time to watch the entire series of LOST several times before your restore completes. What good is a restore if your business dies before it completes?

Snapshots and replication provide an excellent recovery tier for data protection. Fast backups, fast restores and array protection through replication. Copying data back as though it were 1980 doesn’t work today and surely won’t scale in the future.

There will be a place for copy-out data protection, but it will be for long-term compliance restores, where the wait is acceptable, from highly compressed and deduplicated repositories. Primary storage is too expensive to maintain years of snaps.

It’s all about using the right data protection method for the right problem. CommVault makes this easy. If my business depended on it, I would much rather recover from a snap. My customers agree.

SRJ

Agree wholeheartedly with your main point. Semantics are sometimes silly to debate, but you got the point across. Is a local snap a backup? Well, yes and no...there's nuance to the answer as is almost always the case in IT.

The better solution:

Make heavy use of those local snaps, then replicate all those snaps to a separate physical box...keep 'em in both places if you like. Best to do that in a super-efficient manner...a la NetApp SnapMirror and/or SnapVault.

Chuck Hollis

Mike -- thanks for the comment, and the disclosure.

If you're familiar with EMC's portfolio, you probably are aware that we've got boatloads of snap technologies in our portfolio. It's a tool that's very familiar to all of us. If a customer's use case dictates a snap-based approach (either alone or in combination with other technologies), we do that.

However, we're fortunate in not having to take the "one size fits all" position that you're espousing.

For all you following along at home, Mike illustrates many of the points made in my post.

Including the unsaid "caveat emptor".

-- Chuck

Chuck Hollis

SRJ -- thanks for the comment. If I were you, I'd resist the temptation to pimp your products on competitors' blogs.

Very poor form, in my opinion.

-- Chuck

David A. Chapa

Chuck: Good blog post, for many years it is something I've talked with customers about - what is a backup?

What I think is important is not to blur the lines between what is an operation or a process and what is a best practice.

What you're asking in this blog is a great question. As you may or may not know, I've been in backup/recovery for over 20 years now, and I would say that the process of making a copy (as you describe in your blog) does constitute a backup.

Many reputable sources, including SNIA and Disaster Recovery Journal support similar definitions of a backup. I've included SNIA's here:

------
backup

1. [Data Recovery] A collection of data stored on (usually removable) non-volatile storage media for purposes of recovery in case the original copy of data is lost or becomes inaccessible; also called a backup copy.

To be useful for recovery, a backup must be made by copying the source data image when it is in a consistent state.
------

So by the SNIA definition, a copy to a folder on your laptop that you call "backup copies" constitutes a backup.

That defines the process or operation - if I were a consultant and was asked by my client if this would be an acceptable mode of operation to support their recovery requirements, I would have to answer that it depends on their tolerance of risk - how much are they willing to accept?

There's an entire line of thinking I would take customers down as they assess their data protection strategy, but suffice it to say the following three items form the basis by which that question may be answered.

1. Application data business value
2. The impact to business should this data, app or system be unavailable for x # hours (a pre-determined time)
3. Tolerance for downtime (how much risk the organization is willing to take on)

Truthfully, many companies who are serious about their data protection plans go much deeper than what I've just touched upon and they go much further with their data protection than just making a copy to their local disk. They would either replicate (via host based or array based approaches) those local copies to another repository or leverage a backup application to move these local copies to some other medium other than the local disk.

So to answer the basic question of what your blog poses, I would say that yes that is a backup.

What are Backup Best Practices? That's really the question.

Great thought provoking blog...

David A. Chapa
Sr. Analyst, ESG

Chuck Hollis

David

Thanks for the commentary. And, like you, I can't quibble with the SNIA definition. It is precise, but is it useful?

More to your point, there might be a need to come back and remind people what those backup best practices might be.

Maybe it's a generational thing, but there are many lessons around this important discipline that have been learned over the decades, and perhaps the new practitioners haven't always had the opportunity to learn from the past.

Plenty of fertile ground for you there at ESG!

-- Chuck

Paul P

Hi Chuck,

You are making an invalid point,
It's like FUD - you are making this up...

The fact that NetApp's replication and vaulting tools are popular at customer sites, and are included in any standard presentation to the end user, contradicts your claim. The backup section under data protection on the NetApp web site demonstrates this.

Additionally, I've never seen nor heard of any customer (NetApp user or otherwise) remove a backup system just for the sake of local snapshots...

Local snapshots have only ever been promoted as an additional layer of protection or enhancement to the backup, not too dissimilar to disk in D-D-T solutions.

Now you can argue if you want that local snapshots are not a backup, that's fine, but the restore process has essentially always been part of/associated with the backup process. Neither can operate without the other.

Back to my original point: do you have any publicly available, NetApp-published material that states that A) the local snapshot is THE backup, or B) backups/copies are not required when using local snapshots? Because I've never seen it, and NetApp customers have never seen it, which leaves me wondering - who has, apart from yourself?

Disclosure - I work in the storage industry but do not currently represent NetApp products.

Dan Leary

Hi Chuck,

Disclosure: I work for Nimble Storage.

Interesting blog - I think this is a topic worthy of further discussion within the community.

Snapshots, when available, are an ideal way to quickly restore data in the most common recovery scenarios. Replicating data to another offsite array is an important component of the strategy to protect against array or site failures, which are fortunately much more rare. But I think you glossed over one key point: there are many significant benefits from this approach, as Mike described above.

And it's not a one size fits all solution: copy-based backups often still play a role in data protection. But if the cost of snapshots can be dramatically reduced so that weeks or months of recovery points are available, it opens up a whole new world of possibilities for organizations looking to simplify their data protection & recovery needs.

--Dan

Chuck Hollis

Paul P:

Please go re-read my post, and make sure there's at least some sort of logical connection between what I write and your comments. I didn't mention a single vendor by name in my post.

Consider switching to decaf.

-- Chuck

Chuck Hollis

Dan:

Please go re-read the post more carefully. As stated before, local snaps are a useful tool. But -- by themselves -- I think they are sometimes oversold as a "backup solution".

-- Chuck

dominic cody

Chuck,

Great post as usual, and a very thought-provoking topic judging from the comments so far. Although, as you have said, it does appear that some either haven't read the post correctly, have misunderstood it, or are just promoting a vendor with their comments.

Disclosure - I work for HP currently but am from an end user background 10 years as a Storage Architect.

Totally agree with your comments, people always seem to get confused between a copy and a backup. Another point for confusion is backup and archive but I won't go there today.

Timefinder - EMC
Snapmirror - NetApp
Backup to disk on the same array as the data or even same disk group which I have seen people do.

These are copies and are not backups. If you replicate snaps or disk backup images to a 2nd array and site, then this gives you more comfort. But is it still a copy? Yes, it is a copy of a copy for speedy recovery. Is it for a DR scenario, say you lose your array? Definitely not.

You always need a backup; never rely on a copy. I would never stand up a timefinder/snapmirror solution for a customer and tell them it will meet their RTO/RPO on its own.

I may say a 1st hit would be the copy but then at some point that copy has to either go to tape or a disk unit (data domain- hell there I go again product naming a non HP product, 2nd time this week on your blogs)

Does my comment make any sense? Hope so; it's late in the UK and I spent the evening in hospital with my 2yr old daughter, so I'm slightly tired.

Dominic

Chuck Hollis

Hi Dominic

I like your differentiation between "copy" and "backup". All backups are (point in time, consistent) copies, but not all copies are necessarily backups.

Thanks for the comment ...

-- Chuck

SRJ

Chuck - poor form? Really? Was it not obvious that the two features I mentioned were only mentioned because you slyly chose to stay narrowly focused on your red herring argument to make it seem to readers that those "other" vendors (who can actually take advantage of snapshots on their gear) don't really care as much about the user's data as good 'ol EMC does?

I'm not pimping anything on your blog, thank you very much. My only purpose here is to call "BS" when I see it.

Unrelated question for you: You're always busy telling people who disagree with you to "lighten up" or something to that effect... Why do you stoop, then, to cheap insults like telling another commenter to "consider switching to decaf"? I don't get it... And I'm pimping a product just because I mention something relevant and germane to the topic at hand, and in direct response to an assertion you yourself made?? I even did it graciously, conceding that I agreed with your main point! I don't know man...maybe take some of your own advice?

Chuck Hollis

Hi SRJ -- this is your last chance to keep the discussion more civil, otherwise I'll just start deleting your commentary as inappropriate.

#1 -- most storage vendors do local snaps, including EMC. EMC, in particular, has been integrating local snaps into backup and DR workflows for over a decade. But we don't consider a local snap a "backup" for most situations.

#2 -- a lot of people who work for competitors feel they just have to mention their product. It's annoying.

#3 -- you're free to comment on how I respond to comments. If you don't like it, don't read it.

Now that you've got me going, I understand the following to be true about the NetApp "backup solution".

I put the term in quotes, because I don't believe it's either (a) backup, or (b) a solution. Please correct me if I'm wrong in any regards.

* "backups" (i.e. local snaps) on a local array are limited to 255. If more are required, they must be moved to a different array.

* should someone do a restore from a mid-point (e.g. a point in time recovery), all snaps (backups) from that point forward are invalid and essentially useless.

* there is no cataloging or workflow software. Admins are responsible for remembering what is what, and where is where. Sort of like using the "tar" utility to do backup.

* NetApp charge$ for specific software to restore data, as well as charge$ for software when more than one array is needed.

* achieving short RPO/RTO is tough, since all data has to wait for dedupe before it can be moved or replicated off the primary array.

* if the customer is already using backup software (e.g. Tivoli, Networker, etc.) that software has no knowledge of where the snaps are, how they're cataloged, restore process, etc.

There's more, but I just wanted to share what I had heard, and make sure I wasn't off base.

Thanks

W. Curtis Preston

I have to disagree with a few comments. All copies made for the purpose of restore are indeed backups. Just because you didn't take it offsite doesn't make it not a backup. Yes, you should ALSO have a copy on different storage in a different location. But that doesn't stop the first copy from being a backup. But anyone who ONLY has that kind of backup is an idiot. ;)

@Chuck

You're seriously trying to say that this post wasn't aimed squarely at NetApp? They're EMC's only large disk competitor selling the idea of snaps as a backup.

No vendor is promoting what you're saying is bad (the idea that you can have only a local copy and be OK). Your suggestion that local copies are deletable is also not valid for any snapshot-based storage vendor. They're read only for all but the high and mighty admin.

As to your comments about NetApp's solutions in the previous comment, I will only comment on the ones that I'm sure are incorrect. (There are some others that I'm PRETTY sure are incorrect, but I don't want to create more misinformation.) I do this only to set the record straight, not to defend NetApp. That's not my job, although it seems to happen a lot when I comment on your blog. ;) Remember I don't work for them or any other storage vendor.

* There is no catalog because you don't need one. The purpose of a catalog in backup software is to allow you to browse and search against what's on tape. You can do that with any snapshot solution without needing a catalog. If you want more than that, you have two options. You can use any indexing solution that works with a filesystem; there are many. OR you can see my last answer below about using your backup software to do this.

* NetApp only charges extra to restore in a fancy way, such as doing magic, behind-the-scenes moving of changed blocks to make a really fast restore. Basic restores (copying good copy of a file over a bad copy of a file, like what you would do with backup software) are free. As to the multi-array thing, last time I checked EMC charged per array for their software, too.

* Data does not have to wait to be deduped before it is replicated. You can do dedupe before/after/during replication. Doing it before would reduce the amount of data you would have to replicate and doing it during could hurt your performance. Doing it after is fine too. You just end up replicating standard changed deltas and then deduping them on the other side.

* Most major backup products (including NetWorker, I believe) have been able to control/manage AND catalog what's on snapshots via NDMP.
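The dedupe ordering point above can be illustrated with a toy model. This is a rough sketch only, not how any vendor's replication engine works: the `replicate` function and the block labels are made up for illustration, and it only counts blocks, assuming the target ends up deduplicated in both orderings.

```python
def replicate(changed_blocks, dedupe_first):
    """Toy model: return (blocks sent over the wire, blocks stored at the target).

    Hypothetical helper for illustration only; real arrays work on
    fingerprints and changed-block deltas, not Python lists.
    """
    # Order-preserving removal of duplicate blocks.
    unique = list(dict.fromkeys(changed_blocks))
    # Deduping first shrinks what is sent; deduping after sends the raw delta.
    sent = unique if dedupe_first else list(changed_blocks)
    # Either way the target holds only unique blocks once dedupe has run.
    stored = unique
    return len(sent), len(stored)


delta = ["a", "b", "a", "c", "b", "a"]  # changed blocks, with repeats
dedupe_before = replicate(delta, dedupe_first=True)
dedupe_after = replicate(delta, dedupe_first=False)
print(dedupe_before, dedupe_after)
```

In this model both orderings leave the same data on the target; the ordering only changes how much crosses the wire, and replication never has to wait for dedupe to finish.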

I again say that people representing vendor A should not comment about how vendor B works. Rarely do they have their facts straight. This happened back when EMC guys used to bash Data Domain on your blogs. Then once you bought them you found out that you were all wrong about how they worked.

Chuck Hollis

Hi Curtis

Generally speaking, I think you're an OK guy, but when you start pontificating ...

This post was not aimed at any specific vendor. Any correlation between the content here and specific vendor(s) is solely at the discretion of the reader. Certain vendors chose to jump in, though.

Specifically to your points:

"There is no catalog because you don't need one". You're right, technically. But the vast majority of moderately complex environments would benefit greatly from a catalog with specific workflows built around it.

I think we'd both agree that it's the efficiency and effectiveness of the workflow process that really matters at the end of the day, rather than how it's specifically implemented.

"Data does not have to wait to be deduped before it's replicated". Theoretically, of course. But, as you know, there are specific implementations out there where that's required, and -- if these vendors are using a post-dedupe process -- there's inevitably a delay involved.

As far as vendor A bashing vendor B and not having facts correct, I could also argue the same around "independent consultant" A and vendor B. Rarely is the discussion black and white from *any* source, there are always nuances and opinions in play.

At the end of the day, we might both agree that more transparency around tradeoffs and risks between different approaches is good for customers and the industry at large.

And that's what I'm shooting for.

-- Chuck

W. Curtis Preston

I think you're an OK guy too, but when you start talking about competing solutions....

I think my only pontification was that you (and everyone else like you) should never speak in your blog or on stage about how a competitor's product works -- ever. If that makes me not an OK guy, then I'm fine with it. FWIW, everyone I know who ISN'T a vendor rep (and many who are) would agree with that statement.

Yes I agree that a catalog can be helpful, and I told you how to get one.

No, I've never encountered a situation where dedupe of primary data is necessary before replication, and for the life of me cannot think of a scenario where that would be the case.

I'm not going to say that I can expound upon the virtues of any vendor better than they can themselves, but... The difference between vendor A talking about vendor B and me doing the same (I can't say for all) is that I've actually used vendor B's products at a customer. I've deployed and used NetWorker, Data Domain, and TimeFinder, so I can speak to how they work. And the same is true of NetApp's solutions in this case. But I only comment on what I know based on experience. This is why I did not comment on every one of your comments.

Mike Riley

Hi, Chuck,

Mike Riley from NetApp here. Good blog post. I do appreciate a hard look at what constitutes a good backup/DR practice for customers.

Since I work for NetApp, I appreciate Curtis' points about commenting on how a specific vendor's product works if you don't work for that vendor. Inevitably, we just get something wrong when we attempt to comment on the plumbing. So, a couple of NetApp plumbing nits:

* The snapshot limit of 255 is per volume, not per array. You can have 500 volumes per controller for a total of 127,500 snapshots/controller.

* Large customers (Telcos, Finance, Service Providers) keep 10,000 - 70,000 snapshots per controller and on average 80 - 230 snapshots per volume. The vast majority of these customers use a backup product to catalog these snapshots. The vast majority make use of SnapVault and/or Snapmirror to replicate these recovery points either intra or inter-controller.

* Smaller customers also use backup products to coordinate and catalog snaps but, with smaller installations you can also make use of simple and free indexing tools as Curtis points out.

* Single-file restores are free. Whole volume restores and application integration are not.

* No, you do not have to wait for dedupe to complete before replicating and it will not impact your RPO/RTO.
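As a quick check of the arithmetic in the first two bullets, here is a small sketch. It uses only the figures quoted in this comment (255 snapshots per volume, 500 volumes per controller); the constant and function names are made up for illustration.

```python
import math

# Figures as quoted in this comment; not taken from product documentation.
SNAPSHOTS_PER_VOLUME = 255
VOLUMES_PER_CONTROLLER = 500

max_snapshots_per_controller = SNAPSHOTS_PER_VOLUME * VOLUMES_PER_CONTROLLER


def min_volumes_for(snapshot_count):
    """Fewest volumes needed to hold snapshot_count snapshots at the per-volume limit."""
    return math.ceil(snapshot_count / SNAPSHOTS_PER_VOLUME)


print(max_snapshots_per_controller)
print(min_volumes_for(10_000), min_volumes_for(70_000))
```

Under these figures, even the largest quoted deployment (70,000 snapshots per controller) fits within the 500-volume ceiling.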

My opinion: snapshots are indeed backup images. (If I can recover from it, it's a backup image. A backup is only valuable as a function of your ability to restore from it). As an aside, if you have a device/array failure, you've moved from backup/recovery to a disaster recovery scenario. That's where off-site copies make sense.

What seems to be happening here is we're arguing that snapshots are not backups by arguing that they are not a complete backup/DR solution. Defining the details of a backup solution does not invalidate one of the details.

Chuck Hollis

Hi Mike:

Thanks for the corrections. It's hard to get accurate information about competitors' products sometimes, hence my disclaimer. I did note that you didn't address the point-in-time concern: using an intermediate snap invalidates all subsequent snaps -- is this a correct assertion?

I think we would both agree. Snaps make the whole process of data protection that much easier. But, by themselves, they rarely qualify as an enterprise-grade data protection solution.

-- Chuck

Paul Hargreaves

* "backups" (i.e. local snaps) on a local array are limited to 255. If more are required, they must be moved to a different array.

FlexClone the volume locally, then create hundreds more, rinse, repeat.

* should someone do a restore from a mid-point (e.g. a point in time recovery), all snaps (backups) from that point forward are invalid and essentially useless.

Depends on how you restore. For the 99% of recoveries that are LUN-based or involve a few files, then no. The only time this is true is if you decide to roll back an entire volume, typically millions of NAS files, and that is because it is decided that it is quicker to do them all in one go rather than individually.
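The distinction can be sketched with a toy model. This is only an illustration of the behavior as described in this comment, not NetApp's implementation: the `Volume` class and its method names are hypothetical, and snapshots are modeled as plain dictionary copies.

```python
class Volume:
    """Toy volume: a dict of files plus an ordered list of snapshots."""

    def __init__(self):
        self.files = {}
        self.snapshots = []  # (name, frozen copy of files), oldest first

    def snapshot(self, name):
        self.snapshots.append((name, dict(self.files)))

    def snapshot_names(self):
        return [name for name, _ in self.snapshots]

    def restore_file(self, snap_name, path):
        """Copy one file back from a snapshot; every snapshot is kept."""
        image = dict(self.snapshots)[snap_name]
        self.files[path] = image[path]

    def rollback_volume(self, snap_name):
        """Revert the whole volume; snapshots newer than snap_name are discarded."""
        idx = self.snapshot_names().index(snap_name)
        self.files = dict(self.snapshots[idx][1])
        del self.snapshots[idx + 1:]


vol = Volume()
for day, version in [("mon", "v1"), ("tue", "v2"), ("wed", "v3")]:
    vol.files["report.txt"] = version
    vol.snapshot(day)

vol.restore_file("mon", "report.txt")
after_file_restore = vol.snapshot_names()   # later snapshots still present

vol.rollback_volume("mon")
after_volume_rollback = vol.snapshot_names()  # only "mon" remains
print(after_file_restore, after_volume_rollback)
```

In this model, only the whole-volume rollback loses the later recovery points, which matches the case described above.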

The comments to this entry are closed.

Chuck Hollis


  • Chuck Hollis
    SVP, Oracle Converged Infrastructure Systems
    @chuckhollis

    Chuck now works for Oracle, and is now deeply embroiled in IT infrastructure.

    Previously, he was with VMware for 2 years, and EMC for 18 years before that, most of them great.

    He enjoys speaking to customer and industry audiences about a variety of technology topics, and -- of course -- enjoys blogging.

    Chuck lives in Vero Beach, FL with his wife and four dogs when he's not traveling. In his spare time, Chuck is working on his second career as an aging rock musician.

    Warning: do not ever buy him a drink when there is a piano nearby.

    Note: these are my personal views, and aren't reviewed or approved by my employer.
