The recent Sidekick brouhaha has been an object lesson, putting a sharp spotlight on this topic.
Much has been said on this topic, so I won't recap the obvious. The online service had a really bad day.
Much finger pointing, lessons all around.
However, as we debate various cloud models, a key aspect between public and private clouds has been highlighted here, and that's around notions of control, transparency and accountability.
Simply put, enterprise IT is always accountable for what happens -- whether it's in the data center, or in the cloud somewhere.
The Quick Recap
I really don't want to replay the story, but I have to. During an ostensibly routine infrastructure upgrade, a lot of personal data was lost. And there was no ready backup at hand.
Are you cringing yet? Most IT professionals wince when they hear stories like this.
Maybe there was a backup routine in place. Maybe it didn't get done. Maybe the backup wasn't usable. We may never know.
Now, Shift To Enterprise IT
Within a data center, IT operations is responsible for making sure that (a) backups get religiously done per agreed policy, and (b) they're usable on short notice if needed, especially if a lot of bad stuff is happening.
Doesn't matter whether the application runs in the data center or at a service provider -- IT is still responsible for the end result. Period.
Is it acceptable to blindly accept that your service provider is doing this as agreed? Perhaps not ...
The Importance Of Transparency
If I was an IT administrator using an external service provider for my important applications, I'd like to be able to externally monitor that backups were getting done as agreed -- using my own tools.
I'd like to be able to independently verify that the backups are usable -- using my own tools.
And I'd like to verify that someone hadn't made a bonehead configuration mistake, like putting the backup target data on the exact same storage array as the source data, as is the case with most kinds of snaps. That ain't a real backup, in my book.
I'd be OK with the service provider actually doing the backup work on my behalf -- as long as I had complete transparency into what was being done, and how it was being accomplished.
Not in a high-level "trust me" kind of way, but in a manner where I could directly observe the low-level detail if needed.
That's what I can do when applications run in my data center; that's what I would expect if they ran in an external service provider.
Now, I could choose to ignore all that detail, if I wanted to. But it'd be there if I thought there was a concern.
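To make that "verify with my own tools" idea concrete, here's a minimal sketch of the kind of independent check I have in mind. It assumes nothing about any particular provider or product -- just that I can get at a backup's timestamp and pull back a restored copy to compare against the source. The function name and policy window are illustrative, not any vendor's API:

```python
import hashlib
from datetime import datetime, timedelta


def sha256_of(data: bytes) -> str:
    """Content fingerprint used to compare source and restored copies."""
    return hashlib.sha256(data).hexdigest()


def verify_backup(source: bytes, restored: bytes,
                  backup_time: datetime, max_age: timedelta) -> list:
    """Return a list of problems found; an empty list means the backup checks out.

    This is a sketch: it assumes the source hasn't changed since the backup ran,
    so a checksum mismatch really does indicate an unusable backup.
    """
    problems = []
    if datetime.utcnow() - backup_time > max_age:
        problems.append("backup is older than the agreed policy window")
    if sha256_of(source) != sha256_of(restored):
        problems.append("restored data does not match the source")
    return problems
```

The point isn't the checksum math; it's that the check runs on my side of the fence, against artifacts I pulled myself, rather than trusting a green light on the provider's dashboard.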
It's Not Just Backup
I think this line of "transparency" thinking can be extended to just about every other discipline where IT is held accountable: security, performance, compliance, licensing, etc.
The "trust me" default relationship we see in so many cloud and service provider models just won't cut it for many enterprise IT organizations being asked to trust their business to the cloud.
This is stuff you can get fired over if it all goes bad.
IT will need to be able to probe, audit, interrogate, monitor, etc. these outsourced operations in much the same manner as they do with their own infrastructure today.
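As one illustration of that auditing posture, here's a sketch that cross-checks a provider's reported backup jobs against my own policy. The report format (a list of records with `system`, `status`, `source_array`, and `target_array` fields) is entirely hypothetical -- no real provider API is implied -- but it captures two checks from the discussion above: did every system I care about actually get backed up, and did anyone land the backup on the same array as the source?

```python
def audit_provider_report(jobs, required_systems):
    """Flag missing, failed, or same-array backups in a (hypothetical) provider job report.

    jobs: list of dicts like
        {"system": "crm", "status": "success",
         "source_array": "A1", "target_array": "A2"}
    required_systems: set of system names that policy says must be backed up.
    Returns a list of human-readable findings; empty means the report passes.
    """
    findings = []

    # Check 1: every required system has at least one successful job.
    succeeded = {j["system"] for j in jobs if j["status"] == "success"}
    for system in sorted(required_systems - succeeded):
        findings.append(f"{system}: no successful backup reported")

    # Check 2: no backup lands on the same array as its source
    # (a snap on the source array isn't a real backup).
    for j in jobs:
        if j.get("source_array") == j.get("target_array"):
            findings.append(f"{j['system']}: backup lands on the same array as the source")

    return findings
```

The same shape of check extends naturally to the other disciplines mentioned above -- swap backup jobs for access logs, performance samples, or license counts, and the "trust but verify" loop is the same.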
IT Control and Private Clouds
One of the key concepts of private clouds is the notion of control: IT has the option to remain in control -- not the service provider.
Sure, the service provider is responsible for whatever they've committed to -- but IT is capable of monitoring that the work is being done, and being done correctly.
Trust but verify.
Implications For Management And Security Frameworks
This implies a certain level of architectural thinking around "control planes" for fully virtualized environments where some pieces might be run internally, and some externally.
We'll need to think in terms of federated management models that allow the "rented" part of our infrastructures to be managed, monitored, inspected, etc. -- regardless of physical location.
And we'll need federated security frameworks that do much the same thing.
Not to do a blatant product dive, but I would argue that you'll see these exact same themes in EMC's Ionix and RSA portfolios respectively: federated management and federated security.
In addition, we'll also need to see service providers who are willing to "open up" virtualized portions of their infrastructure to be under the control of the enterprise IT organizations that they serve as customers.
Frankly, there aren't a lot of those in the market today -- but I'm betting we'll see more of these kinds of offerings before long.
The Bottom Line
Enterprise IT organizations won't be big users of any cloud model unless they can trust it. And, if they're experienced IT operators, they won't trust what they can't see.
Is transparency the new table stakes for service providers who want a piece of the enterprise IT market?
We'll see ...
Chuck:
Great observations about trust in the cloud. I would also submit that CSC's Trusted Cloud Orchestration strategy is another viable means of achieving the desired degree of visibility into the operation of cloud services. Our approach is documented here:
http://assets1.csc.com/cloud/downloads/Knode_digital_trust_in_the_cloud_final_090709b.pdf
The author, Ron Knode, is active in cloud standards development, with a particular emphasis on IT security (although CSC's cloud orchestration vision is a superset of IT security, audit and compliance functions).
CSC's vision of this "Trusted Cloud" is summarized as follows:
"What we desire is nothing less than a trusted cloud – i.e., a cloud that harmonizes the security for transactions and data with comprehensive transparency of control and result, such that it conveys evidence-based confidence that systems within its environment operate as advertised, and that no unadvertised functions are occurring."
It is a pretty cool vision and I am heartened by our progress in instantiating key components of it as we develop our Orchestration Core.
Thanks again for an interesting and timely posting.
~Randy
Posted by: twitter.com/randydarthur | October 15, 2009 at 07:42 PM
I believe the Sidekick debacle is clear proof of the need for an industry-agreed audit/certification and compliance process in order for companies to offer "Cloud" based services. I discussed how this might be approached in my backup blog at http://bit.ly/4xInTt
Posted by: Preston de Guise | October 17, 2009 at 03:40 AM
Chuck, I've been part of a cloud sausage making operation at a top 5 server company, and I wish you all the luck in the world with your control planes. But I seriously doubt you'll make much headway. Until then, I'm sticking to steak, or chicken, or even a pot of beans if I have to, but I've sworn off sausage for all time.
Posted by: puff65537.livejournal.com | October 17, 2009 at 03:59 PM
I have a friend who used to work for some big outsourced tape company early this decade, I think they were called "Storage Networks"; the company is long since defunct. But they were big enough that I recall him mentioning they had a dedicated DS3 between their facility and Microsoft for the backups, for example. They had lots of really big libraries, lots of high-end gear, lots of big SLAs, etc.
He said the back end was just a disaster, they were lucky if they could get 50% of their mandated backups done at any point in time, they routinely could not restore data that customers asked for, even their big customers like MS.
I don't know the precise reasons that drove them under....
Two jobs ago we used a company to take tapes off site to a vault of some kind. It's a good company, mostly reliable. Though you had to stay on top of them -- they were caught many times not picking up the tapes at all. We used the same company again at my last job with similar results, though not as frequent. Not sure what the deal is, but if you stay on top of them they do get it done. Planning on using the same company this time round too....
At the same company (two jobs ago), at least in the early days, our "backup" process for Oracle included invalidating the log files on a nightly basis. So while we could get the data back, if we actually wanted to play a log against a production DB there was no chance of that.
Several jobs ago (laid off in 2002 from this gig) I set up a tape rotation scheme and off-site data replication between three offices, so data was always stored at two different sites. All using pretty basic tools (rsync for replication, BRU for tape backup). I was told after the fact that my manager at the time on several occasions deleted data and asked me to restore it, just to see if I could do it. Usually I could. There were of course times that I could not (data was created & deleted before a backup ran, or something).
You said it Chuck, I want to be able to verify the backups are getting done with my tools.
In the process right now of finalizing my current company's off-site backup scheme. Our data layout is so complex that no off-the-shelf software will work right; I have to gather the data from the various sources and stage it for the tape software, as we don't want to back up full volumes -- we only want a small subset of our data (~1.5TB compressed vs ~80TB on the array). De-dupe doesn't work (well) for our data set either.
Posted by: nate | October 19, 2009 at 07:59 PM