I don't know about you, but I tend to study advertising intently.
In the consumer world, I was mightily impressed by the recent spate of "Old Spice" commercials. So was my wife -- she even bought me some Old Spice aftershave, perhaps with the hope of a magical transformation that didn't occur :-)
In the B2B space, my current hero is the UPS "That's Logistics" TV spot. They bring to light a vitally important -- but largely invisible -- infrastructure function that so much of our physical economy depends on. And it's a catchy tune as well :-)
I don't think it'll be too long before we collectively find ourselves focusing on the digital equivalent -- information logistics.
The Importance Of Moving Information Around
Broadly speaking, I think there are three major motivations for moving information from here to there.
First, there's cost optimization: moving information to a lower-cost provider, refreshing technology, shifting to cheaper media, and so on. The more information involved, the more compelling the potential cost optimization becomes.
Second, there's risk mitigation: having multiple copies of data, separated by distance, minimizes the risk of data loss or corruption. The more valuable your data, the more compelling risk mitigation becomes.
Third, there's improved user experience: having information (and applications) closer to users reduces network latency and potential bandwidth issues. The more responsive your user experience has to be, the more important this becomes.
There are three downsides to moving information around as well.
First, there are clear and well understood costs associated with moving information around. Whether it's bigger pipes, bigger storage, or a smart caching device so you need less of the other two, there's no getting around those costs -- just as there are costs to moving things in the physical logistics world.
Second, the volumes of information at play can be staggering. Terabytes are the norm, and people are planning for petabytes. The long-term rate of information growth appears to greatly exceed the long-term rate of bandwidth growth, creating the potential for an interesting forcing function (see the back-of-the-envelope sketch below).
Third, there are inherent risks involved in moving information around: security, compliance, service levels, etc. Just like in the physical world.
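To put some rough numbers behind that second downside, here's a quick back-of-the-envelope sketch. The link speeds, data sizes and efficiency factor are all illustrative assumptions, not anyone's measured environment:

```python
# Back-of-the-envelope: how long does it take to move a large repository
# over a WAN link? All link speeds, sizes and the efficiency factor below
# are illustrative assumptions.

def transfer_days(data_terabytes, link_gbps, efficiency=0.7):
    """Days needed to move data_terabytes over a link_gbps pipe,
    derated by an efficiency factor for protocol overhead and contention."""
    data_bits = data_terabytes * 8 * 10**12        # decimal TB -> bits
    usable_bps = link_gbps * 10**9 * efficiency    # effective bits per second
    return data_bits / usable_bps / 86_400         # 86,400 seconds in a day

for tb, gbps in [(10, 1), (100, 1), (1000, 10)]:
    print(f"{tb:>5} TB over {gbps:>2} Gbps: ~{transfer_days(tb, gbps):.1f} days")
```

Even with a fat pipe and generous assumptions, a petabyte-class move is measured in weeks, not hours. That's exactly the kind of forcing function I'm talking about.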
Analogies From The Physical World
My only deep dive into the world of physical logistics came over a decade ago -- I was exposed to the planning process EMC uses to move spare parts around the world. To put it mildly, it was pretty sophisticated back then, and is probably much more so these days.
If you think about it, when a part breaks, you want to have a replacement available as quickly as possible -- the quicker, the better. But not all parts break equally often. And some parts are more important than others.
As our customer footprint grew, and we had more storage platforms, things got more complex.
There are clear costs associated with storing parts close to customers (think depots and warehouses with inventory that needs to be continually refreshed) as well as transportation costs and associated latencies.
There are risks as well: shipments go missing, a needed part might not be available or might be down-rev in some fashion -- and in some parts of the world there are strict government controls on moving technology (especially storage devices) across borders.
Now, scale that for a multi-billion dollar storage business that tends to cater to customers globally who take their IT very seriously indeed. You can see the inherent optimization challenge.
Not a simple spreadsheet, folks.
And Onwards To The Cloud
It's hard to have any serious cloud discussion -- public, private, etc. -- without considering some form of information logistics. It's one thing when we're discussing cute little applets with a few megabytes of information attached. It's another thing entirely when we're talking about serious applications and serious information repositories.
Although there are some cool enabling technologies that help in various ways (I'm thinking VPLEX and Atmos in the EMC portfolio), those are essentially the transports, not the policy setters.
Atmos' object-based model is an excellent example -- it can support incredibly rich policies that can dynamically and geographically balance between cost, performance and risk mitigation.
But someone has to figure out what those policies need to be -- and that's where I think we're going to have some challenges.
Simply put, Atmos can "read the bar code" on the information object and figure out what needs to be done -- but who defines what the bar code might mean?
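To make the idea concrete, here's a minimal sketch of what a metadata-driven placement policy might look like. To be clear, this isn't the actual Atmos policy language; the field names, values and rules are invented purely for illustration:

```python
# A toy policy engine: read the "bar code" (metadata) on an information
# object and decide where copies should live. The field names, values and
# rules are invented for illustration; this is not the Atmos policy language.

from dataclasses import dataclass, field
from typing import List

@dataclass
class ObjectMetadata:
    classification: str              # e.g. "regulated", "internal", "public"
    business_value: str              # e.g. "high", "medium", "low"
    hot_regions: List[str] = field(default_factory=list)  # where demand is coming from

def placement_policy(meta: ObjectMetadata) -> dict:
    """Balance cost, performance and risk mitigation for one object."""
    decision = {"replicas": 2, "regions": ["primary"], "tier": "capacity"}

    if meta.business_value == "high":
        decision["replicas"] = 3             # risk mitigation: more copies
        decision["tier"] = "performance"     # user experience: faster tier
    if meta.classification == "regulated":
        decision["regions"] = ["primary"]    # compliance: stay in one jurisdiction
    else:
        decision["regions"] += meta.hot_regions  # place copies near demand

    return decision

print(placement_policy(ObjectMetadata("internal", "high", ["emea", "apac"])))
# -> {'replicas': 3, 'regions': ['primary', 'emea', 'apac'], 'tier': 'performance'}
```

The mechanics are the easy part; the hard part is deciding what "regulated" or "high business value" actually mean for your organization. That's precisely the "who defines the bar code?" question.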
Balancing Between Proactive and Reactive
Demand patterns are notoriously difficult to forecast -- in both the physical and digital world. It's easy to throw up your hands and say "well, we'll let usage patterns drive on-demand logistics".
The digital example might be someone at a remote location starting to bang on a set of large digital objects. The policy engine figures out that the end user is getting miserable response times, and initiates a temporary copy closer to the user.
The scenario is essentially the same if we were trying to move a workload from *here* to *there* over a meaningful distance: performance is miserable until both the application and its information are moved.
That could work in some scenarios, but -- let's face it -- the user (or application) gets a miserable experience until the situation is detected, and the local temporary copy is made. From a user experience perspective, the damage has been done. Far better if we were smart enough to get a local copy in place *before* it was needed.
And that is the hard part of logistics in both the physical and digital world -- getting something in the right place *before* it's actually needed.
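Here's a minimal sketch of the difference between the two approaches, with made-up site names, thresholds and a deliberately naive forecast (real demand forecasting is much harder, which is exactly the point):

```python
# Reactive vs. proactive placement, sketched with invented names and
# thresholds. Reactive: make a local copy only after users have already
# suffered. Proactive: use a (naive) demand forecast to pre-stage the copy.

RESPONSE_SLA_MS = 200

def reactive_placement(observed_latency_ms: float, site: str) -> str:
    # By the time this triggers, the damage to user experience is done.
    if observed_latency_ms > RESPONSE_SLA_MS:
        return f"copy object set to {site} (users already saw {observed_latency_ms} ms)"
    return "do nothing"

def proactive_placement(daily_accesses: list, site: str) -> str:
    # Naive forecast: if accesses from this site are trending sharply up,
    # stage a local copy before the spike fully arrives.
    recent, earlier = daily_accesses[-7:], daily_accesses[-14:-7]
    if sum(recent) > 1.5 * max(sum(earlier), 1):
        return f"pre-stage copy at {site} ahead of forecast demand"
    return "do nothing"

print(reactive_placement(450, "singapore"))
print(proactive_placement([1, 2, 2, 1, 3, 2, 2, 4, 6, 9, 12, 14, 18, 22], "singapore"))
```

The reactive path only fires after the service level has already been blown. The proactive path depends entirely on how well you can anticipate demand, and that, in turn, depends on knowing your users and your data.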
Know Your Users, Know Your Data
In the physical logistics world, there are powerful applications that help organizations manage and optimize their distribution and supply chains. It's not unreasonable to expect that -- over time -- we'll see the same sort of thing in the digital domain.
But -- as powerful as these applications might be -- they simply amplify and extend organizational knowledge about how things should be done. They don't eliminate the need for this sort of internal knowledge.
Bridging from the physical to the digital world, the same scenario will likely be true as well. There will need to be people in the organization who understand the enterprise information base -- what it is, how it's used, and where it needs to be to extract the maximum value at the minimum cost and risk.
Will we see job descriptions for "information logistics expert" in the future?
Chuck, this post got top reviews in Twitter from @SteveDupe and @Valb00. I concur. Nice piece of writing here.
Posted by: marc farley | October 22, 2010 at 11:57 AM
One of your better posts Chuck. You're definitely one of the few people in the storage industry that I feel genuinely understands the challenges of information management...many in the storage industry do not.
In a sense the "information logistics" expert already exists. It's the logistics specialists' role, after all, to identify and find ways to obtain the data necessary to keep the trains running on time...before the trains leave the station. Truly optimal physical logistics isn't possible without the support of information logistics. Which is why the field is simply referred to as logistics.
The bridge you describe from the physical world to the digital world exists. The challenge is to apply it to other areas of business outside the logistics department...and with that I agree 100%.
Posted by: josephmartins | October 22, 2010 at 12:15 PM
Chuck,
Great post, catchy tune. It seems almost like a turn back in time to something akin to IBM MVS JCL, which staged data in place prior to executing an application. The amazing thing is that cloud computing is getting to the same place only 30 years later. But batch died years ago?
There are some open source tool sets (see the Condor Project at the University of Wisconsin) that attempt to do this for big data/big computing: moving data into place so that execution can take place.
Do you think cloud computing is bringing back the batch paradigm?
I think there's a blog post here...
Ray
Posted by: Ray Lucchesi | October 22, 2010 at 01:06 PM
Chuck,
Check out this relevant post from our old pal, Mike Kilian. His research overlaps with your ruminations here.
http://mfktech.wordpress.com/2010/10/12/better-mobile-video-delivery/
Posted by: Steve Todd | October 22, 2010 at 02:50 PM
Chuck,
I am interested in discussing this idea with you in more detail. I am working for the US Navy to help them understand how and why this is important to them. Your posting here is just about the ONLY thing out there on this subject.
thanks,
John
Posted by: John | October 04, 2011 at 11:35 AM
Hi John
Yes, I'm very aware why this particular topic might be interesting to the Navy -- it's come up before. If you're interested, please drop me a line at chuck dot hollis at emc dot com, and we can maybe find some time to chat before too long?
-- Chuck
Posted by: Chuck Hollis | October 04, 2011 at 11:39 AM
I strongly agree that information flow is the only way to streamline physical logistics and material flow.
Posted by: Ben Benjabutr | January 24, 2012 at 08:58 PM