No, I'm not talking about new legislation that might make for more work. I think we've seen enough of that, but we all know there's more coming.
I mean laws in the sense of Moore's Law (or Murphy's Law) -- inexorable principles that will guide our thinking about information and information management over the next few years.
Today's post was triggered by an excellent post by Mark Lewis, in which he outlined 8 laws ("information laws") around what he calls "Information 2.0".
I don't know if he was looking for a discussion on his proposal, so apologies in advance if this was not the case ;-)
I wholeheartedly agree with the first premise -- we'll need some new architectural thinking in IT that shifts how we think about information. And the notion of guiding principles (e.g. "laws") works for me.
And, now that you mention it, I agree with the second premise that this changed mindset around information will probably need a new label, hence "Information 2.0". The evidence is overwhelming.
So, please allow me the opportunity to expound and expand on Mark's proposed laws.
First Law -- Information Decoupled From Applications
Couldn't agree more.
I see one of the most serious barriers to managing information intelligently as the fact that applications "own" the information they create. Any understanding of metadata, policy, security, etc. seems to be inextricably bound up in application logic (or lack thereof!).
It's not hard in a ten-minute conversation to convince someone that this application-defined approach to information management is a bad thing, from not only an operational perspective, but a strategic perspective.
It's kind of like convincing a baby boomer that they need a retirement plan.
And, although some application vendors are starting to open up various interfaces that allow some of this to be done, I am privately concerned that the tendency may be for application vendors to protect their walled garden, to the detriment of customers.
But there are alternate paths to this world.
One is external classification, e.g. tools that pick up a piece of information, and decide what it means, and what to do with it. Works well for files and emails (it's being done today), a bit more problematic with business transactions, but not impossible.
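To make the external-classification idea a bit more concrete, here's a minimal sketch: rules inspect a piece of information from outside the application that created it, and attach meaning-bearing tags. The rule set and tag names are invented for the example, not any real product's.

```python
import re

# Illustrative classification rules: each pairs a tag with a pattern
# that, if found in an item's name or body, assigns that tag.
RULES = [
    ("invoice",  re.compile(r"\binvoice\b", re.I)),
    ("contract", re.compile(r"\b(agreement|contract)\b", re.I)),
    ("hr",       re.compile(r"\b(salary|benefits)\b", re.I)),
]

def classify(name, text):
    """Return the set of tags whose rule matches the item's name or body."""
    blob = name + " " + text
    return {tag for tag, pattern in RULES if pattern.search(blob)}
```

The point of the sketch is that nothing here knows or cares which application produced the file or email -- the classification logic lives entirely outside it.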
Another avenue is third-party tools that work closely with application logic. This only makes sense for the biggest application ecosystems, e.g. Exchange, SAP, Oracle Apps and so forth.
And, of course, in a SOA world, there's a nice information bus to interact with, so that's another avenue.
The bigger challenge here is mindset change, in my humble opinion. Just taking the simple yet fundamental step of thinking of information separate from the application seems to be huge for so many people.
And taking that perspective leads to all sorts of different decisions around applications, architecture and the like.
Second Law -- Information Accessible Via Web Services
I think I agree more with the spirit of this proposed law, rather than the actual words.
Mark's point (I think) is that information needs to be accessible in an unmediated fashion using a standard and simple protocol. I would quibble that web services is but one way to do this (there are other approaches), but that's minor.
The other sticking point in my mind is that all that metadata (described later, especially as it applies to information security) has to be mediated by some sort of service, but maybe that's more quibbling.
But I think that this core point flows from the previous one -- for information to be decoupled from applications, you'll need a standard access method to get to it.
And that part makes sense.
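A toy sketch of what a standard, application-neutral access method buys you -- here a simple in-process get/put interface rather than a real web-services stack; the store, object IDs, and metadata fields are invented for illustration:

```python
# Any producer puts information in through one call; any consumer gets
# it back (content plus metadata) through another, with no knowledge of
# the application that created it.
class InformationStore:
    def __init__(self):
        self._objects = {}

    def put(self, object_id, content, metadata=None):
        self._objects[object_id] = {"content": content,
                                    "metadata": metadata or {}}

    def get(self, object_id):
        """Uniform, application-neutral retrieval."""
        return self._objects[object_id]

store = InformationStore()
store.put("po-1138", "purchase order body", {"type": "purchase-order"})
record = store.get("po-1138")  # any consumer, same protocol
```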
Third Law -- Information Metadata Is Integrated With All Data
Several big ideas here, so let's step back a bit.
Most people realize that to effectively manage information, you're going to need metadata, and a lot of it. Separate challenges about creating it, separate challenges about using it effectively.
Mark posits that this metadata needs to be stored in an integrated fashion with the data itself, e.g. self-contained tagging.
Although I like the concept, I smash into the brutal world of technology reality. For one thing, we'll need new file and object formats that don't really exist today. And they'll probably need to be dynamically extensible as we figure out new forms of classification we might need in the future (ugh).
I would offer that, although Mark's proposed law is a desirable end-state, we'll probably be living with external metadata repositories for the foreseeable future.
First, it's the only pragmatic approach with today's technology, and -- more importantly -- we'll have the time to figure out what metadata bits need to be bound to the file or object, and which ones really make sense to manage externally. I wouldn't even try to hazard a guess on that one today.
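Here's a minimal sketch of what such an external metadata repository might look like: metadata keyed by a content hash, stored entirely apart from the file itself. The fields and example content are assumptions for illustration only.

```python
import hashlib

# Metadata lives in an external repository keyed by a content hash,
# rather than embedded in the file format itself.
class MetadataRepository:
    def __init__(self):
        self._meta = {}

    @staticmethod
    def key_for(content: bytes) -> str:
        # Content-addressing survives renames and copies of the object.
        return hashlib.sha256(content).hexdigest()

    def tag(self, content: bytes, **fields):
        self._meta.setdefault(self.key_for(content), {}).update(fields)

    def lookup(self, content: bytes) -> dict:
        return self._meta.get(self.key_for(content), {})

repo = MetadataRepository()
doc = b"2007 retention schedule"
repo.tag(doc, classification="internal", retain_years=7)
```

No new file or object formats required -- which is exactly why this is the pragmatic interim approach, even if embedded metadata is the nicer end-state.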
But it's a nice thought.
Fourth Law -- Information Security Is Explicit and Built In
Couldn't agree more. Mark's description is accurate -- information needs to learn how to protect itself. Think encryption, DRM, logging, etc.
One way of looking at this is as a special case supported by the previous three laws. If you've separated information from application, used a standard access method to get at it, and have metadata associated with your information -- well, then, you're in a good position to tackle information security.
But I see us all spending the next few years on risk remediation, rather than re-architecture. I think there will be enormous interest in finding and securing information that's in the wild and shouldn't be.
Like looking for files with sensitive stuff in them. Or putting real access control on internal applications.
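A crude sketch of that kind of remediation scan. The two patterns below (US-SSN-like and card-number-like digit runs) are deliberately simplistic stand-ins; real tools use far more careful detection and validation.

```python
import re

# Illustrative "sensitive stuff" patterns -- not production-grade.
PATTERNS = {
    "ssn-like":  re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card-like": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
}

def scan(text):
    """Return which sensitive-looking patterns appear in the text."""
    return sorted(name for name, pat in PATTERNS.items() if pat.search(text))
```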
I think the industry will then move on to the architectural concepts that make much of this after-the-fact remediation largely unnecessary. You can see many of the core technologies coming to market (DRM, repositories, policy managers, log analyzers) -- many of them in the EMC portfolio -- but I believe that it will take one or two application architecture cycles before the technology gets baked in, rather than added on.
Fifth Law -- Information Optimizations Are Built In As Services
Mark's concepts refer mostly to storage related infrastructure -- performance, protection, cost-optimization, and so on. And he's right. And it seems to flow from the first three laws above.
But I think Mark stops a bit short of the idea's full potential.
He's mostly talking about infrastructure-related services -- but there's another aspect to "information optimization", and that's enhancing business value of information you already own.
Find some bit of customer information? Make sure that it's exposed to other business functions that could use it.
See an email stream on a cool project? File it away as part of a knowledge management concept.
I would argue, as long as you're collecting metadata, why stop short? So much information stored in companies today has additional business value, so let's add that to the concept.
Sixth Law -- Information Is Personalized
Wow, that one hit home.
Every now and then I have to go gather information across the EMC expanse, and we're talking weeks, folks, just to get a consistent picture to emerge. I can't imagine a more unproductive use of time.
For me, this ties together multiple themes I like to write about -- the fact that we're changing into an economy of knowledge workers, the fact that most of our useful information doesn't live in databases any more, and so on.
Or, put differently, is Enterprise Search one of the killer apps of this decade?
For some companies I've met, they've already decided that the answer is "yes", and they're investing appropriately. And I think there's still far more innovation that can be done on the topic, just like web searches are evolving rapidly.
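For the curious, the core data structure behind any search engine -- enterprise or web -- is an inverted index: map each term to the documents containing it, so a query becomes a fast set intersection. Here's a toy version; the document IDs and text are invented.

```python
from collections import defaultdict

# term -> set of document IDs containing that term
index = defaultdict(set)

def add_document(doc_id, text):
    for term in text.lower().split():
        index[term].add(doc_id)

def search(query):
    """Return IDs of documents containing every term in the query."""
    results = [index.get(t, set()) for t in query.lower().split()]
    return set.intersection(*results) if results else set()

add_document("memo-1", "project falcon budget review")
add_document("memo-2", "budget review for marketing")
```

The innovation happening in this space is mostly in what gets layered on top -- ranking, security trimming, metadata-aware filtering -- not in this basic structure.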
If I were feeling adventurous, I'd extend Mark's concept a bit and add "... and collaborative" to the description. More and more of the work we do today is essentially collaborative in nature, with non-linear workflows.
It's not just my information, it's my view of the work processes I'm involved with.
Seventh Law -- Information Is Delivered in Both Real-Time and On Demand
OK, this one doesn't give me any significant insights (sorry!), because I think I already live in this world for many types of information.
Sure, it could be better (Mark's gripes about the network vs. PVR court ruling notwithstanding), but there are not many "aha's!" for me with this one.
My personal vote would be to drop this law, and replace it with an alternative one, suggested below.
Eighth Law -- Information Is Always Simply Available
Amen.
If we're becoming a society dependent on information, it better be available when and where we need it.
Ditto for drinking water, electric power, health care, cell phone coverage, and so on.
I think the "availability matrix" that some organizations go through when planning business recovery will look painfully outdated in just a few years.
And, going farther, I think we'll find that the balance between the cost benefits of segmenting out different classes of information with associated protection levels, and the expediency of just protecting everything homogeneously, will begin to tip for more and more organizations.
See an interesting article illustrating this point here.
So, do we need another law?
I think so.
One that speaks to accountability for broader information management issues, which are very poorly defined in many organizations.
And a topic that I've written about incessantly.
I respectfully propose a new law -- Information Needs An Owner. Someone who looks out for the costs, value and risks associated with the entire information portfolio, rather than just isolated aspects of it.
Hopefully someone with an informationist perspective ...
Anyone else want to weigh in?
Hmmm. Is there anything in here about information having value? Obviously some information bits are going to be more valuable than other bits. Whether that value derives from compliance needs (PCI, HIPAA, etc.) or from ownership or just plain corporate need. If you lump this into metadata, then it should be one of the most important pieces of metadata, since it drives the decisions for classification, protection and transferability.
Posted by: PlanetHeidi | April 05, 2007 at 07:11 PM
This discussion reminds me of the days in the 80's when Corporate Technical Architectures were the rage and we were all on a quest to define and build Enterprise Meta-data Dictionaries.
So what really has changed since then? We've done a better job (though nowhere near perfect) of defining information sources and a somewhat better job of amassing the information into different types of systems based on use cases (i.e. OLTP, Warehouse, OLAP). We still seem to let informational bandaids get created that pollute our information in terms of traceability, business value over time, and who the bottom line owner is.
I agree that we need some informational laws - but who is going to legislate (mandate) them (think about global requirements for example) and who is going to execute compliance?
One final thought -- though not necessarily an information law -- but I think there also needs to be a way to collect information about how information is used as well as what it is used for. By that I mean -- let's say for example that I create a spreadsheet that creates some new derived data on market opportunity and I'm in the marketing group. The spreadsheet gets stored and now someone in engineering needs my data to create a proposal to develop a new/better product using my information as a component of their presentation. Since the original data is derived -- how does the new use of the information -- i.e. its original context -- get protected and conveyed when it is re-used and presented? Don't we need an information law that protects the information's context and re-usability? And to take it a step further -- perhaps evaluate the information's use over time and project its future use based on usage patterns?
The analogy that I can think of in the "real world" is how new words get incorporated into a language such as "ain't" for example. When I was a kid - we were not supposed to use "ain't" and yet today it is considered acceptable due to common usage. So should we have a common usage "law" for information? Or should we protect our information from common usage? I believe information common usage has social implications too ... but that's another post :)
Posted by: Wayne Pauley | April 11, 2007 at 12:59 PM
Thank you for the very insightful (and extended) comment. You raise many good points that might be outside the scope of my response.
Although I agree data architects did a good job with structured data, I think it's all being undone with the rise of unstructured information. Specifically, once that well-classified and well-understood data field lands in someone's PowerPoint, all hope is lost, at least from a data management perspective.
I shudder to think what will happen when governments legislate IT practices, but that is exactly what appears to be happening, albeit gradually. I would like to see IT organizations self-govern in this regard, but it may not be achievable.
It should be noted that I meant "laws" in the sense of architectural guidance, and not punitive legislation, per se, but your point is still valid.
I really liked your suggestion about gathering information on how data is used. Not only to better understand its value, but -- ultimately -- to provide audit trails in regards to managing sensitive information. I don't know what to call it, though, as metadata doesn't really do the job. Usage data? Nominations are open ...
And, finally, my head spins too when I consider the social implications of common usage concepts for information. Especially if it's my personal information.
Thanks again, your comments are always welcome!
Posted by: Chuck Hollis | April 12, 2007 at 01:35 PM
Besides the points raised by Wayne, here are a couple of added thoughts I had.
First, the use of the term "Law" and the comparison to an observed law such as Moore's Law is a little bold. Excellent discussion points they are; laws they are not. Only time will tell.
Second, I like the idea of usage or audit tags within the metadata. If you combine this idea with the information gathering issue you have in the "Sixth Law", I think it needs to go a step further. The issue is often reconciling conflicting information you get from different parts of the organisation, or poor data quality. I always liken this to digital (data) to analogue (information) conversions. These conversions happen many times as data is passed among, and information derived from, the structured applications. This creates the variations you see relative to the original. Now an audit trail would let you get back to the source. But a family tree would let you understand where else and how that source data has been used within the organisation. This, to me, makes a strong case for the external metadata repository of the "Third Law".
Posted by: TonyS | April 18, 2007 at 05:37 AM
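TonyS's "family tree" idea can be sketched in a few lines: record each derivation as it happens, then walk backward for the audit trail and forward for the tree. The node names below are invented for illustration.

```python
parents = {}    # derived item -> item it was derived from
children = {}   # item -> items derived from it

def record_derivation(source, derived):
    parents[derived] = source
    children.setdefault(source, []).append(derived)

def trace_to_source(item):
    """The audit trail: walk back to the original data."""
    while item in parents:
        item = parents[item]
    return item

def descendants(item):
    """The 'family tree': everything ultimately derived from this item."""
    out = []
    for child in children.get(item, []):
        out.append(child)
        out.extend(descendants(child))
    return out

record_derivation("crm-table", "marketing-spreadsheet")
record_derivation("marketing-spreadsheet", "engineering-proposal")
```

Note that this mapping lives outside any of the applications involved -- which is exactly the argument for an external metadata repository.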