
December 20, 2011


Dave Vellante

Really meaty post Chuck. Great work.

Black box and no sampling (i.e., every grain of sand).

Now that's not too mind bending for most companies :-)

Cheoy Lee

Thanks for this really detailed yet straightforward article explaining what's perplexing to many people!


You've done great work in this post. Thanks for all the in-depth information about data analytics.

Brook Reams


This caught my eye ...
"My colleague Bill Schmarzo covers the topic well in his post on "The Death Of Why" -- as long as you have a predictive model that works, it ends up being much less important to understand "why"."

How would you have confidence the predictive model "works"? A past pattern does not ensure that the same pattern still exists in the future. If all you are doing is finding correlations and then making decisions on them, well, you could lose $2 Billion in hedges.

Color me "skeptical", but interested.

Chuck Hollis


In ancient times, we had great predictive models about seasons, the sun rising, phases of the moon, etc. -- but without a good understanding of the solar system or heliocentricity. You knew what was going to happen next spring, but you weren't exactly sure why.

That's the same point here. Big data predictive analytics often results in "black box" predictive models that predict well but are not always great at explaining the "why". That's a different exercise altogether.

Predictive models are extremely useful when it comes to better pricing risk, as you'll always find in the investment, finance and insurance industries. But better predicting risk is not the same as eliminating risk, so there will always be bad bets to deal with.

-- Chuck

Tim Johnson

Confidence in the model emerges from validating its predictions against data that was not used to build it. Whilst traditional predictive tools such as regression create parameters that can be interpreted, they can still have problems generalising beyond the original data set.

All predictive models need to be validated against more data, as well as interpreted using domain expertise. And whilst black box models such as neural nets do not create explicit parameters, sensitivity analysis can be used to understand how the inputs change the outputs.
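To make the two ideas in this comment concrete, here is a minimal sketch in Python, not anything taken from the original post. It assumes synthetic data, a simple least-squares line as the "interpretable" model, and a made-up `black_box` function standing in for something like a neural net. The helper names (`fit_line`, `mse`, `sensitivity`) are hypothetical and chosen only for illustration.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def mse(model, xs, ys):
    """Mean squared error of model(x) against the observed ys."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def sensitivity(model, inputs, delta=1e-3):
    """Finite-difference estimate of output change per unit change in each input."""
    base = model(inputs)
    result = []
    for i in range(len(inputs)):
        bumped = list(inputs)
        bumped[i] += delta  # nudge one input, hold the others fixed
        result.append((model(bumped) - base) / delta)
    return result

# Synthetic data (an assumption for this sketch): y = 2 + 3x plus noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1) for x in xs]

# Hold out the last 50 points; they play no part in fitting the model.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

a, b = fit_line(train_x, train_y)
line = lambda x: a + b * x
train_mse = mse(line, train_x, train_y)
test_mse = mse(line, test_x, test_y)  # error on data the model never saw

# A hypothetical black-box model: we can call it, but not read its parameters.
def black_box(v):
    return 4.0 * v[0] + 0.5 * v[1] ** 2

sens = sensitivity(black_box, [1.0, 2.0])

print(f"train MSE={train_mse:.3f}  held-out MSE={test_mse:.3f}")
print("black-box sensitivities at [1.0, 2.0]:", sens)
```

If the held-out error is much worse than the training error, the model is probably not generalising beyond the data it was built on. The sensitivity figures only describe the black box near the chosen input point, so in practice they would be checked at several points and read alongside domain expertise.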

The comments to this entry are closed.

Chuck Hollis

  • Chuck Hollis
    SVP, Oracle Converged Infrastructure Systems

    Chuck now works for Oracle, and is now deeply embroiled in IT infrastructure.

    Previously, he was with VMware for 2 years, and EMC for 18 years before that, most of them great.

    He enjoys speaking to customer and industry audiences about a variety of technology topics, and -- of course -- enjoys blogging.

    Chuck lives in Vero Beach, FL with his wife and four dogs when he's not traveling. In his spare time, Chuck is working on his second career as an aging rock musician.

    Warning: do not ever buy him a drink when there is a piano nearby.

    Note: these are my personal views, and aren't reviewed or approved by my employer.

General Housekeeping

  • Frequency of Updates
    I try and write something new 1-2 times per week; less if I'm travelling, more if I'm in the office. Hopefully you'll find the frequency about right!
  • Comments and Feedback
    All courteous comments welcome. TypePad occasionally puts comments into the spam folder, but I'll fish them out. Thanks!