Many people are now aware that big data analytics is quickly becoming the next competitive ante: to improve existing business processes, create new ones -- even a foundation for entering entirely new businesses.
The race is now on to acquire -- and maximize the productivity of -- the key talent behind this wave: data scientists and their supporting data science teams.
At EMC, we've been working hard to understand who these people are, what makes them different, how they work -- and what they think is important.
Today, I'm pleased to share with all of you a few key highlights from a recent survey of data scientists and traditional BI analysts.
The work was conducted jointly with StrategyOne to get an "inside-out" view of how these people view the world, and -- especially -- how these data science folks are markedly different than the BI analysts that we're more familiar with.
Lots of useful nuggets here ...
The Basics
In most of my customer discussions, we're at something of an interesting plateau.
Business and IT leaders realize that developing big data analytics proficiency could mean a lot to their business. There's not a lot of focus on "making the case"; it's more about "where can we get started?" and "what use case should we target first?"
These same business leaders are now inevitably curious about the new skill set: data scientists and the data science they practice.
The reference point for many is comparing and contrasting these new individuals with the more familiar business analysts that are scattered across most corporate landscapes. This isn't a judgement statement against or for business intelligence analysts per se, just a familiar starting point for exposing the key differences.
The survey is reasonable in methodology, as far as such studies go: decent reach and depth enough to draw some interesting conclusions.
The Headlines
First -- and not surprising -- is the acknowledgement by all participants that demand for this skill set has already exceeded supply, and this will likely continue into the future.
This is not exactly the first data point confirming this; but it's interesting to note that practitioners acknowledge this as well.
The "technology enablement" side is a rather new perspective: 83% of respondents think that the availability of new technologies will increase demand for the knowledgeable individuals who can harness their potential.
And, soberingly, only a small proportion believe that current BI professionals can fill this gap.
There is clear agreement that "data science" (however you choose to define it) is a fundamentally different profession, with a different profile than the BI analysts that came before it. Data scientists are more likely to have advanced degrees, frequently have a background in the sciences (vs. business) and they interact with data in more ways -- and using different tools.
I feel a bit justified with my honest assessment: data science is not your father's BI.
Finally, it's clear that data scientists are essentially "data experimenters" vs. rote analysts -- and they're likely to be interacting with IT functions in far more positive ways than the norm.
What Else Is Holding These People Back?
I found this result illuminating, because it reflects my experience in talking to people who are investing in this area.
Note the responses. Some are familiar -- things like more budget and resources.
But consider that a third of practitioners will point to needed skills and expertise outside their own function -- a "general proficiency" requirement we're trying to address with new EMC educational offerings such as this.
I see the "wrong organizational structure" and "insufficient executive support" responses as two sides of the same coin: about a third of practitioners don't feel their company is organized for success. Of course, that probably has something to do with both the "lack of resources" and "lack of broad-based skills" observations.
Like anything else meaningful, you've got to organize for success.
I've made a bit of a practice around sharing with interested parties the patterns we've found in how proficient data science teams are organized (internally and externally aligned) as well as the interesting journeys of how these proficient functions came to be.
Data Scientists: Better Educated -- And Broader
These next two charts tell a useful story.
The first juxtaposes the educational profile of the BI analyst and that of the self-identified data science professional. Note the bias towards advanced degrees with the latter.
This corroborates my own experience -- I recently met one fascinating gentleman who had three PhDs in seemingly unrelated fields. You'll always find a strong sense of intellectual curiosity coupled with "show me the data" skepticism in this crowd.
Perhaps more interesting is the educational profile behind the advanced degrees -- data science professionals are twice as likely to have come from an analytically-intensive scientific field vs. the normal business background of the BI analyst.
In my mind, the "science" part of data science is abundantly clear: many of these people are scientists in their own right, and are quite comfortable applying data-driven scientific methods to different fields of pursuit -- like understanding consumer behavior.
Data Scientists Touch Data Across The Entire Data Lifecycle
Another interesting finding to consider.
While many traditional BI analysts are simply functionaries in a larger information-gathering-and-analysis supply chain, the precise opposite seems to be true with data science professionals.
As you can see here, they're involved in everything from sourcing new data sets (usually from outside the company!) to telling data-driven stories to business stakeholders with the intent of positive change.
This realization inevitably leads to a discussion around tools and platforms that help them do all of this using a single set of tools integrated around their particular workflows (stay tuned for more on this very soon!).
If (a) data scientists are scarce, valuable talent, and (b) they have incredibly broad reach across the data lifecycle, then it logically makes sense to invest in tools and platforms that help them do what they do better -- and with less effort.
That's the way we're looking at it, anyway.
... And Work Across The Organization As Well
Yes, one more interesting finding to ponder.
In many situations, data scientists find themselves working across the entire organization (in addition to other data scientists, of course).
Look at some of the roles they say they work with frequently: graphic designers, HR professionals, marketing, sales, etc. -- clearly not just technological professions.
One of the questions I often get asked is "is this an IT function?".
While IT clearly has a role to play in enabling data science, this slide makes it pretty clear that data science is a business function, and not a functional support role.
Additionally, when we consider traditional BI analysts, many of them are embedded in one or another functional group: sales, manufacturing, marketing, etc.
This is clearly a different beast.
What Are They Interested In Learning More About?
OK, I wasn't expecting this, but -- upon further consideration -- it makes a certain sense.
When these data science professionals were asked "what would you like to learn more about?" the top two answers were data storage and cloud computing.
While I'm obviously pleased that the survey points to two of EMC's core strengths, it took me a moment to realize that raw resources are very much on these people's minds: more storage, more compute.
Perhaps that has something to do with the "we need more resources" observation above :)
And Where Do They Like To Work?
OK, this is going to be a tough one for many business leaders.
The overwhelming majority of data science respondents prefer to work in ostensibly smaller settings. Think focused teams, collegial work environment, easy to navigate the organization, and so on.
When you consider that most larger enterprises that could enormously benefit from applying predictive data science techniques are somewhat larger in size than the buckets given here, you're immediately struck by an interesting management challenge.
The structure, isolation and inflexibility of most corporate environments appear to be something they're not warmly embracing ...
How do you create a smaller, well-resourced setting in a much larger environment that will attract the key talent you'll need?
And that -- above all else -- will probably end up being the magic key to learning to compete through predictive analytics.
Lovely post and interpretation of the data, Chuck. Thank you. To your final, very smart observation and question: Perhaps, like physicians and medical specialists, the data scientists will take up residence in their own kinds of data clinics, so to speak, treating individual businesses like patients? It makes complete sense that anyone with such far-reaching curiosity, training, and intelligence would not aspire to life in the corporate cubes!
Posted by: Andersonb123 | December 05, 2011 at 01:03 PM
I think you're right. We'll undoubtedly see boutique consultancies spring up, possibly aligned by vertical, that create the sort of environment that advanced practitioners want.
At the same time, don't entirely count out large organizations who'll need some of this capability in house.
Back to your medical analogy, there are plenty of doctors who work in large hospitals :)
-- Chuck
Posted by: Chuck Hollis | December 05, 2011 at 01:56 PM
Thanks for publishing the study. Really interesting information. I'd be interested to get your thoughts on the value that data visualization brings to the table as a form of discovery and exploration.
Posted by: Ben Hosken | December 05, 2011 at 09:32 PM
Interesting information, but a good data scientist would discourage the use of 3D pie charts =)
Posted by: Matthew Grogan | December 16, 2011 at 07:11 AM
Indeed!
Posted by: Chuck Hollis | December 16, 2011 at 10:48 AM
http://mashable.com/2012/01/13/career-of-the-future-data-scientist-infographic/
Posted by: Chad | January 17, 2012 at 04:39 PM
Nerds rock! Thanks for posting the results.
Posted by: Rick Sherman | January 20, 2012 at 12:25 PM
Interesting survey and data. EMC is leading the crest of the new wave of data science. :) Check out my blog on data illuminations: http://dataillumination.blogspot.com/
Posted by: Peter Chen | February 29, 2012 at 11:12 PM
Chuck - I've just rediscovered your blog, and I'm taking a couple of hours to read through old material. I must say I am really enjoying your thoughts and reflections on data sciences, especially the organisational requirements to make predictive analytics succeed for a business.
You mention that you have "made a bit of a practice around sharing with interested parties the patterns we've found in how proficient data science teams are organized (internally and externally aligned) as well as the interesting journeys of how these proficient functions came to be". Anything you can share with me would be great!
Posted by: Tim Johnson | June 10, 2012 at 05:27 AM