Six Provocations of Big Data – August 11

Joe Provocations Of Big Data

What is the hottest topic on the technology circuit today?  It’s not Facebook.  It’s not the next Apple iPhone.  It’s the explosion of Big Data (which deserves a Big B and D throughout this essay) and what that means for just about every human on the planet.  Big Data is exciting.  It holds a world of promise for everyone from engineers to marketers to the bean counters.

Big Data refers to data sets that are so large that they break traditional IT infrastructures.  More and more, companies, academic institutions and governments are finding the path toward answers leads to Big Data.  But where analyzing Big Data historically has required a supercomputer, new analytics tools and technologies (including virtualization) are making Big Data a business tool for all.

Enter Boyd and Crawford, who do not dare to denounce or renounce the Big Data revolution, but offer six thoughtful counterpoints or warnings to everyone who is jumping on the bandwagon of Big Data.  And by analyzing this article I learned to respect opposing opinions, question the hype of business (including my own employer, EMC), and realize that there are big implications to Big Data.

The first and deepest revelation from the authors is that Big Data will force a new way of learning upon those who analyze it.  I agree.  EMC is sponsoring an entire curriculum around Big Data – coursework and certification that focuses on Technical Ability, Analytical Ability, and Business Acumen.  The EMC Data Scientist certification is meant to be what Microsoft Certification was at the turn of the century – the gold standard of certification.

The next provocation revolves around claims to objectivity in Big Data.  For example, let’s examine the ability to take a quantitative approach and apply it to social spaces.  The fact remains that at the end of the day, a human being will be taking the analytics and interpreting them for the end customer/user.

Boyd and Crawford continue with another provocation entitled, “Bigger Data is not always Better Data.”  They use Twitter as the example of Big Data that cannot always be analyzed concisely.  Not all Twitter users and tweets are the same!  In a nutshell, the authors implore the reader to recognize that size isn’t everything when it comes to data, and that an inquiring mind must still impart his/her knowledge to provide the correct, rich analytic experience.

Equally interesting is the notion that not all Big Data is equal.  Their most vivid illustration of this is the three types of networks analyzed for the article.  Articulated Networks are the most straightforward.  Think about your Outlook address book – we all have one.  How up-to-date and accurate is it?  Are all connections in that address book of equal value?  (Only the creator will know for sure.)  Second, there are Behavioral Networks that are derived from communication patterns.  Think about how you use your LinkedIn Connections to network.  Finally, there are the unique Personal Networks that are lurking behind the scenes: everything from your local grad school alumni network to my network of high handicap golfing buddies.  Analyzing the personal network becomes more complicated than analyzing the larger, articulated network.

The authors then make two very strong arguments for social responsibility and Big Data:  (1) Taking an ethical approach to Big Data and (2) Warning of the impending digital divide of Big Data.

On ethics, the article raises very legitimate concerns.  When there is a wealth of data about each one of us, we lose our ability to give permission for use of that data.  And on the digital divide, will money be a deciding factor in who gets access to Big Data?  While one can argue that Big Data will drive down cost, for some areas like academia or the public sector, access could be restricted by (1) not having the proper talent to analyze Big Data and (2) not having access to the best technology.

Enter another topic near and dear to my heart but not addressed in the article:  the impending shortage of people who are properly trained in Big Data.  Coursework like the Data Scientist certification will help address that challenge, but the need to bring more young students into STEM (Science, Technology, Engineering & Mathematics) is real.  And it is a burden that must be shared at all levels.

In summary, the article takes a very new and exciting topic and puts forth provocations that challenge the way a businessperson like myself views Big Data.  Absolute kudos go out to Boyd and Crawford for effectively putting forth strong, credible counterpoints to the opportunities in a Big Data world.

Leave a comment


  1. Hey Joe, great job with the presentation! I’m going to respond here later when I can give a proper response…

  2. Thanks Eric.

  3. Big Data is definitely HUGE right now. I completely agreed with this argument throughout your presentation. The ethical implications of this topic are the most interesting to me, because it’s quite scary when you think about it – despite how helpful it is.

    Personal questions that come to mind for me are:

    What are companies like yours (EMC) doing with my data?
    Will my behaviors with things like purchases change the outcome of things in the future for better or for worse?
    What will happen with that data? Will it remain archived somewhere or be deleted at some point?

    Look, all of this stuff is great when it’s not harmful. I take a “What I don’t know, can’t hurt me” approach with most of it. But, when it’s used for predicting my future behaviors, that’s when I don’t like it.

  1. Session 4 – Global Systems, Watchdogs and What Comes Next « UW Digital Democracy
