Big Data Lacks Big Questions


Nowadays, people are all talking about Big Data, and people all think Big Data is useful and meaningful. However, I found an interesting article, which pointed out the problem “Big Data actually lacks Big Questions”.

 In all the excitement and curiosity about analytics, a fallacy has been allowed to persist: that analytics is linear. Conventional wisdom sees analytics as the process of asking the right question, applying the right analytic technique, and waiting for the right answer. Short, sharp, shocks, and voila—an insight.

Analytics – like life – is far more nuanced. Almost anything worthwhile, whether it’s creating a work of art, training an Olympian, or doing sophisticated data analysis, requires a combination of techniques. Big data analytics should be more like conducting an orchestra. When each musician plays his or her instrument independently, the cacophony is enough to drive away the most patient critic. Under the direction of a skilled conductor, the orchestral swells will take your breath away.

In analytics, the emerging term is multi-genre analytics, where a number of analytical techniques—text, path, graph, and statistics—are brought together seamlessly to answer pressing and complex questions. True multi-genre analytics should provide not only the ability to mix and match techniques, but the ability to do so easily. Conceivably, you could buy 10 different solutions and shoehorn them together. This is difficult and typically results in all kinds of issues.  Having multiple techniques available in one solution makes the analyst’s job easier.

You know those eureka moments the big data world loves to crow about (and advertise)? Ever wonder when your moment will happen? Multi-genre analytics is an approach meant to deliver those kinds of results. Its foundation lies in getting us out of the habit of asking relevant but small questions like “Why do I have so much inventory on the shelf?” or “How could we increase customer retention rates?”.  Data is good at providing descriptive answers to these kinds of questions, but not necessarily at supplying answers that explain the bigger picture. In fact, taking a narrow view of the problem can lead to a false conclusion.

Consider a web retailer with a large inventory of designer jeans. They can see that $50 jeans are flying off the shelf, but $100 designer pairs are sitting in an expensive heap. Upon confirming this with a database query where the actual sales volumes are much lower than forecast numbers,  the  obvious answer is to put the designer jeans  on sale, reduce the inventory, and focus on the less expensive brand, right? Clearly, price is the only motivating factor for customers, isn’t it?

Wrong! As it turns out, other techniques including website path analysis, text analysis of customer feedback, sentiment analysis of social media, and graph analysis —all distinctly different analytics techniques with each delivering insights complementing the others—revealed a fuller picture: people weren’t complaining about price, preferring the cheaper item, or any of the things that the retailer expected. Instead, customers were complaining about how hard it was to find designer jeans on the website. It wasn’t a problem of demand. It was a website navigation issue. And the issue was invisible until the retailer made sense of analytics from a variety of sources.

So why isn’t everybody doing this already? It’s not for lack of trying. Plenty of enterprises conduct combinations of text analysis, Hadoop-based path analysis, graph analysis, and machine learning. The problem is that they use separate systems and stitching their results together yields a Frankenstein’s gallery that do not form a coherent picture. Feeding the output of one analytic technique into another technique can be cumbersome without the right technology in place. And consider the technical challenge of solving this on your own. You need to know how to program Hadoop, the text analytic systems, and manage the graph database if you are going to make them all dance together. Until you have a platform in place designed to handle and abstract that complexity, you are likely to be forced to accept simple answers to smaller questions.

Whether you’re selling jeans, approving loans, developing medicine, or maintaining the disclosure wall between your bank and its clientele, multi-genre analytics is your opportunity to take a bigger look inside your big data to find bigger answers than you expect.  Better still, you can invite more people to start thinking along these lines and get them invested in the analytics process, spurring them to ask bigger questions as well.


This entry was posted in Uncategorized. Bookmark the permalink.

One Response to Big Data Lacks Big Questions

  1. sydhavely says:

    Extremely relevant and insightful, Xingyun. Indeed Big Data and those who rely on its information need to understand its nuances and the questions that need to be put to Big Data in order to fulfill its potential. Thank you.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s