Why big data isn't always the answer

19.08.2015
Listen to much of the well-peddled advice in the enterprise tech world today, and you could be forgiven for coming away with the belief that "big data" holds all the answers your company is looking for. Too bad it often can't live up to that promise -- at least, not in its traditional form.

Turns out, what's commonly referred to as big data -- all those vast "lakes" of numerical measures captured by the enterprise resource planning (ERP), customer relationship management (CRM) and other business systems so enthusiastically mined by today's analytics tools -- actually amounts to only 10 percent of the data an average company has at its fingertips, according to IDC.

The rest is "unstructured" or "qualitative" data, and it can be messy. Included in this type is information from customer surveys, response forms, online forums, social media, documents, videos, news reports, phone calls to call centers and anecdotal evidence gathered by the sales team, to name just a few examples. It's typically textual rather than numerical, and it's not easily "quantified," or turned into numerical values.

Therein lies a problem. While most analytics tools are set up for quantified information -- crunching numbers, in other words -- it's often the unstructured data that provides the context and the meaning companies need to make that information useful.

"Data can often raise more questions than provide answers, and there is always the question of 'why' behind the quantitative data trends," said Anjali Lai, an analyst with Forrester Research. "Data analyzed in a vacuum risks telling an incomplete story, and qualitative data can provide this contextual view."

Imagine you're at a company trying to understand why online sales aren't where you'd like them to be. You can invest heavily in marketing analytics tools that give you activity-based data such as how long users spend on which Web page on average or the abandonment rates for users' shopping carts, for example. Such data, however voluminous, still won't necessarily give you the "why" part of the equation.

"You may know you have 10 thousand unique visitors to your site -- that's quantitative data," said Collin Sebastian, chief product officer for YouEye, which offers software and services designed specifically with qualitative data in mind. "Qualitative data can tell you that four thousand of them came in excited about a particular topic, this is what they hoped to learn, this many did not have their expectations met, and this is what they did instead."

Qualitative data can go beyond identifying correlations between data points, which may tell you, for example, that people who spend longer on your website tend also to make purchases. Instead, qualitative data can begin identifying causal relationships, or that elusive "why." Do people make purchases because they've spent longer on your site, for instance, or are they simply on the site longer because your purchase process is cumbersome?

"The average CMO today is managing up to 14 dashboards," Sebastian said. "It's a case of analysis paralysis: I have 10 million data points that are looking at my question in 17 different ways, but I have no context for what the meaning is."

Historically, the analysis of qualitative data has tended to be very manual and human-intensive. "You could not just submit a database query against a document and get some numbers back that you can feed into a visualization," explained Kirk Borne, principal data scientist at Booz Allen Hamilton.

Qualitative data analyses have typically been limited in scope as a result, but that's starting to change. Not only are there more tools and packages designed specifically for qualitative data, but there are also "increasingly clever ways that qualitative data are being transformed into quantitative data, thereby unleashing the full power of quantitative analytics on the qualitative data also," Borne said.

YouEye conducts online studies using video and audio to record users interacting with clients' websites, advertisements or other materials. Typically, between 50 and 300 users are selected for each study, depending on what the client hopes to learn. Video gets transcribed and coded using a combination of human coders, natural language processing and machine learning. At the end of the process, clients get a highlight reel to illustrate the results.

In the case of a coffee vendor, for instance, "we ran participants through interactions with the product and ended up with a highlight reel of every time they mentioned a competitor instead," Sebastian said. "You immediately have a causal analysis of why you're losing customers -- that's not something quantitative data can tell you."

QSR International is another company that makes software tailored specifically for qualitative data, and its NVivo product is used at Gallup. Though best-known for its national polls, Gallup also consults with organizations to help them understand the emotional aspects of their relationships with their customers, and that involves lots of qualitative data.

"There are certain key research questions that are inadequately captured by quantitative methods alone, including why a customer is actively disengaged or indifferent towards a supplier or the feelings, motivations and thought processes of a customer as they experience a service," said Ilana Ron-Levey, a researcher and strategy consultant with Gallup. "When we partner with an organization, quantitative data allows us to assess how widespread particular opinions are, but qualitative data is essential for us to understand the meaning behind particular frequencies and distributions," she said.

Gallup uses a variety of techniques to assess customers' perceptions. In a recent business-to-business project, for instance, it interviewed more than 100 senior-level customers face-to-face and collected both qualitative and quantitative data on topics focused around customer engagement, Ron-Levey recounted. The team inductively analyzed the qualitative data using NVivo as well as manual coding methods in Microsoft Excel.

Armed with the results, Gallup used the quantitative responses to statistically explain the drivers of customer engagement. It tapped the qualitative data to describe how these drivers looked and felt to customers. From there, "we gleaned how strategies to improve engagement differed by customer type," she said.

Working with qualitative data in software is often a matter of giving it numerical form, such as by assigning a numerical rank or score to specific qualitative responses or comments. In sentiment analysis, for example, the researcher typically assigns a positive or negative value to the sentiment being expressed in the qualitative data, and then assigns a numerical value to the strength of that sentiment, Borne noted.
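
The two-step scoring Borne describes, a positive or negative direction plus a numerical strength, can be illustrated with a minimal Python sketch. The word list, its weights and the `score_comment` function are hypothetical and invented purely for illustration; they are not taken from any tool or vendor mentioned in this article, and real sentiment analysis has to handle negation, sarcasm and context that this toy version ignores.

```python
import re

# Hypothetical mini-lexicon: word -> (direction, strength).
# Both the words and the weights are made up for this example.
LEXICON = {
    "love": (1, 3),
    "great": (1, 2),
    "good": (1, 1),
    "slow": (-1, 1),
    "confusing": (-1, 2),
    "hate": (-1, 3),
}


def score_comment(text):
    """Return (polarity, strength) for one free-text comment.

    polarity is +1, -1 or 0; strength is the absolute value of the
    summed, signed word weights.
    """
    words = re.findall(r"[a-z']+", text.lower())
    total = sum(LEXICON[w][0] * LEXICON[w][1] for w in words if w in LEXICON)
    polarity = (total > 0) - (total < 0)  # sign of the total
    return polarity, abs(total)


if __name__ == "__main__":
    for comment in [
        "I love the site but checkout is slow",
        "Confusing and slow, I hate it",
        "It arrived on Tuesday",
    ]:
        print(comment, "->", score_comment(comment))
```

Once every comment carries a pair of numbers like this, it can be averaged, charted or joined to sales figures like any other quantitative measure.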

Text analytics involves summarizing the content of textual information in quantitative ways, such as via topic models and heat maps, while natural language and semantic processing techniques extract meaning from the language, whether written or verbal.

Transforming qualitative data into quantitative form can involve some subjective decisions. "That is a challenge, but it is also a rich opportunity," Borne noted. "There are far more subtleties and intricacies in language that we can use to extract deeper understanding and finer shades of meaning."

A variety of algorithms are at work in QSR's NVivo, analyzing data for commonly used words or sentences, for example. Numerous visualization tools, including word clouds and tree maps, are also on hand to make interpretation easier.
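
Counting commonly used words is the simplest of these analyses and is the kind of tally that typically sits behind a word cloud. The Python sketch below is a generic illustration only: it does not reflect how NVivo works internally, and the stop-word list, the `top_words` function and the sample responses are all assumptions made up for the example.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list; real tools use much longer ones.
STOPWORDS = {"the", "a", "and", "is", "to", "of", "it", "i", "was", "but", "too"}


def top_words(responses, n=3):
    """Return the n most frequent non-stop-words across all responses."""
    counts = Counter()
    for response in responses:
        words = re.findall(r"[a-z']+", response.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


# Invented survey answers, for illustration only.
responses = [
    "The checkout was slow and the price was high",
    "Checkout is confusing",
    "Price is fine but checkout is slow",
]

if __name__ == "__main__":
    for word, count in top_words(responses):
        print(word, count)
```

In this invented sample, "checkout" appears in all three answers, which is the sort of signal a word cloud would render in the largest type.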

"This gives you a strong visualization of viewpoints and underlying reasons," said John Owen, the company's CEO.

Qualitative data collection tends to be time-consuming and requires a high level of skill by the researcher to establish rapport with respondents and mitigate bias, Gallup's Ron-Levey pointed out.

"One of the most underestimated and oft-overlooked skills that the qualitative researcher needs to have is a sense of empathy," Forrester's Lai agreed. "Often times today, qualitative researchers rely solely on coding without thinking critically about the subtext or context of the data."

Working with the data and validating the research model can also be more complicated. Where it might be sufficient to normalize a set of quantitative data on a zero-to-one scale, for instance, it's trickier with qualitative data, Borne noted.
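
For comparison, the zero-to-one normalization that Borne mentions for quantitative data takes only a few lines. This Python sketch uses min-max scaling, one common way of doing it; the function name and the sample session lengths are assumptions for illustration.

```python
def min_max_normalize(values):
    """Rescale numbers linearly so the smallest becomes 0.0 and the largest 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Invented example: minutes spent on a site by four visitors.
session_minutes = [2, 4, 6, 10]

if __name__ == "__main__":
    print(min_max_normalize(session_minutes))  # [0.0, 0.25, 0.5, 1.0]
```

No equally mechanical step exists for a pile of interview transcripts, which is the difficulty Borne is pointing to.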

"Standard statistical tests that might be testing hypothesis A vs. hypothesis B won’t work when there are many shades of meaning and many degrees of understanding embedded within qualitative data," he explained.

On the analysis side, it can be tempting to generalize findings beyond the sample set being studied, Ron-Levey warned.

Done well, however, qualitative data can be well worth the effort.

"In an era of big data, there is a continual search for human meaning behind the numbers," Ron-Levey said. "Understanding feelings, motivations and perceptions often leads to innovation and new strategies that more authentically appeal to customers."

Said Forrester's Lai, "qual and quant insights are the 'yin and yang' of customer understanding, because both are needed to tell a complete story about consumer behavior."

Katherine Noyes
