IT World Canada

When small data is better than big data

The current focus on big data tends to fuel the idea that very large data sets automatically mean value. In reality they do not. What enormous data sets do require is more resources: more hardware and expensive software licences.

Large data sets also require more labour to manage and analyze. And usually the extra expense is not worth it, according to analytics consultant and writer Meta S. Brown.

In a recent blog post, Brown observed that prospective software buyers often get caught up in the idea that they are dealing with large data sets and therefore need software that can handle the load. Social media analytics is one area where this is often the case.

Sentiment analysis is relatively new, she said, and everyone wants to know what people are saying about their brand on social networks. As a result, many service providers collect tweets and social media posts, classify their sentiment as positive, negative or neutral, and give a summary to their clients.

But Brown said the downside is that sentiment analysis is not precise, so the figures it produces will always be rough estimates.

Another weakness, she said, is that automated tools that select relevant mentions are “less than perfect.” Clients might receive a pie chart built from a sample of a few hundred cases, and its results would not differ significantly from an analysis of millions of mentions.
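
To make the sampling point concrete, here is a minimal sketch in Python of the standard margin-of-error formula for a sample proportion. It is not taken from Brown's post: the function name `margin_of_error`, the sample sizes and the 95 per cent confidence level are illustrative assumptions, and the formula assumes a simple random sample of mentions.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a sample proportion.

    Uses the normal approximation; z=1.96 corresponds to roughly 95%
    confidence, and proportion=0.5 is the worst case (widest margin).
    Assumes a simple random sample, which is an illustrative assumption.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Hypothetical sample sizes, chosen only to show how the margin shrinks.
for n in (400, 10_000, 1_000_000):
    print(f"n={n:>9,}: +/- {margin_of_error(n) * 100:.2f} percentage points")
```

Under these assumptions, a sample of 400 mentions already puts each slice of the pie chart within about five percentage points. Going to a million mentions narrows that to roughly a tenth of a point, a gain that matters little when the sentiment labels themselves are only rough estimates.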

In this case, Brown said, using more data just means “making the job harder than it needs to be” and more expensive.

There are occasions when big data is really needed and times when it is not.

For instance, when marketers preparing personalized product recommendations need to decipher the behaviour of many people as individuals, big data is important.

If the goal is to summarize the behaviour of groups of people or things, Brown said, smaller data sets are enough. With smaller samples it is easier to correct data quality problems, and the analysis itself is easier, faster and cheaper.


 
