When small data is better than big data

The current focus on big data tends to fuel the idea that very large data sets automatically mean value. In reality they don't. Enormous data sets require more resources, more hardware and expensive software licences.

Large data sets also require more labour to manage and analyze. And usually, the extra expense is not worth it, according to analytics consultant and writer Meta S. Brown.

In a recent blog post, Brown observed that prospective software buyers are often caught up in the idea that they are dealing with large data sets and therefore require software that can handle the load. Social media analytics is one area where this is often the case.

Sentiment analysis is relatively new, she said, and everyone wants to know what people are saying about their brand on social networks. As a result, many service providers collect tweets and social media posts, classify them by sentiment (positive, negative or neutral) and give a summary to their clients.

But Brown said the downside is that sentiment analysis is not precise, so the figures it produces will always be rough estimates.

Another weakness, she said, is that automated tools that select relevant mentions are “less than perfect.” A pie chart built from a sample of a few hundred cases would not differ significantly from one based on an analysis of millions of mentions.

In this case, Brown said, using more data just means “making the job harder than it needs to be” and more expensive.
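The sampling point can be illustrated with a rough calculation. The sketch below is not from Brown's post. It assumes a simple random sample of mentions and uses the standard normal-approximation margin of error for a proportion at about 95 per cent confidence. The function name and the sample sizes are illustrative choices only.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for an observed proportion p
    in a simple random sample of size n (z=1.96 is roughly 95% confidence)."""
    return z * math.sqrt(p * (1 - p) / n)

# Illustrative sample sizes, not figures from the article.
# p=0.5 is the worst case, giving the widest margin.
for n in (400, 10_000, 1_000_000):
    moe = margin_of_error(0.5, n)
    print(f"n={n:>9,}: +/- {moe * 100:.2f} percentage points")
```

Under these assumptions, a sample of 400 mentions puts each slice of the pie within about five percentage points. Going to a million mentions narrows the sampling error further, but it does nothing about the imprecision of the sentiment classification itself, which is the limit Brown describes.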

There are occasions when big data is really needed and times when it is not.

For instance, when marketers preparing personalized product recommendations need to decipher the behaviour of many people as individuals, big data is important.

If the goal is to summarize the behaviour of groups of people or things, Brown said, smaller data sets are enough. Smaller samples make data quality problems easier to correct, and they are easier, faster and cheaper to analyze.



Nestor E. Arellano
Toronto-based journalist specializing in technology and business news. Blogs and tweets on the latest tech trends and gadgets.
