The hype around big data reinforces the image of organizations bent on collecting gigantic amounts of data from every imaginable source in order to glean precious bits of business intelligence from the information goldmine.
In a recent article that calls big data “one of the worst industry terms ever invented,” the online technology publication Wired.com said that acquiring vast quantities of data does not guarantee a company will get any benefit from it.
For instance, a survey of CIOs by big data consulting firm NewVantage Partners found that only 28 per cent of enterprise organizations identified data volume as the primary driver of their big data projects. As many as 60 per cent of the respondents said it was the ability to ingest disparate data sources and make sense of the information they provide in real time that was the real driver behind big data interest.
Microsoft Research also found that a majority of “real-world analytic jobs process less than 100 GB of input.”
Rather than treating Hadoop as a giant “digital landfill” and dumping data into it, companies should avoid hoarding data, because hoarding only serves to “increase the noise” and makes it harder for the organization to determine the best course of action, according to the article.
Companies are better off first determining which questions they need to answer, developing a hypothesis about which data sources will answer those questions, and then using a “flexible data infrastructure” to collect the data needed.