HP tackles unstructured data

VIENNA — People typically associate Hewlett-Packard Co. with infrastructure and services. But the vendor is now using that infrastructure as a foundation for analyzing and exploiting Big Data.

“In the world of human information, things never match,” said Yves de Talhouet, senior vice-president and managing director of EMEA for Hewlett-Packard Co., during Discover 2011, being held here this week. Relational databases attempt to hammer the world flat by normalizing data, but that doesn’t account for shades of grey: they can’t understand or organize unstructured data.

Keywords and tags fail because they depend on manual processes and must cope with multiple definitions and diverse data types, from text and XML to audio and video. Meaning is dynamic and changes over time; information is defined by context.

“Why does this matter? Because 85 per cent of information inside a modern enterprise is still in this (unstructured) form,” said de Talhouet. “Keywords and metadata do not solve this problem. We need to automate the processing as well as retrieval of human information,” which, he added, is growing three times faster than structured information.

In a recent survey of senior business and technology executives conducted on behalf of HP, 48 per cent of respondents said they don’t have an effective information strategy in place, and only two per cent can deliver the right information at the right time to support enterprise outcomes.

Also, 34 per cent of respondents said that 40 per cent of information within the organization is unconnected, undiscovered and unused, while 35 per cent said they’re not effective at accessing enterprise information as needed.

The problem is Big Data — the huge volumes of data that need to be managed in real time. Only 15 per cent of an organization’s data lives in databases, while unstructured data accounts for the remaining 85 per cent. Consider this: There are 97,000 tweets every second, 12 million texts every minute and 294 billion e-mail messages every day. Organizations are dealing with extreme data — volume, velocity, variety and complexity.

HP announced its IDOL 10 platform, which is designed to handle structured and unstructured data, inside and outside of an enterprise. This combines a layer from Autonomy (which HP acquired in October) with a real-time analytics engine from Vertica (which HP acquired earlier this year), allowing companies to analyze what HP says is “100 per cent” of unstructured, semi-structured and structured data.

HP also announced HP Autonomy Appliances (powered by IDOL 10), which allow organizations to quickly use and extract metadata from all data sources.

The vendor is putting a heavy focus on gleaning insight from that unstructured data with its HP Social Intelligence Solution, which is designed to integrate and extract value from social media data. “This is the right time for us to introduce it into the marketplace,” said Srini Koushik, vice-president of strategic enterprise services with HP worldwide applications and business services.

“We have a delivery model that leverages on-shore and off-shore capabilities, but also leverages a very extensive partner ecosystem,” he said. “That’s the big difference between us and our competitors.”

HP’s strategy includes sentiment analysis that looks at trends in social media and unstructured data. This can help enhance a company’s understanding of customers, protect brand reputation and spur product development, said Koushik.

The market for sentiment analysis and managing unstructured data is big, in that everyone can benefit from this granular level of business analytics, said Michelle Warren, principal analyst at MW Research & Consulting. The question is who is willing to pay for it. “In Canada, this means the larger firms — financial institutions, universities, health care and medical research industries,” she said.

Business intelligence (BI) is a growing field, and over the past couple of years more vendors have tried to differentiate themselves by offering new and innovative ways to manage data. HP, like IBM, has benefited by acquiring some of the smaller firms, said Warren.

HP poses a significant threat in this market, due to the size of its sales and marketing channel, including reseller partners. But it has also been criticized for not being a pure BI vendor, with critics pointing to IBM and SAS as the industry leaders.

But in the end, all three vendors offer possibilities, said Warren, and it will be up to partners and customers to decide which solution is worthy of the significant investment of time, energy, IT resources and IT budget — not to mention human capital — required to implement and roll out the solution.

“I wouldn’t rule HP out on the basis that it is relatively new to the BI field,” said Warren. One of its key selling features is its considerable consulting and implementation practice, which can integrate all of the required technical components, including hardware, software, and managed services. “Yes, SAS and IBM can do the same, but it is a key differentiator for HP.”
