Edmunds.com deploys text mining tool for user forums

Edmunds.com Inc., an online service for vehicle information, unveiled its latest tool to mine the potentially invaluable data stored as unstructured content in its user forums, consumer ratings, and reviews archives.

Edmunds.com currently has more than 2.5 million messages and 100,000 car reviews, in which consumers provide personal reviews, lists of favorite features, and suggestions for improvement.

Edmunds.com, now in beta trials, will deploy a technology from Attensity Corp. called PowerDrill, which converts written language into relational data.

PowerDrill takes the unstructured data, namely sentences, and diagrams them, placing each part of speech (such as noun phrase, verb phrase, and prepositional phrase) into a separate field (actor, action, and object). A standard database can then use those fields to discover relationships and trends.
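To make the idea of sentences becoming relational data more concrete, here is a minimal Python sketch. It is not Attensity's implementation: the rows are hand-written stand-ins for diagrammed sentences, and the table name `statements` and the helper functions are hypothetical. It only shows how actor, action, and object fields could be stored and queried with an ordinary SQL database.

```python
import sqlite3

# Hypothetical, hand-written rows standing in for diagrammed sentences.
# A real system would derive these fields from a sentence parser.
ROWS = [
    ("cabin", "lets in", "road noise"),
    ("transmission", "hesitates in", "second gear"),
    ("cabin", "needs", "more storage"),
]

def build_db(rows):
    """Load (actor, action, object) rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE statements (actor TEXT, action TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO statements VALUES (?, ?, ?)", rows)
    return conn

def actions_for(conn, actor):
    """Return every (action, object) pair recorded for one actor."""
    cur = conn.execute(
        "SELECT action, object FROM statements "
        "WHERE actor = ? ORDER BY action",
        (actor,),
    )
    return cur.fetchall()

if __name__ == "__main__":
    conn = build_db(ROWS)
    print(actions_for(conn, "cabin"))
```

Once the text is in this shape, questions such as "what do owners say about the cabin?" become plain database queries rather than free-text searches.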

Although a number of other products, such as ClearForest Tags, Inxight SmartDiscovery, and IBM WebFountain, also address text mining for content intelligence, Laura Ramos, vice-president at Forrester Research, called Attensity’s diagramming capability unique.

“The use of diagramming rather than rules or examples is more accurate and specific,” said Ramos.

In a sentence such as, “the bolt on the under-carriage of the car is cracked due to heat,” other products that use linguistic or grammatical rules would assume the under-carriage was cracked because that word was closest to the verb, Ramos said. However, because Attensity diagrams the sentence, it understands that the bolt, not the under-carriage, is cracked, said Ramos.

Attensity integrates the relational data it creates from text with other pre-existing structured content and outputs the result in any format, including XML, said Craig Norris, Attensity CEO. “PowerDrill can diagram Moby Dick in five seconds,” he added.

Using PowerDrill, Edmunds.com plans to tabulate suggestions for improvement and rank favorite features.

In a test with the Honda Odyssey, a highly anticipated 2005 car model, information from drivers of previous model years was tabulated, allowing Edmunds.com to show that the most needed improvements were in road noise, transmission issues, and styling.
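The tabulation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rank_improvements` is hypothetical, and the sample mentions are invented for the example, not Edmunds.com data.

```python
from collections import Counter

def rank_improvements(mentions):
    """Count normalized improvement topics, most frequent first."""
    counts = Counter(m.strip().lower() for m in mentions)
    return counts.most_common()

if __name__ == "__main__":
    # Invented sample input: one topic per extracted suggestion.
    sample = [
        "road noise", "Road noise", "styling",
        "road noise ", "transmission", "transmission",
    ]
    for topic, count in rank_improvements(sample):
        print(f"{topic}: {count}")
```

Normalizing case and whitespace before counting keeps "Road noise" and "road noise" from being tallied as separate topics.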

Edmunds.com was able to analyze trend information from conversations on the forums, including shopping and dealer behavior, recurring issues, and concerns, which can also be used to predict future behavior.

Companies such as Edmunds.com as well as government agencies are suddenly waking up to the richness of unstructured data, according to one industry analyst.

“They may have been aware of its existence but didn’t think they could extract any value out of it,” said Nick Patience, a senior analyst at The 451 Group.
