Toronto start-up tackles language analytics

A Toronto-based start-up has launched a free Web-based application programming interface (API) that analyzes text to determine sentiment and key words. Repustate said it offers customers a Web service that is no different from the APIs users can find on Twitter or Facebook. The tool will cull data from desired social media sites, store it in a searchable archive, and then let IT professionals discover what is being said about their organization or employees on the Web.


Martin Ostrovsky, founder and CEO at Repustate, said his company has no aspirations to provide a centralized repository of all the social media and networking data on the Web. Instead, it will focus on giving developers, marketing firms and IT professionals a service that will extract and analyze content from the online sites they deem relevant.


“We actually don’t explicitly spider or poll any site, but rather rely on our users to grab data from their desired social media sources and push it into our search index,” said Ostrovsky. “So, as an example, if a user is interested in comments from Facebook about the World Cup, they must sign up for Facebook’s API, grab the data they want, and then push it into our API.”
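
The push model Ostrovsky describes can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `SearchIndex`, `push`, `search` and `fetch_comments` are stand-ins invented for illustration and are not Repustate's or Facebook's actual API. The point is only that the user fetches the data and hands it to the index, rather than the index crawling sites itself.

```python
from dataclasses import dataclass, field


@dataclass
class SearchIndex:
    """Hypothetical in-memory stand-in for a hosted search index."""

    documents: list = field(default_factory=list)

    def push(self, source: str, text: str) -> int:
        """Store one user-supplied item and return its position in the index."""
        self.documents.append({"source": source, "text": text})
        return len(self.documents) - 1

    def search(self, term: str) -> list:
        """Return every stored item whose text contains the term (case-insensitive)."""
        needle = term.lower()
        return [doc for doc in self.documents if needle in doc["text"].lower()]


def fetch_comments() -> list:
    """Placeholder for data the user has already pulled from a social media API."""
    return [
        "The World Cup final was incredible",
        "Traffic downtown is terrible today",
    ]


# The user, not the index, is responsible for fetching and pushing the data.
index = SearchIndex()
for comment in fetch_comments():
    index.push("facebook", comment)

matches = index.search("world cup")
```

In this sketch the index only ever contains what users chose to push, which is the property Ostrovsky ties to keeping the quality of the index high.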


By default, all data that is processed through the Web service is made public, but developers at large enterprises do have the option to store their data locally or license the Repustate software within their walls for a small fee.


For companies looking to keep tabs on what customers are saying about them online, Ostrovsky said, the biggest roadblock to getting a clear picture is separating the useful chatter from the irrelevant chatter.


“The amount of useful information being generated is dwarfed by the noise, but it is out there and it’s tough finding it,” he said. “By keeping the quality of our search index high, we ensure that only truly valuable information gets pushed into our search index. The second issue is how to act on the data.”


To address this question, Repustate aims to let organizations discover how users feel about their products and services. To identify positive, neutral or negative comments, Ostrovsky said, the tool uses natural language processing (NLP), a technique that aims to let machines dissect English text and extract sentiment, key phrases and named entities.


“Depending on sentence structure and various words used, we can detect the sentiment of a particular sentence or group of sentences,” he said. The tool can also extract key adjectives and phrases from any piece of text to analyze the frequency with which certain terms are used to describe your brand.
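
As a rough illustration of word-based sentiment detection, the Python sketch below scores a sentence against two small word lists and flips the polarity of a word that directly follows a negation. This is a generic lexicon approach chosen for illustration, not Repustate's method, which the article does not detail; the word lists and the `sentence_sentiment` function are assumptions.

```python
import re

# Tiny illustrative word lists (assumed); a real system would use a much
# larger lexicon or a trained model.
POSITIVE = {"great", "love", "fast", "excellent"}
NEGATIVE = {"bad", "hate", "slow", "broken"}
NEGATIONS = {"not", "never", "no"}


def sentence_sentiment(sentence: str) -> str:
    """Classify a sentence as positive, negative or neutral by counting words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    score = 0
    for i, word in enumerate(words):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (word in POSITIVE) - (word in NEGATIVE)
        # A directly preceding negation ("not great") flips the polarity.
        if polarity and i > 0 and words[i - 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

The negation check is a small nod to the role of sentence structure: "not great" counts against a product even though "great" alone would count for it.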


To show off the API, Repustate created a searchable database of TechCrunch articles and fed it through various Repustate API calls.


For another example of the tool in action, Ostrovsky pointed to Apple Inc.’s recent iPhone 4 launch. For a vendor looking to compete with Apple in the handset market, being aware of what users are saying about the device on social networks can help it avoid the features that draw negative sentiment when designing its own phones, he said.


Any enterprise IT shop considering an iPhone rollout could use the Repustate API to gain insight into potential security or usability drawbacks, Ostrovsky said.


“I’ll grab all negative sentiment about the iPhone, extract out adjectives and verbs and then analyze the results,” Ostrovsky said. “It might turn out that people are using the word ‘iAds’ a lot when being negative, meaning they don’t like Apple’s new advertising policy for the iPhone.”
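
The workflow in that quote can be sketched in a few lines of Python. The comments and their sentiment labels below are made-up sample data, and a simple stopword filter stands in for the adjective and verb extraction Ostrovsky mentions, since real part-of-speech tagging would need an NLP library. `negative_term_counts` is a hypothetical helper, not part of any Repustate API.

```python
import re
from collections import Counter

# Common words to ignore (assumed list); stands in for part-of-speech filtering.
STOPWORDS = {"the", "a", "an", "is", "are", "and", "on", "my", "i", "it", "of", "to"}

# Made-up sample data: (sentiment label, comment text).
comments = [
    ("negative", "The iAds are annoying and intrusive"),
    ("negative", "Annoying iAds on my phone"),
    ("positive", "The screen is gorgeous"),
    ("negative", "Reception is terrible"),
]


def negative_term_counts(labelled_comments) -> Counter:
    """Count how often each non-stopword appears in negative comments only."""
    counts = Counter()
    for label, text in labelled_comments:
        if label != "negative":
            continue
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts


counts = negative_term_counts(comments)
top_terms = counts.most_common(2)
```

With this sample data, "iads" and "annoying" each appear twice among the negative comments and rise to the top of the tally, which is the kind of signal the quote describes.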


Ostrovsky said IT managers tasked with an online reputation tracking project should consider Repustate for its open and transparent platform.


“You can try us out without committing many resources,” he said. “You can suggest features to us and we’ll build them for free so not only you, but our other users can benefit. I call this crowd-sourcing our API, which I don’t believe has ever been done.”


Repustate arrives on the scene as more and more companies express an interest in their online reputation and Web analytics.


Earlier this week, IBM Corp. announced its purchase of San Mateo, Calif.-based Coremetrics to help shore up its business analytics software and services portfolio.


The Coremetrics service is geared toward helping companies get e-mail pitches and personalized recommendations to the appropriate customers, use customer analysis to place ads on the most suitable networks, and better manage the purchasing of search terms.


“Coremetrics expands IBM’s business analytics capabilities,” said Craig Hayman, general manager of IBM’s application integration middleware business. Using “a cloud-based delivery model,” Coremetrics helps its customers “get real-time insight into their interactions with their consumers and prospects,” through e-mail and social media outlets, he said.


As part of the acquisition announcement, IBM also revealed some results of a survey it conducted of CEOs, including that 82 percent of respondents want to better understand customer needs.


– With files from Joab Jackson, IDG News Service (New York Bureau)
