Speech recognition is finding its voice

As workers become more mobile and companies rely more heavily on good, fast, always-available support to retain customers, services based on speech recognition are starting to come to the rescue. Emerging industry standards are helping to open the market and pave the way for broader adoption, according to industry experts.

Speech recognition makes it practical to do things with a phone that would be too complicated using the 12-key keypad. In some cases, callers won’t have to work their way through a hierarchy of options by pressing numbers or saying words. Systems still can’t understand everything a caller might say, but callers are no longer restricted to a fixed set of words. Speech recognition also can trigger transactions without a live operator. Combined with speech-to-text and text-to-speech technology, it can support even more emerging applications.

A speech-based interface to applications is a good candidate for companies to outsource to a carrier, says Mark Plakias, an analyst at Zelos Group. In most cases, voice access to an application such as ERP accounts for only a small fraction of the application’s use, he says.

“There’s no reason the enterprise should have to go out and buy a telephony platform to do this,” Plakias says.

On the other hand, Marcello Typrin, director of product marketing at speech software vendor Nuance Communications Inc., says companies can save on operating expenses by owning their own equipment, and more are doing so as they become confident in the technology.

Companies and carriers are using speech recognition because it’s getting better, according to analysts. More powerful processors and refined algorithms are at the core of the improvement.

Now, at the application development level, two new specifications that build on existing markup languages are helping companies and service providers get started.

VoiceXML (VXML) is an XML-based markup language that lets developers at corporations and service providers take advantage of work that already has been done to put applications and information on the Web. First released as Version 1.0 in 2000, it is now at Version 2.0. VXML has opened the voice-based market to new vendors, such as start-up VoiceGenie Technologies Inc., and has led established vendors to offer VXML interpreter software as an alternative to their proprietary platforms.
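
To give a sense of what such a dialog looks like, here is a minimal sketch of a VXML 2.0 document, checked for well-formedness with ordinary XML tooling in Python. The form, the field name, the prompt wording and the grammar file name are all invented for illustration and are not taken from any vendor's product.

```python
import xml.etree.ElementTree as ET

# Namespace used by VoiceXML 2.0 documents.
VXML_NS = "http://www.w3.org/2001/vxml"

# Hypothetical dialog: the form id, field name, prompt text and
# grammar file (order_number.grxml) are assumptions for this sketch.
DOCUMENT = """
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order_status">
    <field name="order_number">
      <prompt>Please say your order number.</prompt>
      <grammar src="order_number.grxml" type="application/srgs+xml"/>
      <filled>
        <prompt>Thank you. Checking that order now.</prompt>
      </filled>
    </field>
  </form>
</vxml>
"""


def describe_fields(document: str) -> dict:
    """Map each field name to the prompt the caller hears for it."""
    root = ET.fromstring(document.strip())
    fields = {}
    for field in root.iter(f"{{{VXML_NS}}}field"):
        # find() looks only at direct children, so the prompt nested
        # inside <filled> is not picked up here.
        prompt = field.find(f"{{{VXML_NS}}}prompt")
        text = prompt.text.strip() if prompt is not None and prompt.text else ""
        fields[field.get("name")] = text
    return fields


if __name__ == "__main__":
    for name, prompt in describe_fields(DOCUMENT).items():
        print(f"{name}: {prompt}")
```

Because the dialog is plain XML, the same parsers, servers and editing tools already used for Web content can handle it, which is the reuse the specification is aiming for.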

Meanwhile, the Speech Application Language Tags (SALT) standard, backed by Cisco Systems Inc., Intel Corp. and Microsoft Corp., also is coming on the scene. The specification adds speech tags to existing markup languages, including HTML and XML. Microsoft released the first beta of its SALT-based Speech Server in July, but it is marketing the platform directly to corporations and not to service providers.

Analysts and industry participants are optimistic that VXML and SALT, both of which have been submitted to the World Wide Web Consortium, won’t develop the type of rivalry that has stymied development in other areas. The two specifications are converging and might merge by the end of next year, according to Plakias.