Using all you

It all started out simply enough, or so you thought. Your company needed an Internet presence, so you slapped up a Web site. But after a while it just seemed to sit there, so you gradually added some graphics, maybe a few way-cool animations, some streaming video perhaps, or some thumbnails that expanded to higher-resolution photos when clicked.

Congratulations: you have now entered the frightening realm of the “accidental publisher.”

And that new role is complicated by the range of data types you now employ.

In fact, the majority of corporate Web sites started out as unknown quantities, with functionality being added a little at a time – until one day, companies find they are suffering from the “boiling frog” dilemma, said Harley Manning, research director with Forrester Research in Cambridge, Mass.

“In theory if you put a frog in a pot of boiling water, it will immediately leap out. If you however put it in a pot of water that’s kind of cool and turn up a very gradual heat, it will go up slowly enough that the frog will boil without ever realizing that it’s in trouble.”

Firms can find themselves in this situation, and they have to recognize they are now publishers, with a different set of problems than the ones that they had to deal with in the past, he said. “If it quacks like a duck it’s a duck. And if you have a Web site with thousands of pages, you’re a publisher.”

Rich media, or content that’s not just static text but contains data such as animations, streaming video, audio files and e-commerce transaction capability, is a very hot property right now, especially in the area of banner advertising on the Web, said Scott Kliger, CTO of MatchLogic Inc. in Waltham, Mass.

“Advertisers are looking for solutions that can give them better results than they can get with animated GIF ads. And rich media ads provide that.”

Interactive ads like the ones his company provides allow users to do everything from playing a game to filling out a form, and they don’t take surfers away from the original site. The banner ad windows start small but can be expanded dynamically, like a window shade, he said.

Lee Barstow, vice-president of business development with WWF New Media in Stamford, Conn., is in the process of testing this type of banner advertising to help drive traffic to his site as well as increase merchandising capabilities.

“We wanted to see, without pushing the bandwidth scenario too far, what kind of interactivity these kinds of banners would have,” he said.

“So the closer we can get to embedding video…without it taking too long to download, the more interesting we can make our advertising.”

However, with about 6,000 videos available for people to pull up on demand, Barstow said the company is also in the process of upgrading its content management system, because “to date we’ve kind of grown this monster, now we are really trying to get our hands around it.”

In fact, managing content, including being able to store and retrieve it effectively, is something many companies are looking into. Laura Haas, research staff member and manager of multimedia information systems with IBM Corp.’s Almaden Research Center in San Jose, Calif., is working on a research project that uses a “wrapper” to stand between the actual data source and the database engine.

“The wrapper’s job is to represent the data as if it were relational data – to model it for the database engine. The main thrust is on the query processing, and being able to optimize queries as database engines do today, even though the data is not stored locally and we don’t really know any details about how that source works.”

The wrapper, written in C++, looks just like application source and helps give the data a common view, she said.

“With the rich media, we get data that is either unstructured, or semi-structured, or the structure is there but you need very special techniques to pull it out. So we have special image-processing algorithms [and] special text search engines.”
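The wrapper idea Haas describes can be illustrated with a minimal sketch. This is not IBM's code: the research wrapper is written in C++, and the names used here (MediaWrapper, scan, select) and the three-column layout are hypothetical, chosen only to show a non-relational source being presented to a query layer as if it were rows in a table.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Sequence, Tuple

Row = Dict[str, object]


class MediaWrapper:
    """Hypothetical wrapper that presents a media store as relational rows."""

    # The "table" the query layer sees; the column names are assumptions.
    columns = ("id", "kind", "size")

    def __init__(self, source: Iterable[Tuple[str, bytes]]):
        # `source` stands in for any external store. Here it is simply
        # (file name, payload) pairs; file names are assumed to have an extension.
        self._source = list(source)

    def scan(self) -> Iterator[Row]:
        """Yield one row per media item, hiding how the source stores it."""
        for name, payload in self._source:
            stem, _, ext = name.rpartition(".")
            yield {"id": stem, "kind": ext, "size": len(payload)}


def select(
    wrapper: MediaWrapper,
    columns: Sequence[str],
    where: Callable[[Row], bool] = lambda row: True,
) -> List[tuple]:
    """A toy query layer: filter the wrapper's rows, then project columns."""
    return [
        tuple(row[c] for c in columns)
        for row in wrapper.scan()
        if where(row)
    ]


store = [("intro.mpg", b"x" * 10), ("logo.gif", b"xx")]
videos = select(MediaWrapper(store), ["id", "size"], lambda r: r["kind"] == "mpg")
```

The query layer never touches file names or payloads directly; it only sees rows and columns, which is the “common view” the wrapper is meant to provide. A real engine would also try to optimize the query, which this sketch does not attempt.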

Another technology, being offered by Newton, Mass.-based Dragon Systems Inc., applies speech recognition technology to “any sort of media content that has speech in it, at the time when you are doing any other encoding,” according to David Wald, the company’s senior technology advisor.

The technology, called AudioMining, does not provide an exact transcript, but can index the media. This proves useful in environments like call centres, he said, because no prior voice-training is required.

“You have a list of words that were found…and pointers to where in the recording they occur. So that then gives you the index of one file – one recording of some sort. The next step is putting this in a database.”
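The index Wald describes, a list of recognized words with pointers into the recording, followed by a database step, might look something like the sketch below. It does not use any Dragon Systems interface; the function names, the word_hits table, and the use of SQLite are all assumptions made for illustration, and the recognizer output is represented as plain (word, offset in seconds) pairs.

```python
import sqlite3
from typing import Dict, Iterable, List, Tuple


def build_index(hits: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Group (word, offset_seconds) hits into a per-recording index."""
    index: Dict[str, List[float]] = {}
    for word, offset in hits:
        index.setdefault(word.lower(), []).append(offset)
    return index


def store_index(conn: sqlite3.Connection, recording: str,
                index: Dict[str, List[float]]) -> None:
    """Write one recording's index into a shared table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS word_hits "
        "(recording TEXT, word TEXT, offset_s REAL)"
    )
    conn.executemany(
        "INSERT INTO word_hits VALUES (?, ?, ?)",
        [(recording, word, off) for word, offs in index.items() for off in offs],
    )
    conn.commit()


def find(conn: sqlite3.Connection, word: str) -> List[Tuple[str, float]]:
    """Return (recording, offset_seconds) for every place the word was heard."""
    cur = conn.execute(
        "SELECT recording, offset_s FROM word_hits "
        "WHERE word = ? ORDER BY recording, offset_s",
        (word.lower(),),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
hits = [("refund", 12.5), ("invoice", 30.2), ("Refund", 48.0)]
store_index(conn, "call-001", build_index(hits))
refund_hits = find(conn, "refund")
```

Once several recordings share the same table, a single lookup returns every recording and time offset where a word was spoken, which is what makes the index useful without an exact transcript.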