Tagging: It’s no longer fun and easy


Most people think that tagging on the Web is pretty easy and fun. Give ‘em a blog or a Web page and a field named “tags,” and they’ll start stuffing in text with wild abandon in the hopes that their content will be easily found by people who are desperately searching for information and opinion on feline hairball cures or cycling in the Ozarks or whatever their particular hobby is.

Alas, all these folks are doing is polluting the Web.

Tags arose out of a need for a way to classify Web page content and blog entries that the big search engines, such as Google, couldn’t find or ignored. Tagging also appealed to people because it was a democratic technique that was fast, easy and had a perceived payoff. If that payoff ever existed, it was back when the blogosphere was smaller and tagging hadn’t gone mainstream. Today, I doubt there is much of a payoff left.

The trouble is that rot has set in: tagging has developed a few significant problems that are making it progressively less valuable. This is not to say tagging is, per se, a bad thing, merely that its popularity and the lack of standards ensure that its usefulness will continue to degrade, and that tags will become a bigger source of content “noise” with every passing day.

The first problem with tagging is semantic vagueness. For example, does the tag “china” apply to the country or crockery? While you might hope that the distinction between the two would be evident from examination of related data, such as other tags used for the same item, specific words used in the item or in the rest of the site hosting the item, the effort required to resolve the context wipes out the value of tagging in the first place.
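
To get a feel for what resolving that context involves, here is a minimal Python sketch that guesses which “china” is meant from the other tags on the same item. The cue-word lists and the name guess_sense are made up for illustration; a real system would need far larger vocabularies and would still guess wrong some of the time.

```python
# Hypothetical cue words for each sense of the ambiguous tag "china".
# These lists are assumptions for illustration, not a real vocabulary.
SENSE_CUES = {
    "country": {"beijing", "asia", "travel", "trade"},
    "crockery": {"porcelain", "teacup", "dinnerware", "antiques"},
}


def guess_sense(other_tags):
    """Pick the sense whose cue words overlap most with the item's other tags."""
    tags = {t.casefold() for t in other_tags}
    scores = {sense: len(cues & tags) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    top = scores[best]
    # No overlap at all, or a tie between senses: the tag stays ambiguous.
    if top == 0 or list(scores.values()).count(top) > 1:
        return None
    return best


print(guess_sense(["china", "porcelain", "teacup"]))
print(guess_sense(["china"]))
```

Even this toy needs a hand-built word list for every ambiguous tag, and it gives up whenever the other tags offer no hint, which is exactly the effort the paragraph above describes.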

A second problem is that the format of tags isn’t standardized. How white space is handled, which characters are legal, which characters have special meanings and what those meanings are: all of it is left undefined.
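
With no standard to follow, anyone who processes tags has to invent their own rules. The Python sketch below shows one possible set, and every choice in it (fold case, drop punctuation, join words with a hyphen) is an assumption of mine rather than any agreed convention; the function name normalize_tag is likewise hypothetical.

```python
import re
import unicodedata


def normalize_tag(raw: str) -> str:
    """Reduce a free-form tag to one canonical spelling (made-up rules)."""
    # Fold Unicode compatibility forms and letter case.
    text = unicodedata.normalize("NFKC", raw).casefold()
    # Drop everything except letters, digits, underscores, white space and hyphens.
    text = re.sub(r"[^\w\s\-]", "", text)
    # Treat any run of white space, underscores or hyphens as one separator.
    text = re.sub(r"[\s_\-]+", "-", text)
    # Remove separators left dangling at either end.
    return text.strip("-")


for raw in ["Feline Hairball Cures", "feline_hairball_cures", "  Ozarks! "]:
    print(repr(raw), "->", normalize_tag(raw))
```

Another site could just as reasonably keep the spaces, preserve case or allow punctuation, so two sites applying their own sensible rules to the same tag may still not agree on its spelling.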

The third and perhaps biggest problem is the overuse of tagging. How often have you seen a blog item with a list of tags almost as long as the item itself? This is a direct result of the optimism of tag authors: they want to cover all of the bases so their content can be easily found.

This last problem underlines the messiness of tagging and why the noise generated by tags is growing so rapidly. Any index of tags from a given set of Web or blog pages is gigantic, and each tagged item has scores of closely related tag variants with little or no syntactic distinction. In other words, a big mush of text.
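
A small Python sketch shows how quickly those variants pile up. It groups tags that collapse to the same string once case, spacing and punctuation are ignored; the collapsing rule and the names variant_key and group_variants are assumptions for illustration, and the rule would miss plurals, misspellings and synonyms entirely.

```python
from collections import defaultdict


def variant_key(tag: str) -> str:
    """Keep only letters and digits, case-folded, so near-identical tags collide."""
    return "".join(ch for ch in tag.casefold() if ch.isalnum())


def group_variants(tags):
    """Return the groups of tags that share a key and so are likely variants."""
    groups = defaultdict(set)
    for tag in tags:
        groups[variant_key(tag)].add(tag)
    # Only keys with more than one distinct spelling are interesting.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


tags = ["hairball", "Hairball", "hair-ball", "hair ball", "cycling", "ozarks"]
print(group_variants(tags))
```

Four of the six tags in this tiny list are the same word spelled four ways, and that is before anyone has typed “hairballs.”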

The result is that automated systems for finding, indexing and searching tags across multiple sites, such as Del.icio.us and Technorati, will continue to become less valuable because they have to deal with ever greater levels of noise. Even so, tagging will survive, but it will have to evolve to retain relevance. I know that those of you who use it for your blogs and Web sites will probably not give up on tagging too soon, but mark my words: in the near future you will either have stopped bothering with tagging or have moved on to the next generation of tagging, which will be more complex (probably based on XML) and demand more effort to use. Tagging will no longer be fun and easy.

Gibbs is a columnist for Network World (U.S.). Contact him at backspin@gibbs.com.


