Taming data complexity

While much of the IT world focuses on building computers that are faster, smaller, cheaper and brainier, CEO Peter Lucas and his colleagues at Maya Design Inc. are obsessed with liberating the reams of data that computers contain, regardless of the format in which the data is stored.

To Lucas, computers are little more than “transducers” – necessary but “uninteresting prosthetic devices” for viewing data. “We can’t see data, so we build computers, the same way we use goggles to see infrared,” he says.

What would be much more valuable, Lucas believes, is a computing architecture for sharing data now stranded in relational databases, which he calls “information islands.”

This is also the goal of the Semantic Web, which involves taking a relational database and “webbing it,” according to Web inventor Tim Berners-Lee. Where Maya’s technology differs, Lucas says, is in “taking the much more radical step of freeing the data from any particular Web page or any particular machine.”

Instead of describing data in a standard way or with metadata, as the Semantic Web does, Maya’s technology wraps the data in “containers,” which reside in repositories in a peer-to-peer-based “information space” where people can meet and collaborate.

Pittsburgh-based Maya, a spin-off of Carnegie Mellon University, has come up with a container it calls a “u-form” that Lucas says makes it easy to transfer and manipulate data across different computer systems and applications. Higher-level semantics can be layered on top of the u-forms.

What guides the transfer of data from place to place is a set of “shepherds,” or rules-based software agents developed by the data owners. For corporate applications, Lucas notes that u-forms could be encrypted and shepherded only to paying customers.
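To make the container-and-shepherd idea concrete, here is a minimal Python sketch. It is not Maya’s implementation or API. The names UForm, shepherd and paying_customers_only, the “restricted” and “paying” fields, and the sample recipients are all hypothetical, chosen only to illustrate a container with a unique identifier and a rule that decides where it may travel.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class UForm:
    """Hypothetical container: named attributes plus a unique identifier."""
    attributes: dict
    uid: str = field(default_factory=lambda: str(uuid.uuid4()))


def paying_customers_only(uform, recipient):
    """Hypothetical shepherd rule: restricted u-forms go only to paying customers."""
    if uform.attributes.get("restricted", False):
        return recipient.get("paying", False)
    return True


def shepherd(uform, recipients, rules):
    """Return the names of recipients that every rule allows to receive the u-form."""
    return [r["name"] for r in recipients if all(rule(uform, r) for rule in rules)]


# Illustrative data only; none of these names come from Maya.
report = UForm({"title": "Warehouse inventory", "restricted": True})
recipients = [
    {"name": "acme", "paying": True},
    {"name": "guest", "paying": False},
]
allowed = shepherd(report, recipients, [paying_customers_only])
print(allowed)
```

In this sketch the rules live outside the container, so a data owner could change who receives a u-form without touching the data itself.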

The same data could be viewed in different ways by different users. For example, a logistics manager could view on his PC a geographic map of warehouses and their contents in a specific region. Meanwhile, an inventory manager could draw on the same data and display on his handheld device a bar-chart representation of goods available for shipment from those warehouses.

In this example, multiple distributed views of the data could be linked in real time, permitting the data itself to become a medium for collaborative work. This is comparable to two users running Excel on the same data set, where every time one of them changes a number, the other’s display is instantly updated.
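The linked-views behavior resembles the familiar observer pattern, which can be sketched in a few lines of Python. This is an assumption about how such linking might work, not a description of Maya’s software. The class names, the warehouse names and the one-mark-per-100-units bar scale are invented for illustration.

```python
class SharedInventory:
    """Single copy of the data; notifies every attached view on each change."""

    def __init__(self):
        self._units = {}
        self._views = []

    def attach(self, view):
        self._views.append(view)
        view.refresh(dict(self._units))  # bring a new view up to date

    def update(self, warehouse, units):
        self._units[warehouse] = units
        for view in self._views:
            view.refresh(dict(self._units))  # each view gets its own copy


class BarChartView:
    """Text bar chart: one '#' per 100 units (hypothetical scale)."""

    def __init__(self):
        self.rendered = []

    def refresh(self, units):
        self.rendered = [
            f"{name}: {'#' * (count // 100)}" for name, count in sorted(units.items())
        ]


class RegionTotalView:
    """A second view of the same data: the regional total."""

    def __init__(self):
        self.total = 0

    def refresh(self, units):
        self.total = sum(units.values())


inventory = SharedInventory()
chart, total = BarChartView(), RegionTotalView()
inventory.attach(chart)
inventory.attach(total)
inventory.update("Pittsburgh", 300)
inventory.update("Erie", 500)
print(chart.rendered, total.total)
```

Both views change as soon as the shared data changes, which is the property the Excel comparison describes.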

Maya Design’s Maya Viz software arm has technology it calls CoMotion, a set of tools for building different views of data that’s stored in u-forms. The shepherds tell the u-forms where they can and can’t go, based on the metadata, or data about the data, that’s contained in the unique identifier portion of the u-form. Individual applications on a user’s machine (built using CoMotion’s visualization tools) dictate how data will be displayed.

The U.S. Transportation Command, or Transcom, at Scott Air Force Base in Illinois, is an early beta tester of CoMotion. Transcom is using the software to create different views of the vast amounts of data it must manipulate.

“Since 9/11, we’ve moved 700,000 people and over 2 million short tons of cargo. We have seven requirements databases that we pull from,” explains Lt. Col. Cody Smith, director of operations. Each of those databases uses a different data schema and format. Using Maya’s technology, Transcom is able to display that data differently to its various customers.

“If we’re dealing with ships, for example, we need to be displaying metric feet. Others need to look at tons or short tons of cargo,” Smith says.
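Showing one stored quantity in whatever unit a given customer needs can be sketched as follows. This is an illustration only, not Transcom’s or Maya’s code. It covers mass units alone, and the function names and the choice to store values in kilograms are assumptions. The conversion factor is the standard definition of a short ton (2,000 pounds, or 907.18474 kilograms).

```python
# Kilograms per unit; a short ton is 2,000 lb = 907.18474 kg.
KG_PER_UNIT = {
    "kilograms": 1.0,
    "metric tons": 1000.0,
    "short tons": 907.18474,
}


def to_kilograms(amount, unit):
    """Normalize a quantity to kilograms for storage."""
    return amount * KG_PER_UNIT[unit]


def display(mass_kg, unit):
    """Format a stored mass in the unit a particular viewer asks for."""
    return f"{mass_kg / KG_PER_UNIT[unit]:,.0f} {unit}"


# The 2 million short tons figure is the one quoted above.
cargo_kg = to_kilograms(2_000_000, "short tons")
for unit in ("short tons", "metric tons"):
    print(display(cargo_kg, unit))
```

Storing the value once and converting only at display time means each customer’s view can differ without creating a second copy of the data.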

Common Understanding

“U-forms and the Semantic Web are aimed at solving different kinds of problems,” says Jason Bloomberg, a senior analyst at ZapThink LLC in Waltham, Mass. “The Semantic Web is aimed more at business-to-business communications, where Company A and Company B need a common understanding of the terminology. A purchase order, for example, has to mean the same thing to both of them.

“The Semantic Web is about getting computers to understand content. U-forms are giving human beings more power in working with systems and content,” he adds.

“The technical breakthrough we’ve made is separating the information from the visualization and manipulation,” says Maya’s Lucas. He foresees a world of peer-to-peer “civic computing” in which virtually all public information is stored in u-forms in a public “information commons” that’s easily usable by anyone, anytime. Maya refers to this vision as the Civium (Latin for “of the people”) Project.

“Instead of using peer-to-peer to steal music, let’s liberate all accumulated public-domain data and create a vast information space to make it freely available,” Lucas says.

Pittsburgh Green Map (www.greenmap.org), an interactive service for locating environmental, recreational and other “green” assets in western Pennsylvania, serves as a prototype of Lucas’ vision. Developed in conjunction with 3 Rivers Connect, a Pittsburgh-based nonprofit environmental group, the service encompasses data from geographic information systems and other types of data from various public databases using different schemas and formats. This data has been converted to u-forms and is virtually located in an “information space,” which is accessed via a “geobrowser” application developed by Maya.

Lucas says this technology is about as mature as the Web was in 1991. “It seems as good an assumption as any that it will follow a similar curve and take about as long,” he says. “That would mean that it will be actually useful to large numbers of people within a few years and will be on the cover of Time in about five years.”