XML storage has a niche

Of all the XML-based technologies available to us, the one that is arguably the most revolutionary may also be the most neglected: the native XML database.

Let’s face it: relational databases are firmly entrenched in the enterprise. They have been around for years, there are volumes of published design theory and, most importantly, they are efficient and reliable.

I don’t view XML and relational storage systems as competitors; each has its strengths, and each is better suited to some purposes than the other. There is room for coexistence in a well-planned environment.

But what exactly is an XML database? There’s no definitive answer. Some may say it is any database that can import or export XML. Others may think a relational database with an XML interface qualifies. For the purposes of this discussion, we’ll consider an XML database to be one that has none of the traditional table/row/column design elements and instead stores XML data in collections of documents. A freely available implementation is Apache Xindice (pronounced “zeen-dee-chay”), found at http://xml.apache.org/xindice/.

Let’s suppose Company X wants to build an electronic filing system. All of its corporate documents are to be held in a single repository that can be queried for keywords. Which is the more appealing option: modeling each document type in a complex series of relational tables, or just creating a collection that can hold any number of dissimilar documents without the need for field mappings? Personally, I prefer the latter.

Each document is going to have an author and a creation date, for example, and these allow you to query across documents that may not have any other commonalities. If Company X used an XML database for its filing system, it would be simple for the president to see every piece of information saved during the last week.
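To make the idea concrete, here is a minimal sketch in Python using only the standard library. It is not Xindice code: the collection is just an in-memory list of XML strings standing in for a database collection, and the element names (memo, invoice, policy, author, created), the sample data and the function names are all assumptions made up for illustration. A real XML database would typically expose the same kind of query through a language such as XPath.

```python
import xml.etree.ElementTree as ET
from datetime import date, timedelta

# Hypothetical stand-in for a database collection: three dissimilar
# documents that share only <author> and <created>.
# All element names and values are illustrative assumptions.
COLLECTION = [
    "<memo><author>Lee</author><created>2003-03-10</created>"
    "<body>Budget figures for review</body></memo>",
    "<invoice><author>Patel</author><created>2003-03-12</created>"
    "<total currency='CAD'>129.50</total></invoice>",
    "<policy><author>Lee</author><created>2002-11-01</created>"
    "<section><title>Travel</title></section></policy>",
]


def saved_since(collection, cutoff):
    """Return (root tag, author) for each document created on or after cutoff."""
    hits = []
    for text in collection:
        root = ET.fromstring(text)
        # Only the two shared elements are read; the rest of each
        # document can have any shape at all.
        created = date.fromisoformat(root.findtext("created"))
        if created >= cutoff:
            hits.append((root.tag, root.findtext("author")))
    return hits


def last_week(collection, today):
    """Everything saved in the seven days up to and including today."""
    return saved_since(collection, today - timedelta(days=7))


recent = last_week(COLLECTION, date(2003, 3, 14))
print(recent)
```

With the sample data above, the memo and the invoice are returned and the older policy document is not, even though the three documents have nothing in common beyond an author and a creation date.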

Now that Company X has its internal information easily accessible, it wants a way to collect customer data. An application to build customer surveys as Web forms is required. The marketing department is responsible for these surveys, and the IT department doesn’t want to set up a database table and field mapping for every new survey marketing comes up with. The answer, again, is an XML database.
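A rough sketch of how the survey case might look, again in plain Python with the standard library rather than any particular XML database. The response/answer document layout, the survey names and the helper functions are hypothetical, and the list called collection merely stands in for a database collection. The point is that a new survey with new questions needs no table or field mapping; each submitted form simply becomes another document.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def build_response(survey_id, answers):
    """Serialize one submitted Web form as an XML document.

    `answers` maps question names to values. No schema is declared up
    front, so every survey can ask different questions.
    (The <response>/<answer> layout is an illustrative assumption.)
    """
    root = ET.Element("response", {"survey": survey_id})
    for question, value in answers.items():
        ET.SubElement(root, "answer", {"question": question}).text = str(value)
    return ET.tostring(root, encoding="unicode")


def tally(collection, survey_id, question):
    """Count the answers given to one question of one survey."""
    counts = Counter()
    for text in collection:
        root = ET.fromstring(text)
        if root.get("survey") != survey_id:
            continue
        for answer in root.findall("answer"):
            if answer.get("question") == question:
                counts[answer.text] += 1
    return counts


# Two different surveys stored side by side in one collection.
collection = [
    build_response("spring-catalogue", {"format": "hardcover", "rating": 4}),
    build_response("spring-catalogue", {"format": "paperback", "rating": 5}),
    build_response("store-visit", {"city": "Toronto", "would_return": "yes"}),
    build_response("spring-catalogue", {"format": "hardcover"}),
]

format_counts = tally(collection, "spring-catalogue", "format")
print(format_counts)
```

Here the tally for the "format" question counts two hardcover answers and one paperback answer, and the unrelated store-visit response sits in the same collection without getting in the way.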

The question of when to use a relational database and when to use an XML database can be answered by considering the need for structure. Relational databases enforce the integrity of the data they hold and of the relationships between data. XML databases are the clear choice when you don’t know up front what structure the incoming data is going to have, or when there are too many different structures to feasibly build tables for them all.

I’d love to hear from anyone with an XML database success story.

Cooney works as a programmer/analyst for a major Canadian book publisher. He can be reached at robert_cooney@hotmail.com.