Getting a grip on XQuery

XML seems unstoppable. On the Web, this general-purpose document display language is rapidly replacing HTML. In the office, Microsoft Corp. and OpenOffice have both moved to XML document formats. Even the big database vendors are turning their databases into relational/XML hybrids. That latter effort should soon get a boost, as XQuery, the XML counterpart to the conventional SQL database language, is expected to become an officially recommended standard of the World Wide Web Consortium.

Jonathan Robie, XQuery technology lead at DataDirect Technologies, a subsidiary of Progress Software Corp., is one of the prime movers behind the XQuery standard. He spoke with Computerworld (U.S.) reporter Eric Lai about XQuery’s importance and its effect on business IT users.

Q. How would you explain the importance of XQuery to an IT manager or CIO?

A. If you’ve got objects, you want an object-oriented programming language. If you’ve got relational tables, you want SQL. If you’ve got XML, you want an XML programming language. You could mix and match and, for instance, use Java to program a relational database. But you would write a lot more code, it’s a lot more complicated, and you end up fighting the data you’re working with. For data integration, XQuery also has a unique advantage. Usually, you have to learn a new [application programming interface] for every different data source you work with. Then you have to do a lot of programming to integrate the data and bring it together. XQuery gets rid of all that. It lets you query everything as if it were XML and create an XML result. You don’t have all of those APIs cluttering everything up.

Q. What are some real-world implications of that? Could you enable more powerful searching of the Web, more powerful than Google today, for instance?

A. There is a full-text search standard for XQuery, but most XQuery implementations now are not about full-text search. I think of XQuery more in terms of when you have a bunch of disparate data sources in your enterprise — some databases, some configuration files, some programs — and you want to integrate all that data for Web services or Web sites.

Q. So XQuery might let you eventually sidestep some of the current data integration, data cleansing or master data management products out there today?

A. Yes, absolutely. People are taking procedural and pipeline approaches to getting data together. That involves a bunch of middleware or Web services interfaces. A simple declarative query is what you’d really like. You can bypass many steps, save on massive amounts of code and avoid potential mistakes.

Q. Is XQuery suitable for real-time data integration?

A. That absolutely depends on your implementation, but yes. To solve such a problem today, you might be using a hodgepodge of things, like JDBC, SQL and some XML API like DOM, with all of this procedural code in between. Because you’re talking to multiple systems and using procedural code, you can’t optimize this. But XQuery is declarative, meaning a good system can take that stuff and optimize it against different data sources in ways you wouldn’t think to. One company that uses our [DataDirect] XQuery product has salespeople go out to sell an HR outsourcing service. They have an Oracle database with information on their customers, along with other data sources. The sales rep tells the customer they can do their HR cheaper than their in-house guys, so he performs this query, gets the data from the different data sources and then creates a PDF document with the information, which is basically their sales offer.

Q. Let’s say you’ve already got an electronic data interchange system. Wouldn’t you rather just keep data in that format and forgo XML and XQuery?

A. If your system is using only one kind of data format and doesn’t need XML, there’s not a lot of advantage to bringing in XQuery. The best example is a relational table. If you’re querying it in order to create tables, then there’s no need for XQuery. But if you’re querying it to create XML, XQuery will save you a lot of effort.
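
For readers who want to see the kind of effort being described, here is a minimal sketch of the procedural route: run a SQL query, then build the XML result by hand with an XML API. It is an illustration only, not code from DataDirect or any XQuery product. The table, column and function names (customers, make_demo_db, customers_to_xml) and the sample rows are assumptions made up for this example, and Python's built-in sqlite3 and ElementTree modules stand in for the JDBC and DOM mentioned in the interview.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_demo_db():
    """Create an in-memory table with made-up rows (purely illustrative)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
        [(1, "Acme", "Boston"), (2, "Globex", "Austin")],
    )
    return conn


def customers_to_xml(conn):
    """Query the table with SQL, then assemble the XML result by hand."""
    root = ET.Element("customers")
    rows = conn.execute("SELECT id, name, city FROM customers ORDER BY id")
    for cust_id, name, city in rows:
        # Each row is turned into an element step by step: this is the glue code.
        cust = ET.SubElement(root, "customer", id=str(cust_id))
        ET.SubElement(cust, "name").text = name
        ET.SubElement(cust, "city").text = city
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(customers_to_xml(make_demo_db()))
```

Even with one table and three columns, the query and the XML construction live in two different APIs joined by a loop. An XQuery implementation with relational access would aim to express the same result as a single declarative query; the exact syntax depends on the product.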

Q. So XML and XQuery don’t make relational databases and SQL redundant?

A. Relational databases are certainly here to stay. It’s an established technology, and things that are so useful rarely die. But all relational databases are becoming compound databases, because if you are doing Web services, Web sites or Web publishing, XML is the format in which your data moves around.

Q. Still, are relational vendors feeling at all threatened by XML and XQuery?

A. Oracle, Microsoft and IBM have all invested heavily in XQuery. IBM is seeing this as a major competitive thing. You also see it in application server environments like BEA. There are 40-odd implementations of XQuery now, and I’d say a good dozen or so are serious at this point. I think you can’t go to a relational database conference without hearing about XQuery. If you look at all of the things you want to do to establish mindshare, XQuery can check off all the boxes.

Q. Is XQuery harder to learn than other XML languages, such as XSLT or XPath? Is XQuery better?

A. If you know XPath or SQL, that is helpful. I teach tutorials where, in the course of one day, people can get pretty dangerous with XQuery.
