If the Large Hadron Collider at CERN is to yield miraculous discoveries in particle physics, it may also require a small miracle in grid computing.
Undaunted by a lack of suitable tools from commercial vendors, engineers at the famed Geneva laboratory are hard at work building a giant grid to store and process the vast amounts of data the collider is expected to produce when it begins operations in mid-2007. They announced last week that the computing network now encompasses more than 100 sites in 31 countries, making it what they believe is the world’s largest international scientific grid.
Inside the collider, proton beams travelling in opposite directions will be accelerated to near the speed of light and steered into each other using powerful magnets.
Scientists hope to analyze data from the collisions to uncover new elementary particles, solve riddles such as why elementary particles have mass, and get closer to understanding how the universe works.
The proton collisions will produce an estimated 15 petabytes of data each year, or roughly 15 million gigabytes. The role of the grid is to link together a vast network of computing and storage systems and provide the scientists with access to the data and processing power when they need it.
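For a sense of scale, the figure of 15 million gigabytes can be converted with a few lines of Python. This is a minimal sketch, assuming decimal (SI) units and a data volume spread evenly over a 365-day year; the names below are illustrative and do not come from CERN's software.

```python
# Decimal (SI) storage units, in bytes. Assumption: the article's figures are decimal.
GB = 10**9
TB = 10**12
PB = 10**15

# The article's annual estimate: 15 million gigabytes.
annual_bytes = 15_000_000 * GB


def to_unit(n_bytes: int, unit: int) -> float:
    """Express a byte count in the given unit."""
    return n_bytes / unit


# Assumption for illustration only: data arrives evenly across a 365-day year.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
avg_rate_mb_per_s = annual_bytes / SECONDS_PER_YEAR / 10**6

if __name__ == "__main__":
    print(f"{to_unit(annual_bytes, TB):,.0f} TB per year")
    print(f"{to_unit(annual_bytes, PB):,.0f} PB per year")
    print(f"about {avg_rate_mb_per_s:,.0f} MB per second on average")
```

In these units, 15 million gigabytes is 15,000 terabytes, or 15 petabytes, which works out to a sustained average of several hundred megabytes per second.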
The grid sites involved are mostly universities and research labs as far afield as Japan and Canada, as well as two Hewlett-Packard Co. data centers. The sites are contributing computational power from more than 10,000 processors in total, and hundreds of millions of gigabytes in tape and disc storage.
For all the talk about grids from big IT vendors, virtually no suitable commercial tools were available to build the grid’s infrastructure, according to project leader Les Robertson. Much of the data will be stored in Oracle Corp. databases, and a few of the sites use commercial storage systems, but the hardest part — building the middleware to operate the grid — was left largely to Robertson and his peers.
“It’s surprised me a bit that there haven’t been more commercial tools available to us. What we’re building is not very specialized; we’re just creating a virtual clustered system, but on a large scale and with a very large amount of data,” he said.
Instead, CERN based its grid on the Globus Toolkit from the Globus Alliance, adding scheduling software from the University of Wisconsin’s Condor project and tools developed in Italy under the European Union’s DataGrid project. “This stuff comes from a lot of different places, it’s very much a component-based approach,” Robertson said.
The middleware serves two main functions. One is to make the grid look as much as possible to users and applications as a single large computer. “The other side is how you operate it — there’s a lot of work being done on monitoring software that lets you see what’s actually happening, how things are behaving. This is very early days for the operations side of it,” Robertson said.
One reason there are few commercial tools that are useful to CERN may be that what the big vendors are peddling as grid computing is not really grid computing at all — at least, not the way CERN defines it.
“For us, grid computing is a way of interconnecting computing capacity across multiple sites that lets everyone get access to the capacity when they need it, and where you really do send your work to where the data is and where you can move the data from one site to another,” Robertson said.
The commercial offerings have so far been geared more toward building grids within enterprises, he said, for which clusters may be a more accurate term.
“On the commercial side there’s more interest in Web services, and even there we thought there’d be some basic things coming out that we could use, but so far there are not.”
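Robertson's definition, sending work to where the data is and moving data between sites when that is not possible, can be illustrated with a toy placement routine. This is a hedged sketch only: the `Site`, `Job` and `place` names are hypothetical, the site names and numbers are invented, and none of it reflects how the Globus, Condor or DataGrid software actually works.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Site:
    """A hypothetical grid site with spare processors and locally held datasets."""
    name: str
    free_cpus: int
    datasets: Set[str] = field(default_factory=set)


@dataclass
class Job:
    """A hypothetical analysis job that reads one dataset."""
    name: str
    dataset: str
    cpus: int


def place(job: Job, sites: List[Site]) -> Tuple[Optional[Site], bool]:
    """Pick a site for the job and report whether the data had to be moved.

    First choice: a site that already holds the dataset and has enough free
    processors (send the work to the data). Fallback: any site with enough
    free processors, modelling a copy of the dataset to it (move the data).
    Returns (None, False) when no site has capacity.
    """
    with_capacity = [s for s in sites if s.free_cpus >= job.cpus]
    holders = [s for s in with_capacity if job.dataset in s.datasets]

    if holders:
        site, moved = max(holders, key=lambda s: s.free_cpus), False
    elif with_capacity:
        site, moved = max(with_capacity, key=lambda s: s.free_cpus), True
        site.datasets.add(job.dataset)  # model replicating the dataset
    else:
        return None, False

    site.free_cpus -= job.cpus
    return site, moved


if __name__ == "__main__":
    # Invented example sites and datasets, for illustration only.
    sites = [
        Site("geneva", free_cpus=4, datasets={"run-001"}),
        Site("tokyo", free_cpus=64),
        Site("vancouver", free_cpus=16, datasets={"run-002"}),
    ]
    for job in [Job("a", "run-001", 2), Job("b", "run-001", 8)]:
        site, moved = place(job, sites)
        print(job.name, site.name if site else None, "data moved" if moved else "ran in place")
```

The sketch prefers a site that already holds the data and falls back to copying the data only when no such site has capacity. A real grid scheduler would also weigh transfer cost, site policies and queue lengths.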
Mike Bernhardt, chief executive at consulting company Grid Strategies Inc., was not surprised. As in the past, CERN’s grid project is pushing the boundaries of computing, while many commercial products are just now being developed, with primarily enterprise computing in mind.
“To speak in very general terms, there’s too much of what CERN doesn’t need, and not enough of what they do need for their specific purposes. I would suggest that parts of what CERN develops now will find its way into those commercial applications in the not too distant future,” he wrote in an email response to questions.
Software is only one part of CERN’s challenge. It must also come up with an operational framework which ensures that data and computing resources are available when needed, but which also leaves the various institutions free to run their other projects and applications.
“It’s technological, but it’s also sociological,” Robertson said. “The grid has got to be democratic, you can’t have someone in the middle like a dictator.”
CERN used mainframes to crunch data from its previous particle accelerator, then switched to small clusters of RISC machines. Today, PCs running Linux provide the most economical system, but employing thousands of computers at hundreds of sites also adds complexity.
The grid is already being used to test a few LHC applications, but currently it has only around 5 percent of the processing capacity it will eventually need. CERN expects to reach that goal by adding new sites, expanding resources at existing sites and taking advantage of continued improvements in processor speed and disk storage capacity.
Meanwhile, work on the collider continues. Two weeks ago, the first of the giant “dipole” magnets that will steer the proton beams around the collider was lowered into position. Each magnet is 15 meters long and weighs 35 tons, and there will be more than 1,200 of them, as well as many smaller magnets.
While much work on the grid remains, Robertson, a 30-year veteran at the labs, seemed quietly confident that it will be ready on time. “We’ve a long history of having to deal with big scientific developments, with big increases in computing,” Robertson said. “With the last accelerator we didn’t have enough computing power. We were using mainframes and we had to jump in very quickly with RISC clusters before most people had thought about doing that.
“Sometimes you have to take a bit of a risk, but you do it in the most simple way that you can.”
CERN is the world’s biggest particle physics center, devoted to helping scientists figure out what comprises matter and what holds it together. The acronym originally stood for Conseil Europ